High Performance Computing: Submitting SLURM Job Scripts

SLURM Job Flags

Job flags are used with the sbatch command. The syntax for a SLURM directive in a job script is "#SBATCH <flag>". Some of the flags can also be used with the srun and salloc commands, as well as with the fisbatch wrapper script for interactive jobs, as shown in the sketch below.
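
For instance, an interactive session can be requested on the command line using the same flag syntax. The following is a minimal sketch with placeholder values, assuming salloc is available to users on the cluster:

salloc --partition=general-compute --nodes=1 --ntasks-per-node=4 --time=01:00:00
# once the allocation is granted, commands can be launched on the allocated node with srun
srun hostname
# leave the interactive shell to release the allocation
exit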

 

Resource | Flag Syntax | Description | Notes
partition | --partition=general-compute | A partition is a queue for jobs. | default is compute
time | --time=01:00:00 | Time limit for the job. | 1 hour in this example; default is 72 hours
nodes | --nodes=2 | Number of compute nodes for the job. | default is 1
cpus/cores | --ntasks-per-node=8 | Corresponds to the number of cores on the compute node. | default is 1
node type | --constraint=IB or --constraint=IB&CPU-E564 | Node type feature; IB requests nodes with InfiniBand. | default is no node type specified
resource feature | --gres=gpu:2 | Request use of GPUs on compute nodes (see the GPU sketch below the table). | default is no feature specified
memory | --mem=24000 | Memory limit per compute node for the job. Do not use with the --mem-per-cpu flag. | memory in MB; default limit is 3000 MB per core
memory | --mem-per-cpu=4000 | Per-core memory limit. Do not use with the --mem flag. | memory in MB; default limit is 3000 MB per core
account | --account=user-account | Users may belong to groups or accounts. | default is the user's primary group
job name | --job-name="hello_test" | Name of the job. | default is the JobID
output file | --output=test.out | Name of the file for stdout. | default is the JobID
email address | --mail-user=username@rowan.edu | User's email address. | required
email notification | --mail-type=ALL or --mail-type=END | When email is sent to the user. | omit for no email
access | --exclusive | Exclusive access to compute nodes. | default is sharing nodes
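
To illustrate how the GPU and memory flags fit together, here is a hedged sketch of the directive block for a hypothetical GPU job. The partition, GPU count, memory value, and program name are placeholders and should be adjusted to match the cluster's actual GPU nodes:

#!/bin/sh
#SBATCH --partition=general-compute   # placeholder; choose a partition that contains GPU nodes
#SBATCH --time=02:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --gres=gpu:2                  # request 2 GPUs on the node
#SBATCH --mem=24000                   # 24000 MB per node; do not combine with --mem-per-cpu
#SBATCH --job-name="gpu_test"
#SBATCH --output=gpu_test.out
#SBATCH --mail-user=username@rowan.edu
#SBATCH --mail-type=END

./my_gpu_program                      # placeholder executable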

Sample Hello World SLURM Script

#!/bin/sh
##SBATCH --partition=compute
#SBATCH --time=00:15:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
# Memory per node specification is in MB. It is optional.
# The default limit is 3GB per core.
##SBATCH --mem=24000
#SBATCH --job-name="hello_test"
#SBATCH --output=test-srun.out
#SBATCH --mail-user=username@rowan.edu
#SBATCH --mail-type=ALL
# Specifies that the job will be requeued after a node failure.
# The default is that the job will not be requeued.
##SBATCH --requeue
  
echo "SLURM_JOBID="$SLURM_JOBID
echo "SLURM_JOB_NODELIST"=$SLURM_JOB_NODELIST
echo "SLURM_NNODES"=$SLURM_NNODES
echo "SLURMTMPDIR="$SLURMTMPDIR cd $SLURM_SUBMIT_DIR
echo "working directory = "$SLURM_SUBMIT_DIR
module load intel-mpi/4.1.0
module list
ulimit -s unlimited
echo "Launch helloworld with sun"
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
srun ./helloworld
echo "All Done!"

 

More sample SLURM scripts can be found on rucc at /cm/shared/examples/workload/slurm/jobscripts.
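
To browse those examples from a login session and copy one to modify, something along these lines should work (the individual script names are not listed here):

ls /cm/shared/examples/workload/slurm/jobscripts
cp /cm/shared/examples/workload/slurm/jobscripts/<script-name> ~/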