Create a Slurm sbatch script to run the CONUS 12-km model:
cd /shared/conus_12km/
cat > slurm-wrf-conus12km.sh << EOF
#!/bin/bash
#SBATCH --job-name=WRF
#SBATCH --output=conus-%j.out
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --exclusive
export I_MPI_OFI_LIBRARY_INTERNAL=0
spack load intel-oneapi-mpi
spack load wrf
module load libfabric-aws
wrf_exe=\$(spack location -i wrf)/run/wrf.exe
set -x
ulimit -s unlimited
ulimit -a
export OMP_NUM_THREADS=6
export FI_PROVIDER=efa
export I_MPI_FABRICS=ofi
export I_MPI_OFI_PROVIDER=efa
export I_MPI_PIN_DOMAIN=omp
export KMP_AFFINITY=compact
export I_MPI_DEBUG=4
time mpiexec.hydra -np \$SLURM_NTASKS --ppn \$SLURM_NTASKS_PER_NODE \$wrf_exe
EOF
The job script above sets environment variables so that Intel MPI ranks and OpenMP threads are pinned to the correct cores and EFA is enabled. The table below describes each setting:
Environment Variable | Description |
---|---|
OMP_NUM_THREADS=6 | Number of OpenMP threads per MPI rank. Each node runs 16 MPI ranks with 6 OpenMP threads each, using all 16 x 6 = 96 cores on the node (192 cores across the 2 nodes). |
FI_PROVIDER=efa | Enable EFA. This tells libfabric to use the EFA fabric. |
I_MPI_FABRICS=ofi | This tells Intel MPI to use libfabric. |
I_MPI_OFI_LIBRARY_INTERNAL=0 | This tells Intel MPI to use the version of libfabric installed on the OS and not the one bundled with Intel MPI. |
I_MPI_OFI_PROVIDER=efa | This tells Intel MPI to use the libfabric efa provider. |
I_MPI_PIN_DOMAIN=omp | The domain size is equal to OMP_NUM_THREADS, which ensures that each MPI rank is associated with its own non-overlapping domain. |
KMP_AFFINITY=compact | Specifying compact assigns OpenMP thread N+1 to a free thread context as close as possible to the thread context where OpenMP thread N was placed. |
I_MPI_DEBUG=4 | Prints debugging output, including process pinning information. |
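As a quick sanity check on the layout, the arithmetic behind the rank and thread counts can be sketched in a few lines of shell. The variable names here are illustrative only and are not read by Slurm or Intel MPI; the values are copied by hand from the #SBATCH directives and OMP_NUM_THREADS in the job script.

```shell
# Values copied from the job script (assumed, not queried from Slurm).
nodes=2             # --nodes
ranks_per_node=16   # --ntasks-per-node
omp_threads=6       # OMP_NUM_THREADS

# Each MPI rank owns omp_threads cores, so a node needs ranks * threads cores.
cores_per_node=$(( ranks_per_node * omp_threads ))
total_ranks=$(( nodes * ranks_per_node ))
total_cores=$(( nodes * cores_per_node ))

echo "cores per node:  ${cores_per_node}"
echo "total MPI ranks: ${total_ranks}"
echo "total cores:     ${total_cores}"
```

If you change the number of ranks per node or threads per rank, keeping their product equal to the number of physical cores on the node avoids oversubscribing or leaving cores idle.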
Submit the job:
sbatch slurm-wrf-conus12km.sh
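On success, sbatch prints a line of the form "Submitted batch job <id>". Because the script sets --output=conus-%j.out, the job ID also determines the name of the output file. The following is a minimal sketch of extracting that ID; parse_job_id is a hypothetical helper written for this example and the job ID 42 is made up.

```shell
# Extract the numeric job ID from sbatch's "Submitted batch job <id>" message.
# $1: the line printed by sbatch
parse_job_id() {
    echo "$1" | awk '/^Submitted batch job/ { print $4 }'
}

# Example with a made-up job ID. On the cluster you would pass
# "$(sbatch slurm-wrf-conus12km.sh)" instead of the literal string.
job_id=$(parse_job_id "Submitted batch job 42")

# %j in the --output pattern expands to the job ID.
echo "output file: conus-${job_id}.out"
```

Alternatively, sbatch accepts a --parsable option that prints the job ID without the surrounding text.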
Monitor the job’s status with squeue:
squeue
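Rather than re-running squeue by hand, you could poll it until the job leaves the queue. The function below is a sketch under that assumption: wait_for_job is a hypothetical helper, not part of Slurm, and the 30-second default interval is an arbitrary choice.

```shell
# Block until the given job no longer appears in squeue output.
# $1: job ID, $2: polling interval in seconds (default 30)
wait_for_job() {
    local job_id="$1" interval="${2:-30}"
    # -h suppresses the header line, so empty output means the job is gone;
    # errors for an unknown or purged job ID are discarded.
    while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
        sleep "$interval"
    done
    echo "job ${job_id} is no longer in the queue"
}
```

For example, `wait_for_job 42` would return once job 42 has finished or been cancelled. Leaving the queue does not by itself mean the run succeeded, so check the conus-<jobid>.out file afterwards.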
Using 192 cores, the job took 4 minutes 17 seconds to complete.