All,

I am trying to run OPM Flow simulations on multiple nodes. I have built OPM Flow from source on Oracle Linux 7 (binary compatible with RHEL) with:

. GCC-8.3.1
. openmpi-4.0.2 (built from source)
. boost-1.72.0 (built from source)
. cmake-3.16.4 (built from source)
. parmetis-4.0.3 (built from source)
. dune-2.6.0: dune-common, dune-geometry, dune-grid, dune-istl (built from source)
. Zoltan-3.83 (built from source)
. OPM Flow modules are built using the following commands:
    o cmake -DCMAKE_BUILD_TYPE=Release -DUSE_MPI=ON -DUSE_OPENMP=ON -DBLAS_LIBRARIES=/usr/lib64 -DCMAKE_INSTALL_PREFIX=/usr/local ..
    o sudo make

For the Norne data set, the input file (params) contains:

    ecl-deck-file-name=NORNE_ATW2013.DATA
    output-dir=out_parallel
    output-mode=none
    output-interval=1000000
    enable-opm-rst-file=false
    threads-per-process=1

The simulation is run on 4 nodes with 32 processors each, using the following command:

    mpirun --display-map -mca btl self \
        -x UCX_TLS=rc,self,sm \
        -x HCOLL_ENABLE_MCAST_ALL=0 -mca coll_hcoll_enable 0 \
        -x UCX_IB_TRAFFIC_CLASS=105 -x UCX_IB_GID_INDEX=3 \
        --cpu-set 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35 \
        -np 144 --hostfile /etc/opt/rdma/hostfile \
        /mnt/nfs-share/etc/opm-flow/opm-simulators/build/bin/flow \
        --parameter-file=/mnt/nfs-share/data/norne/params

The simulation gets stuck indefinitely at the domain decomposition step. Parallel runs on up to 3 nodes finish, but the run always hangs on 4 nodes.

To rule out the small cell count of the Norne model as the cause, I also created some customized simulation decks with about 11 million cells. With those decks the simulation hangs as soon as I scale from 1 node to 2 nodes.

Can someone please help me understand what might be causing this?

Thank you,
Yogi