Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Bruno Chareyre posted a new comment:

Hi, for clarity:
- did you compile OpenMPI with Intel or GCC?
- do you then use that with Singularity, or with Yade compiled on the cluster?

Bruno

--
You received this question notification because your team yade-users is
an answer contact for Yade.

___
Mailing list: https://launchpad.net/~yade-users
Post to     : yade-users@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yade-users
More help   : https://help.launchpad.net/ListHelp
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Luis Barbosa posted a new comment:

Hi all, just to inform that we solved this. The solution was to install
OpenMPI 4.1.0 in my home directory on the cluster. Here are the steps I
performed:

Download https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-4.1.0.tar.gz

Open a console on the cluster in your home folder:
  cd ~
  mkdir openMPI
  cd openMPI

Copy or download the openmpi-4.1.0.tar.gz tarball into the openMPI
folder, and unpack it:
  gtar zxf openmpi-4.1.0.tar.gz
  cd openmpi-4.1.0

Create an output folder:
  mkdir /home/pires/bin/openmpi

Create a build folder, and compile OpenMPI:
  mkdir build
  cd build
  ../configure --prefix=/home/pires/bin/openmpi --with-slurm --with-pmi
  make all
  make install

At this point you have OpenMPI 4.1.0 with Slurm support installed into
your home. The next step is to make your own module:
  mkdir /home/../privatemodules
  cd /home/.../privatemodules

A module file is just a bunch of Linux environment variables.

Hope it can help someone in this situation.

Luis
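The content of the module file is not shown above. As a sketch, the
environment variables such a module would typically set can be written
in plain shell; the prefix comes from the configure step above, while
the MANPATH line is an assumption:

```shell
# Hypothetical equivalent of the private module file, as shell exports.
# PREFIX matches the --prefix used in the configure step of this thread.
PREFIX=/home/pires/bin/openmpi

# Put the freshly built mpiexec/mpicc first on the PATH.
export PATH="$PREFIX/bin:$PATH"

# Let the runtime linker find the OpenMPI shared libraries.
export LD_LIBRARY_PATH="$PREFIX/lib:${LD_LIBRARY_PATH:-}"

# Man pages (assumption: standard OpenMPI install layout).
export MANPATH="$PREFIX/share/man:${MANPATH:-}"

echo "module environment set"
```

An actual module file would express the same variables in the syntax of
the cluster's module system (Tcl or Lua), using `prepend-path` instead
of `export`.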
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Bruno Chareyre posted a new comment:

A walkthrough of Intel compilation would be valuable, especially if it's
less trivial than just changing the compiler (it needs linkage to more
dynamic libraries, iirc). But in fact, installing GCC on the cluster
would make a lot of sense too. I doubt that a cluster can have only one
compiler (not the case in Grenoble at least).

Bruno
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Status: Answered => Solved

Luis Barbosa confirmed that the question is solved:

Hi Bruno,

We will try to compile Yade with Intel on the cluster directly. Since I
need IT support for that, it will take some time. I will close the issue
for now, but when I finish I will post the solution here to keep it
documented.

cheers,
Luis
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Bruno Chareyre requested more information:

Is it a cluster you're admin of?
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Status: Open => Answered

Chareyre proposed the following answer:

Since different versions of OpenMPI are already a problem, there's no
way it will work with Intel+OpenMPI. Indeed, generating a new Docker
image is an option, provided that you manage to compile Yade with the
Intel compiler therein.

I see no escape, and it's not really a problem with Singularity: either
you get OpenMPI installed on the cluster, or you compile Yade with Intel
on the cluster directly (or you compile Yade with Intel and then make a
Docker->Singularity image of this, but that only adds another layer of
possible trouble).
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Luis Barbosa gave more information on the question:

Hello all,

Since the OpenMPI and Intel MPI on my cluster may not be compatible, and
I don't think it is possible to get these two different MPI systems
running, would you suggest some workaround?

I thought of creating a new Docker image replacing OpenMPI with Intel
MPI. However, I don't think this is supported by Yade, or it would run
into dependency problems. Maybe I could run multiple scripts on
different nodes? Not sure how that would actually work...

Any insight would be welcome.

Luis
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Status: Answered => Open

Luis Barbosa is still having a problem:

> "- also mind that the OpenMPI version on the host must be the same (or
> sufficiently close to) the one in the singularity image, else it won't
> communicate properly."

I ran yade --test to check MPI:

  | mpi    | 3.1   | ompi:4.1.0 |
  +--------+-------+------------+
  | mpi4py | 3.0.3 |

My cluster has Intel(R) MPI Library, Version 2019 Update 8 installed. Do
you know if this is compatible?
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Chareyre proposed the following answer:

Please check my command again. It's "mpiexec singularity", not
"singularity mpiexec".

That kind of message, if it persists, may mean a local problem in your
installation (independent of Yade):

> "Actually The SLURM process starter for OpenMPI was unable to locate a
> usable "srun" command in its path."

Hence why I suggested trying with a really dummy program (you may find
some in the mpi4py test suite), independently of Yade. Or even more
basic (but it will not test mpi4py):

  echo "print('hello')" > test.py
  mpiexec -np 10 singularity *.sif python test.py

Does it print "hello" 10 times?

Bruno
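Bruno's sanity check can be put into a small script. The image path
below is the one used elsewhere in this thread, and the fallback to a
plain python3 run (for machines without mpiexec or singularity) is an
addition here, not part of the original suggestion:

```shell
# Write a trivial script that every MPI rank will execute.
echo "print('hello')" > test.py

if command -v mpiexec >/dev/null 2>&1 && command -v singularity >/dev/null 2>&1; then
    # Note the order: mpiexec launches the container, not the other way
    # around. Adjust the .sif path to your own cluster.
    mpiexec -np 10 singularity run \
        /beegfs/common/singularity/yade/yade_debian_bookwarm_1.0.sif \
        python test.py
else
    # Fallback for trying this outside the cluster: a single plain run.
    python3 test.py
fi
```

If this prints "hello" 10 times, MPI process launching works on the
cluster independently of Yade.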
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Luis Barbosa posted a new comment:

Hi Bruno, thanks. I just ran a simple script [1] to try non-interactive
execution (mpiexec -np NUMSUBD+1 yade script.py).

Got this message:

  + singularity run /beegfs/common/singularity/yade/yade_debian_bookwarm_1.0.sif mpiexec -n 4 yade Parallel.py
  --------------------------------------------------------------------------
  The SLURM process starter for OpenMPI was unable to locate a usable
  "srun" command in its path. Please check your path and try again.
  --------------------------------------------------------------------------
  --------------------------------------------------------------------------
  An internal error has occurred in ORTE:
  [[58454,0],0] FORCE-TERMINATE AT (null):1 - error
  ../../../../../../orte/mca/plm/slurm/plm_slurm_module.c(475)
  This is something that should be reported to the developers.
  --------------------------------------------------------------------------

I will try this mpiexec -np N singularity run.

Kind regards and thanks for all insights.

[1] https://gitlab.com/yade-dev/trunk/blob/master/examples/mpi/vtkRecorderExample.py
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Status: Open => Answered

Chareyre proposed the following answer:

Hi,

This: "mpi4py.MPI.Exception: MPI_ERR_SPAWN: could not spawn processes"
is a message I've seen so many times. It isn't related to Yade in most
cases. I would first try to run some dummy mpi4py scripts to see what
happens. There can be:
- some version conflicts between OpenMPI and mpi4py, or some bug in one
  of them
- hardware-specific or infrastructure-specific problems (not all HPC
  systems may enable spawning processes - see [1])
- also mind that the OpenMPI version on the host must be the same as (or
  sufficiently close to) the one in the Singularity image, else it won't
  communicate properly.
- on a cluster in Grenoble I found that the MPI/Singularity combination
  would run properly with an Ubuntu 16.04 image but not with a 20.04
  image (no clue why at this point).

[1] If spawning is disabled on purpose on the cluster, it might help to
run MPI explicitly, as in:

  mpiexec -np N singularity run /beegfs/common/singularity/yade/yade_debian_bookwarm_1.0.sif yade Case2_rotating_drum_mpi.py

Last note: if you run "yade -j5" on 10 MPI processes, it actually means
10x5 = 50 cores (MPI x OpenMP).

Bruno
Re: [Yade-users] [Question #700315]: Open MPI - spawn processes
Question #700315 on Yade changed: https://answers.launchpad.net/yade/+question/700315

Luis Barbosa posted a new comment:

Please consider Parallel.py = Case2_rotating_drum_mpi.py
[Yade-users] [Question #700315]: Open MPI - spawn processes
New question #700315 on Yade: https://answers.launchpad.net/yade/+question/700315

Hi guys,

I am new to OpenMPI, so I will try to be as clear as possible here.

I installed Yade 2021.01a on my cluster
(singularity/yade/yade_debian_bookwarm_1.0.sif). I can run simulations
using all the cores I want. My cluster has 160 nodes, each node with 80
CPUs. So far so good.

I am now trying to run on multiple nodes. For this, I am checking out
this example [1]. When I run it I get the following message:

  + singularity run /beegfs/common/singularity/yade/yade_debian_bookwarm_1.0.sif yade -j5 Parallel.py
  /usr/lib/x86_64-linux-gnu/yade/py/yade/__init__.py:76: RuntimeWarning: to-Python converter for boost::shared_ptr already registered; second conversion method ignored.
    boot.initialize(plugins,config.confDir)
  TCP python prompt on localhost:9000, auth cookie `ksaeuc'
  Welcome to Yade 2021.01a
  Using python version: 3.9.7 (default, Sep 24 2021, 09:43:00) [GCC 10.3.0]
  Warning: no X rendering available (see https://bbs.archlinux.org/viewtopic.php?id=13189)
  XMLRPC info provider on http://localhost:21000
  Running script Parallel.py
  Traceback (most recent call last):
    File "/usr/bin/yade", line 343, in runScript
      execfile(script,globals())
    File "/usr/lib/python3/dist-packages/past/builtins/misc.py", line 87, in execfile
      exec_(code, myglobals, mylocals)
    File "Parallel.py", line 28, in <module>
      mp.initialize(numMPIThreads)
    File "/usr/lib/x86_64-linux-gnu/yade/py/yade/mpy.py", line 288, in initialize
      comm_slave = MPI.COMM_WORLD.Spawn(yadeArgv[0], args=yadeArgv[1:], maxprocs=numThreads-process_count)
    File "mpi4py/MPI/Comm.pyx", line 1534, in mpi4py.MPI.Intracomm.Spawn
  mpi4py.MPI.Exception: MPI_ERR_SPAWN: could not spawn processes
  Master: will spawn 9 workers running: /usr/bin/yade ['-j5', 'Parallel.py']
  [[ ^L clears screen, ^U kills line. F8 plot. ]]
  In [1]: Do you really want to exit ([y]/n)?

I am not sure where it is coming from. Any idea?
This is how I am running it in my batch script:

  #!/bin/bash -x
  #SBATCH --nodes=2
  #SBATCH --ntasks=2
  #SBATCH --cpus-per-task=80
  #SBATCH --partition=compute
  #SBATCH --job-name=DEM_PFV_Parallel
  #SBATCH --time=10:00:00

  singularity run /beegfs/common/singularity/yade/yade_debian_bookwarm_1.0.sif yade -j5 Case2_rotating_drum_mpi.py

PS. I am supposing that numMPIThreads = 10 in the Python script is equal
to nodes * -j (2*5 in this case).

[1] https://gitlab.com/yade-dev/trunk/-/blob/master/examples/DEM2020Benchmark/Case2_rotating_drum_mpi.py
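Based on Bruno's replies earlier in this thread (mpiexec must launch the
container, "mpiexec singularity", not the reverse), a corrected batch
script might look like the sketch below. The resource numbers are
assumptions for 10 MPI ranks x 5 OpenMP threads each; adjust them to
your cluster:

```shell
#!/bin/bash -x
#SBATCH --nodes=2                 # assumption: spread 10 ranks over 2 nodes
#SBATCH --ntasks=10               # one SLURM task per MPI rank
#SBATCH --cpus-per-task=5         # OpenMP threads per rank (matches yade -j5)
#SBATCH --partition=compute
#SBATCH --job-name=DEM_PFV_Parallel
#SBATCH --time=10:00:00

# mpiexec starts one container instance per rank, so Yade does not need
# to spawn MPI processes from inside the container.
mpiexec -np 10 singularity run \
    /beegfs/common/singularity/yade/yade_debian_bookwarm_1.0.sif \
    yade -j5 Case2_rotating_drum_mpi.py
```

Note the core accounting from Bruno's last remark: 10 MPI processes with
-j5 each means 10x5 = 50 cores in total.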