Re: [OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?
If I'm not mistaken, hcoll is playing with opal_progress in a way that conflicts with the blessed usage of progress in OMPI, and it prevents other components from advancing and completing requests in a timely manner. The impact is minimal for sequential applications using only blocking calls, but it jeopardizes performance when multiple types of communication are executing simultaneously or when multiple threads are active.

The solution might be very simple: hcoll is a module providing support for collective communication, so as long as you don't use collectives, or the tuned module provides collective performance similar to hcoll's on your cluster, just go ahead and disable hcoll. You can also reach out to the Mellanox folks and ask them to fix hcoll's usage of opal_progress.

  George.

On Mon, Feb 3, 2020 at 11:09 AM Angel de Vicente via users <users@lists.open-mpi.org> wrote:
> Hi,
>
> in one of our codes we want to create a log of events that happen in the
> MPI processes, where the number of these events and their timing is
> unpredictable.
>
> So I implemented a simple test code in which process 0 creates a thread
> that is just busy-waiting for messages from any process and writes them
> to stdout/stderr/a log file as they arrive. The test code is at
> https://github.com/angel-devicente/thread_io and the same idea went into
> our "real" code.
>
> As far as I could see, this behaves very nicely: there are no deadlocks,
> no lost messages, and the performance penalty is minimal for the real
> application this is intended for.
>
> But then I found that on a local cluster the performance was very bad
> with the locally installed OpenMPI compared to my own OpenMPI
> installation (same gcc and OpenMPI versions): ~5min 50s versus ~5s for
> one test. Checking the OpenMPI configuration details, I found that the
> locally installed OpenMPI was configured to use the Mellanox IB driver,
> and in particular that the hcoll component was somehow killing
> performance:
>
> running with
>
>    mpirun --mca coll_hcoll_enable 0 -np 51 ./test_t
>
> was taking ~5s, while enabling hcoll was killing performance as stated
> above (when run on a single node the performance also drops, but only by
> about a factor of 2).
>
> Has anyone seen anything like this? Perhaps a newer Mellanox driver
> would solve the problem?
>
> We were planning to make our code public, but before we do so I want to
> understand under which conditions we could hit this problem with the
> "Threaded I/O" approach and, if possible, how to get rid of it
> completely.
>
> Any help/pointers appreciated.
> --
> Ángel de Vicente
>
> Tel.: +34 922 605 747
> Web.: http://research.iac.es/proyecto/polmag/
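A note on disabling hcoll persistently: besides passing the MCA parameter on the mpirun command line (as in the quoted message), Open MPI also reads MCA parameters from the environment and from a parameters file. A minimal sketch, assuming a default installation layout:

   # per-user or system-wide: add this line to $HOME/.openmpi/mca-params.conf
   # or to <prefix>/etc/openmpi-mca-params.conf
   coll_hcoll_enable = 0

   # or per job, via the environment
   export OMPI_MCA_coll_hcoll_enable=0
   mpirun -np 51 ./test_t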
Re: [OMPI users] OpenFabrics
> On Feb 3, 2020, at 12:35 PM, Bennet Fauber wrote:
>
> This is what CentOS installed.
>
> $ yum list installed hwloc\*
> Loaded plugins: langpacks
> Installed Packages
> hwloc.x86_64         1.11.8-4.el7    @os
> hwloc-devel.x86_64   1.11.8-4.el7    @os
> hwloc-libs.x86_64    1.11.8-4.el7    @os

I believe that those versions of hwloc are sufficient.

> I will ask my coworker to install a test version. What can I do by way of
> flags or environment variables to get the best output to report? I believe
> that `srun` is preferred as the process starter on Slurm clusters, but I
> think `mpirun`/`orterun` has better debugging capabilities?

It depends on what is wrong. ;-)

You mentioned that "something was awry" with the `--with-hwloc=external` installation...

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] OpenFabrics
This is what CentOS installed.

$ yum list installed hwloc\*
Loaded plugins: langpacks
Installed Packages
hwloc.x86_64         1.11.8-4.el7    @os
hwloc-devel.x86_64   1.11.8-4.el7    @os
hwloc-libs.x86_64    1.11.8-4.el7    @os

I will ask my coworker to install a test version. What can I do by way of flags or environment variables to get the best output to report? I believe that `srun` is preferred as the process starter on Slurm clusters, but I think `mpirun`/`orterun` has better debugging capabilities?

Thanks,
-- bennet

On Mon, Feb 3, 2020 at 12:02 PM Jeff Squyres (jsquyres) wrote:
>
> On Feb 3, 2020, at 10:03 AM, Bennet Fauber wrote:
> >
> > Ah, ha!
> >
> > Yes, that seems to be it. Thanks.
>
> Ok, good. I understand that UCX is the "preferred" mechanism for IB these
> days.
>
> > If I might ask, on a configure-related note: if we have these installed
> > with the CentOS 7.6 we are running
> >
> > $ yum list installed libevent\*
> > Loaded plugins: langpacks
> > Installed Packages
> > libevent.x86_64         2.0.21-4.el7    @anaconda
> > libevent-devel.x86_64   2.0.21-4.el7    @os
> >
> > should we be able to use this?
> >
> >    ./configure ... --with-libevent=external --with-hwloc=external
> >
> > My coworker reported that something was awry using that, and he's put
> > instead
> >
> >    ./configure ... --with-libevent=external --with-hwloc=/usr
> >
> > I believe the problem was that if we did not specify /usr, then srun and
> > mpirun were unable to find the interfaces. But I also recall from an
> > earlier thread that this is very much not recommended.
>
> I don't know offhand if the hwloc and libevent bundled in CentOS 7 are
> sufficient. They probably are, but I don't know that for a fact. I'd be
> curious to know what the problem was if --with-hwloc=external didn't work
> (assuming that the CentOS 7-bundled hwloc was the only one found in your
> PATH / LD_LIBRARY_PATH / compiler include+linker paths / etc.).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
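For what it is worth, a few commands that tend to produce useful output to report when an external-hwloc build misbehaves (a sketch, not an exhaustive list; <prefix> and <build-dir> stand for your Open MPI installation prefix and build directory):

   # which hwloc support the installed Open MPI reports
   ompi_info | grep -i hwloc

   # whether the runtime resolves a system libhwloc (external build) or
   # nothing hwloc-related (bundled copy compiled into libopen-pal)
   ldd <prefix>/lib/libopen-pal.so | grep -i hwloc

   # configure's own record of how the hwloc decision was made
   grep -i hwloc <build-dir>/config.log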
Re: [OMPI users] OpenFabrics
On Feb 3, 2020, at 10:03 AM, Bennet Fauber wrote:
>
> Ah, ha!
>
> Yes, that seems to be it. Thanks.

Ok, good. I understand that UCX is the "preferred" mechanism for IB these days.

> If I might ask, on a configure-related note: if we have these installed
> with the CentOS 7.6 we are running
>
> $ yum list installed libevent\*
> Loaded plugins: langpacks
> Installed Packages
> libevent.x86_64         2.0.21-4.el7    @anaconda
> libevent-devel.x86_64   2.0.21-4.el7    @os
>
> should we be able to use this?
>
>    ./configure ... --with-libevent=external --with-hwloc=external
>
> My coworker reported that something was awry using that, and he's put instead
>
>    ./configure ... --with-libevent=external --with-hwloc=/usr
>
> I believe the problem was that if we did not specify /usr, then srun and
> mpirun were unable to find the interfaces. But I also recall from an
> earlier thread that this is very much not recommended.

I don't know offhand if the hwloc and libevent bundled in CentOS 7 are sufficient. They probably are, but I don't know that for a fact. I'd be curious to know what the problem was if --with-hwloc=external didn't work (assuming that the CentOS 7-bundled hwloc was the only one found in your PATH / LD_LIBRARY_PATH / compiler include+linker paths / etc.).

--
Jeff Squyres
jsquy...@cisco.com
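As a point of reference, a minimal sketch of the "external" configure variant being discussed, assuming the CentOS 7 libevent-devel and hwloc-devel packages shown above are installed in the standard /usr tree (the prefix and the elided "..." options are placeholders, not a recommendation):

   ./configure --prefix=/opt/openmpi \
               --with-libevent=external \
               --with-hwloc=external \
               ...

With the -devel packages in /usr, configure should find the external headers and libraries on its default search paths; --with-hwloc=/usr is the "point at this installation tree" spelling of roughly the same request.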
[OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?
Hi,

in one of our codes we want to create a log of events that happen in the MPI processes, where the number of these events and their timing is unpredictable.

So I implemented a simple test code in which process 0 creates a thread that is just busy-waiting for messages from any process and writes them to stdout/stderr/a log file as they arrive. The test code is at https://github.com/angel-devicente/thread_io and the same idea went into our "real" code.

As far as I could see, this behaves very nicely: there are no deadlocks, no lost messages, and the performance penalty is minimal for the real application this is intended for.

But then I found that on a local cluster the performance was very bad with the locally installed OpenMPI compared to my own OpenMPI installation (same gcc and OpenMPI versions): ~5min 50s versus ~5s for one test. Checking the OpenMPI configuration details, I found that the locally installed OpenMPI was configured to use the Mellanox IB driver, and in particular that the hcoll component was somehow killing performance:

running with

   mpirun --mca coll_hcoll_enable 0 -np 51 ./test_t

was taking ~5s, while enabling hcoll was killing performance as stated above (when run on a single node the performance also drops, but only by about a factor of 2).

Has anyone seen anything like this? Perhaps a newer Mellanox driver would solve the problem?

We were planning to make our code public, but before we do so I want to understand under which conditions we could hit this problem with the "Threaded I/O" approach and, if possible, how to get rid of it completely.

Any help/pointers appreciated.
--
Ángel de Vicente

Tel.: +34 922 605 747
Web.: http://research.iac.es/proyecto/polmag/
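To make the pattern described above concrete, here is a minimal, self-contained sketch of the "threaded I/O" idea (an illustration only, not code from the thread_io repository: the tag, the message format, and the "DONE" termination protocol are invented for the example). Rank 0 spawns a logger thread that keeps receiving text messages from any rank on a dedicated tag and prints them; because that thread calls MPI concurrently with the main thread, MPI_THREAD_MULTIPLE is required.

/* Minimal sketch of the threaded-I/O pattern: rank 0 runs a logger thread
 * that receives text messages from any rank on a dedicated tag and prints
 * them.  LOG_TAG, MAX_MSG and the "DONE" termination protocol are invented
 * for this example. */
#include <mpi.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define LOG_TAG 999          /* tag reserved for log traffic (assumption) */
#define MAX_MSG 256

static int world_size;

static void *logger(void *arg)
{
    char buf[MAX_MSG];
    int done = 0;            /* ranks that have sent their DONE marker */
    (void)arg;
    while (done < world_size) {
        MPI_Status st;
        /* Blocking receive; with Open MPI's polling progress this spins. */
        MPI_Recv(buf, MAX_MSG, MPI_CHAR, MPI_ANY_SOURCE, LOG_TAG,
                 MPI_COMM_WORLD, &st);
        if (strcmp(buf, "DONE") == 0)
            done++;
        else
            fprintf(stderr, "[rank %d] %s\n", st.MPI_SOURCE, buf);
    }
    return NULL;
}

int main(int argc, char **argv)
{
    int provided, rank;
    char msg[MAX_MSG];
    pthread_t tid;

    /* The logger thread and the main thread call MPI concurrently, so we
     * need (and must check for) MPI_THREAD_MULTIPLE. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    if (rank == 0)
        pthread_create(&tid, NULL, logger, NULL);

    /* Every rank (rank 0 included) logs one event, then signals completion. */
    snprintf(msg, sizeof(msg), "hello from rank %d", rank);
    MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 0, LOG_TAG, MPI_COMM_WORLD);
    snprintf(msg, sizeof(msg), "DONE");
    MPI_Send(msg, (int)strlen(msg) + 1, MPI_CHAR, 0, LOG_TAG, MPI_COMM_WORLD);

    if (rank == 0)
        pthread_join(tid, NULL);

    MPI_Finalize();
    return 0;
}

Built with something like "mpicc -pthread thread_log.c -o test_t" (file name hypothetical) and run with mpirun, every rank sends one log line and then its DONE marker; the logger's MPI_Recv keeps the receiving thread occupied until all ranks have checked in.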
Re: [OMPI users] OpenFabrics
Ah, ha!

Yes, that seems to be it. Thanks.

If I might ask, on a configure-related note: if we have these installed with the CentOS 7.6 we are running

$ yum list installed libevent\*
Loaded plugins: langpacks
Installed Packages
libevent.x86_64         2.0.21-4.el7    @anaconda
libevent-devel.x86_64   2.0.21-4.el7    @os

should we be able to use this?

   ./configure ... --with-libevent=external --with-hwloc=external

My coworker reported that something was awry using that, and he's put instead

   ./configure ... --with-libevent=external --with-hwloc=/usr

I believe the problem was that if we did not specify /usr, then srun and mpirun were unable to find the interfaces. But I also recall from an earlier thread that this is very much not recommended.

We are still struggling with new IB hardware, a new scheduler (Slurm), PMIx, and OpenMPI, so I am still a bit muddled about how all the moving pieces fit together.

On Sun, Feb 2, 2020 at 4:16 PM Jeff Squyres (jsquyres) wrote:
>
> Bennet --
>
> Just curious: is there a reason you're not using UCX?
>
> > On Feb 2, 2020, at 4:06 PM, Bennet Fauber via users wrote:
> >
> > We get these warnings/errors from OpenMPI versions 3.1.4 and 4.0.2:
> >
> > --------------------------------------------------------------------------
> > WARNING: No preset parameters were found for the device that Open MPI
> > detected:
> >
> >   Local host:            gl3080
> >   Device name:           mlx5_0
> >   Device vendor ID:      0x02c9
> >   Device vendor part ID: 4123
> >
> > Default device parameters will be used, which may result in lower
> > performance. You can edit any of the files specified by the
> > btl_openib_device_param_files MCA parameter to set values for your
> > device.
> >
> > NOTE: You can turn off this warning by setting the MCA parameter
> > btl_openib_warn_no_device_params_found to 0.
> > --------------------------------------------------------------------------
> >
> > --------------------------------------------------------------------------
> > WARNING: There was an error initializing an OpenFabrics device.
> >
> >   Local host:   gl3080
> >   Local device: mlx5_0
> > --------------------------------------------------------------------------
> >
> > Does anyone know how I can find the parameters that should be set in
> > $PREFIX/etc/btl_openib_device_param.conf or other OpenMPI configuration
> > files so that those warnings do not occur?
> >
> > How might I find the cause of the initialization error?
> >
> > Sorry for the ignorance behind this question.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
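Since the warnings quoted above come from the openib BTL, here is a sketch of the two usual ways to make them go away on a Mellanox system, assuming Open MPI was built with UCX support ("./a.out" stands for your application; adjust process counts and options to taste):

   # prefer the UCX PML and keep the openib BTL out of the picture
   mpirun --mca pml ucx --mca btl ^openib -np 4 ./a.out

   # or only silence the "no preset parameters" warning, as the message itself suggests
   mpirun --mca btl_openib_warn_no_device_params_found 0 -np 4 ./a.out

The first form matches the "use UCX" suggestion in the quoted exchange; the second leaves the openib BTL in place and just suppresses the warning.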