Brice: is there any way to tell if these are coming from Slurm vs OMPI? Given this data, I’m suspicious that this might have something to do with Slurm and not us.
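One way to separate the two would be to run the same topology dump once under plain srun (no Open MPI involved, so only Slurm's cpuset/cgroup and the system hwloc are in play) and once under mpirun inside the same allocation, and compare which path prints the out-of-order warning. A minimal sketch, reusing the two-node/four-task layout and the /usr/bin paths from Pim's script below; whether the warning reproduces this way is an assumption:

    #!/bin/bash
    #SBATCH -N 2 -n 4
    # Path A: Slurm only. srun launches lstopo directly, so any warning here
    # comes from the system hwloc discovering the cpuset-restricted topology.
    srun /usr/bin/lstopo --of xml > /dev/null

    # Path B: Open MPI involved. mpirun does its own hwloc discovery and
    # binding before launching lstopo.
    /usr/bin/mpiexec /usr/bin/lstopo --of xml > /dev/null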
> On Dec 10, 2014, at 9:45 AM, Pim Schellart <p.schell...@gmail.com> wrote:
>
> Dear Gilles et al.,
>
> we tested with openmpi compiled from source (version 1.8.3) both with:
>
> ./configure --prefix=/usr/local/openmpi --disable-silent-rules --with-libltdl=external --with-devel-headers --with-slurm --enable-heterogeneous --disable-vt --sysconfdir=/etc/openmpi
>
> and
>
> ./configure --prefix=/usr/local/openmpi --with-hwloc=/usr --disable-silent-rules --with-libltdl=external --with-devel-headers --with-slurm --enable-heterogeneous --disable-vt --sysconfdir=/etc/openmpi
>
> (i.e. with embedded and external hwloc, respectively) and the issue remains the same. Meanwhile we have found another interesting detail. A job is started consisting of four tasks split over two nodes. If this is the only job running on those nodes, the out-of-order warnings do not appear. However, if multiple jobs are running, the warnings do appear, but only for the jobs that are started later. We suspect that this is because for the first started job the CPU cores assigned are 0 and 1, whereas they are different for the later started jobs. I attached the output (including lstopo --of xml output, called for each task) for both the working and the broken case again.
>
> Kind regards,
>
> Pim Schellart
>
> <broken.txt><working.txt>
>
>> On 09 Dec 2014, at 09:38, Gilles Gouaillardet <gilles.gouaillar...@iferc.org> wrote:
>>
>> Pim,
>>
>> if you configure OpenMPI with --with-hwloc=external (or something like --with-hwloc=/usr) it is very likely OpenMPI will use the same hwloc library (i.e. the "system" library) that is used by SLURM.
>>
>> /* i do not know how Ubuntu packages OpenMPI ... */
>>
>> The default (i.e. no --with-hwloc parameter on the configure command line) is to use the hwloc library that is embedded within OpenMPI.
>>
>> Gilles
>>
>> On 2014/12/09 17:34, Pim Schellart wrote:
>>> Ah, ok, so that was where the confusion came from; I did see hwloc in the SLURM sources but couldn't immediately figure out where exactly it was used. We will try compiling openmpi with the embedded hwloc. Any particular flags I should set?
>>>
>>>> On 09 Dec 2014, at 09:30, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>> There is no linkage between slurm and ompi when it comes to hwloc. If you directly launch your app using srun, then slurm will use its version of hwloc to do the binding. If you use mpirun to launch the app, then we'll use our internal version to do it.
>>>>
>>>> The two are completely isolated from each other.
>>>>
>>>>> On Dec 9, 2014, at 12:25 AM, Pim Schellart <p.schell...@gmail.com> wrote:
>>>>>
>>>>> The version that "lstopo --version" reports is the same (1.8) on all nodes, but we may indeed be hitting the second issue. We can try to compile a new version of openmpi, but how do we ensure that the external programs (e.g. SLURM) are using the same hwloc version as the one embedded in openmpi? Is it enough to just compile hwloc 1.9 separately as well and link against that? Also, if this is an issue, should we file a bug against hwloc or openmpi on Ubuntu for mismatching versions?
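One way to see which hwloc version SLURM and the rest of the system are actually picking up on every node is to query each node of an allocation. This is only a sketch: the slurmd path (/usr/sbin/slurmd) and the Debian/Ubuntu package query are assumptions and may need adjusting for the local setup.

    #!/bin/bash
    #SBATCH -N 2 --ntasks-per-node=1
    # One task per node: report the installed hwloc packages, the hwloc
    # version behind lstopo, and (if readable) the libhwloc that slurmd links.
    srun --label sh -c '
      hostname
      dpkg -l | grep -i hwloc                       # Debian/Ubuntu hwloc package versions (assumed packaging)
      lstopo --version
      ldd /usr/sbin/slurmd | grep -i hwloc || true  # assumed slurmd path; empty if slurmd does not link libhwloc
    '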
>>>>>> On 09 Dec 2014, at 00:50, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>
>>>>>> Hmmm... they probably linked that to the external, system hwloc version, so it sounds like one or more of your nodes has a different hwloc rpm on it.
>>>>>>
>>>>>> I couldn't leaf through your output well enough to see all the lstopo versions, but you might check to ensure they are the same.
>>>>>>
>>>>>> Looking at the code base, you may also hit a problem here. The OMPI 1.6 series was based on hwloc 1.3 - the output you sent indicated you have hwloc 1.8, which is quite a big change. The OMPI 1.8 series is based on hwloc 1.9, so at least that is closer (though probably still a mismatch).
>>>>>>
>>>>>> Frankly, I'd just download and install an OMPI tarball myself and avoid these headaches. This mismatch in required versions is why we embed hwloc: it is a critical library for OMPI, and we had to ensure that the version matched our internal requirements.
>>>>>>
>>>>>>> On Dec 8, 2014, at 8:50 AM, Pim Schellart <p.schell...@gmail.com> wrote:
>>>>>>>
>>>>>>> It is the default openmpi that comes with Ubuntu 14.04.
>>>>>>>
>>>>>>>> On 08 Dec 2014, at 17:17, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>
>>>>>>>> Pim: is this an OMPI you built, or one you were given somehow? If you built it, how did you configure it?
>>>>>>>>
>>>>>>>>> On Dec 8, 2014, at 8:12 AM, Brice Goglin <brice.gog...@inria.fr> wrote:
>>>>>>>>>
>>>>>>>>> It likely depends on how SLURM allocates the cpuset/cgroup inside the nodes. The XML warning is related to these restrictions inside the node. Anyway, my feeling is that there's an old OMPI or an old hwloc somewhere.
>>>>>>>>>
>>>>>>>>> How do we check after install whether OMPI uses the embedded or the system-wide hwloc?
>>>>>>>>>
>>>>>>>>> Brice
>>>>>>>>>
>>>>>>>>> On 08/12/2014 17:07, Pim Schellart wrote:
>>>>>>>>>> Dear Ralph,
>>>>>>>>>>
>>>>>>>>>> the nodes are called coma## and, as you can see in the logs, the nodes of the broken example are the same as the nodes of the working one, so that doesn't seem to be the cause. Unless (very likely) I'm missing something. Anything else I can check?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Pim
>>>>>>>>>>
>>>>>>>>>>> On 08 Dec 2014, at 17:03, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> As Brice said, OMPI has its own embedded version of hwloc that we use, so there is no Slurm interaction to be considered. The most likely cause is that one or more of your nodes is picking up a different version of OMPI. So things "work" if you happen to get nodes where all the versions match, and "fail" when you get a combination that includes a different version.
>>>>>>>>>>>
>>>>>>>>>>> Is there some way you can narrow down your search to find the node(s) that are picking up the different version?
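A rough way to address both points above (Brice's question of how to check, after installation, whether an Open MPI build uses its embedded hwloc or the system-wide one, and the suggestion to find nodes picking up a different version) is to inspect mpirun on every node of an allocation: a build configured with an external hwloc should show libhwloc among mpirun's shared-library dependencies, while an embedded-hwloc build should not. A sketch, assuming mpirun is on the PATH of the compute nodes:

    #!/bin/bash
    #SBATCH -N 2 --ntasks-per-node=1
    # On each node: which mpirun is picked up, which Open MPI version it is,
    # and whether it pulls in a system libhwloc (external) or none (embedded).
    srun --label sh -c '
      hostname
      which mpirun
      mpirun --version 2>&1 | head -1
      ldd "$(which mpirun)" | grep -i hwloc || echo "no external libhwloc linked (embedded hwloc, presumably)"
    '

On the 1.8 series, ompi_info may also list an hwloc MCA component; if it does, the component name should indicate whether the embedded copy or an external library is in use.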
>>>>>>>>>>>> On Dec 8, 2014, at 7:48 AM, Pim Schellart <p.schell...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Dear Brice,
>>>>>>>>>>>>
>>>>>>>>>>>> I am not sure why this is happening, since all code seems to be using the same hwloc library version (1.8), but it does :) An MPI program is started through SLURM on two nodes with four CPU cores total (divided over the nodes) using the following script:
>>>>>>>>>>>>
>>>>>>>>>>>> #!/bin/bash
>>>>>>>>>>>> #SBATCH -N 2 -n 4
>>>>>>>>>>>> /usr/bin/mpiexec /usr/bin/lstopo --version
>>>>>>>>>>>> /usr/bin/mpiexec /usr/bin/lstopo --of xml
>>>>>>>>>>>> /usr/bin/mpiexec /path/to/my_mpi_code
>>>>>>>>>>>>
>>>>>>>>>>>> When this is submitted multiple times it gives "out-of-order" warnings in about 9/10 cases but works without warnings in 1/10 cases. I attached the output (with xml) for both the working and `broken` case. Note that the xml is of course printed (differently) multiple times for each task/core. As always, any help would be appreciated.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Pim Schellart
>>>>>>>>>>>>
>>>>>>>>>>>> P.S. $ mpirun --version
>>>>>>>>>>>> mpirun (Open MPI) 1.6.5
>>>>>>>>>>>>
>>>>>>>>>>>> <broken.log><working.log>
>>>>>>>>>>>>
>>>>>>>>>>>>> On 07 Dec 2014, at 13:50, Brice Goglin <brice.gog...@inria.fr> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello
>>>>>>>>>>>>> The github issue you're referring to was closed 18 months ago. The warning (it's not an error) is only supposed to appear if you're importing into a recent hwloc an XML that was exported from an old hwloc. I don't see how that could happen when using Open MPI, since the hwloc versions on both sides are the same.
>>>>>>>>>>>>> Make sure you're not confusing it with another error described here:
>>>>>>>>>>>>> http://www.open-mpi.org/projects/hwloc/doc/v1.10.0/a00028.php#faq_os_error
>>>>>>>>>>>>> Otherwise please report the exact Open MPI and/or hwloc versions as well as the XML lstopo output on the nodes that raise the warning (lstopo foo.xml). Send these to hwloc mailing lists such as hwloc-us...@open-mpi.org or hwloc-de...@open-mpi.org
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Brice
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 07/12/2014 13:29, Pim Schellart wrote:
>>>>>>>>>>>>>> Dear OpenMPI developers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> this might be a bit off topic, but when using the SLURM scheduler (with cpuset support) on Ubuntu 14.04 (openmpi 1.6), hwloc sometimes gives an "out-of-order topology discovery" error. According to issue #103 on github (https://github.com/open-mpi/hwloc/issues/103) this error was discussed before and it seemed possible to sort it out in "insert_object_by_parent"; is this still being considered? If not, what (top level) hwloc API call should we look for in the SLURM sources to start debugging? Any help will be most welcome.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Pim Schellart
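For collecting what Brice asked for above (the exact Open MPI and hwloc versions plus the lstopo XML from the nodes that raise the warning), something along these lines could be attached to the report; the output file name is just a placeholder:

    #!/bin/bash
    #SBATCH -N 2 -n 4
    # One task per node: record the hwloc and Open MPI versions and dump the
    # topology each node sees (inside its Slurm cpuset) to a per-node XML file.
    srun --ntasks-per-node=1 --label sh -c '
      lstopo --version
      mpirun --version 2>&1 | head -1
      lstopo "topo-$(hostname).xml"
    '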