[OMPI devel] Empty-initializer problems w/ PGI
I am pretty sure we fixed something very similar a couple of months back.
The following is from "make check" with a recent (Tuesday night?) master tarball.

  CC       unpack_ooo.o
PGC-S-0155-Empty initializer not supported (/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-pgi-13.10/openmpi-dev-2371-gea935df/test/datatype/unpack_ooo.c: 34)
PGC-S-0155-Empty initializer not supported (/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-pgi-13.10/openmpi-dev-2371-gea935df/test/datatype/unpack_ooo.c: 39)
PGC/x86-64 Linux 13.10-0: compilation completed with severe errors

Running "make -k check" shows no other errors.

-Paul

--
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
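For readers following along: an empty brace initializer (`= {}`) was a GNU extension before C23, whereas `= {0}` is accepted by every standard C compiler. The sketch below assumes the two flagged lines in unpack_ooo.c are empty-brace initializers (the source itself is not quoted in this message); the struct and function names are hypothetical and only illustrate the portable spelling.

```c
#include <stddef.h>

/* Hypothetical aggregate standing in for whatever unpack_ooo.c declares. */
struct demo_segment {
    size_t offset;
    size_t length;
    int    packed;
};

/* Portable: "{0}" initializes the first member explicitly and all remaining
 * members implicitly to zero. The non-portable spelling would be
 *     struct demo_segment s = {};
 * which is a GNU extension prior to C23. */
struct demo_segment demo_make_empty_segment(void)
{
    struct demo_segment s = {0};
    return s;
}

/* Same idea for an array: every element starts at zero. */
size_t demo_sum_zeroed_counts(void)
{
    size_t counts[8] = {0};
    size_t total = 0;
    for (size_t i = 0; i < sizeof(counts) / sizeof(counts[0]); i++) {
        total += counts[i];
    }
    return total;
}
```

Either form produces the same zeroed object on compilers that accept both, so switching to `{0}` should not change behavior.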
Re: [OMPI devel] fortran calling MPI_* instead of PMPI_*
I am lost ...

from ompi/mpi/fortran/mpif-h/profile/palltoall_f.c

    void ompi_alltoall_f(char *sendbuf, MPI_Fint *sendcount, MPI_Fint *sendtype,
                         char *recvbuf, MPI_Fint *recvcount, MPI_Fint *recvtype,
                         MPI_Fint *comm, MPI_Fint *ierr)
    {
        [...]
        c_ierr = MPI_Alltoall(sendbuf, OMPI_FINT_2_INT(*sendcount), c_sendtype,
                              recvbuf, OMPI_FINT_2_INT(*recvcount), c_recvtype,
                              c_comm);
        [...]
    }

    $ nm ompi/mpi/fortran/mpif-h/profile/.libs/palltoall_f.o | grep MPI_Alltoall
    U MPI_Alltoall
    W MPI_Alltoall_f
    W MPI_Alltoall_f08
    W PMPI_Alltoall_f
    W PMPI_Alltoall_f08

ompi_alltoall_f() calls MPI_Alltoall().

The "natural" way of writing a tool is to write mpi_alltoall_ (that calls pmpi_alltoall_) *and* MPI_Alltoall (that calls PMPI_Alltoall). Since ompi_alltoall_f invokes MPI_Alltoall (and not PMPI_Alltoall), the tool is invoked twice, by both the Fortran and the C wrapper.

My initial question was "why does ompi_alltoall_f invoke MPI_Alltoall instead of PMPI_Alltoall?"

/* since we share the same source code when building with or without mpi profiling, that means we would need to #define MPI_Alltoall PMPI_Alltoall when ompi is configure'd with --enable-mpi-profile */

Of course, if the tool does not define its own MPI_Alltoall subroutine, then PMPI_Alltoall is invoked directly, since MPI_Alltoall is a weak symbol pointing to PMPI_Alltoall.

Cheers,

Gilles

On 8/26/2015 9:39 AM, Jeff Squyres (jsquyres) wrote:
> On Aug 25, 2015, at 11:03 AM, George Bosilca wrote:
> > This seems to be the case only with the TKR interface. All the others are either calling the OMPI version directly (mpif-h), or are calling some other internal (or weak symbol function).
>
> Yes, those might need to be updated. Not it! (let's let the TKR interface die...)
>
> You're right about the mpif-h interface, though -- they call the PMPI versions of the functions (through weak symbols).
>
> However, our use of weak symbols might be confusing to the tool -- is it somehow intercepting our call from ompi_send_f() to PMPI_Send(), for example?
>
> You might want to step through with a debugger to see what's happening, because the debugger should show the name of the symbol that is invoked in the call stack, even though the pointer in the source code may show you in "MPI_Send()" (remember: we compile the C code for our functions potentially with #defines that turn MPI_Send into PMPI_Send, etc.).
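The double interception described above can be modeled without any linker tricks. The sketch below uses ordinary functions in place of real weak symbols, and every name in it is a hypothetical stand-in (for example, `demo_tool_MPI_Alltoall` plays the role of a tool-provided MPI_Alltoall). It only counts how often the tool's wrappers run for one Fortran-side call, depending on whether the library's Fortran binding goes through the MPI_ or the PMPI_ entry point.

```c
/* Counts how many times the tool's wrappers ran for the current call. */
static int demo_tool_hits;

/* Stand-in for PMPI_Alltoall: the real implementation. */
static int demo_PMPI_Alltoall(int value)
{
    return value;
}

/* Stand-in for the tool's C wrapper (MPI_Alltoall calling PMPI_Alltoall). */
static int demo_tool_MPI_Alltoall(int value)
{
    demo_tool_hits++;
    return demo_PMPI_Alltoall(value);
}

/* Library Fortran binding as reported: it calls MPI_Alltoall, which resolves
 * to the tool's wrapper when the tool defines one. */
static int demo_binding_via_mpi(int value)
{
    return demo_tool_MPI_Alltoall(value);
}

/* Library Fortran binding as proposed: it calls PMPI_Alltoall directly. */
static int demo_binding_via_pmpi(int value)
{
    return demo_PMPI_Alltoall(value);
}

/* Stand-in for the tool's Fortran wrapper (mpi_alltoall_ calling
 * pmpi_alltoall_), parameterized on which binding the library provides. */
static int demo_tool_fortran_wrapper(int value, int (*binding)(int))
{
    demo_tool_hits++;
    return binding(value);
}

/* Returns the number of tool invocations for a single Fortran-side call. */
int demo_count_tool_hits(int binding_uses_pmpi)
{
    demo_tool_hits = 0;
    demo_tool_fortran_wrapper(42, binding_uses_pmpi ? demo_binding_via_pmpi
                                                    : demo_binding_via_mpi);
    return demo_tool_hits;
}
```

In this model the count is two when the binding goes through the MPI_ name and one when it goes through the PMPI_ name, which is the behavior difference the question is about.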
[OMPI devel] OpenMPI 1.8 Bug Report
Dear OpenMPI developers,

I noticed a bug in the definition of the 3 MPI-3 RMA functions MPI_Compare_and_swap, MPI_Fetch_and_op and MPI_Raccumulate.

According to the MPI standard, the origin_addr and compare_addr parameters of these functions have a const attribute, which is missing in OpenMPI's mpi.h (OpenMPI 1.8.x and 1.10.0).

Regards,

Michael

--
Michael Knobloch
Institute for Advanced Simulation (IAS)
Jülich Supercomputing Centre (JSC)
Phone: +49 2461 61-3546
Fax: +49 2461 61-6656
E-Mail: m.knobl...@fz-juelich.de
Internet: http://www.fz-juelich.de/jsc
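To illustrate why the missing qualifier matters to callers, here is a small self-contained sketch. It does not use mpi.h; `demo_fetch_and_add` is a hypothetical function whose buffer arguments are merely shaped like those of MPI_Fetch_and_op (input-only origin, output result). With a `const void *` parameter, a caller may pass the address of read-only data without a cast.

```c
#include <string.h>

/* Hypothetical stand-in: origin_addr is only read, so it is const-qualified;
 * result_addr receives the previous value of *target. */
int demo_fetch_and_add(const void *origin_addr, void *result_addr, int *target)
{
    int operand;
    memcpy(&operand, origin_addr, sizeof operand);
    memcpy(result_addr, target, sizeof *target);   /* fetch the old value */
    *target += operand;                            /* then apply the op   */
    return 0;
}

/* Caller holding a read-only operand. If the parameter were declared as
 * plain "void *origin_addr", passing &one would need a cast or would draw
 * a discarded-qualifier diagnostic from the compiler. */
int demo_fetch_from_const(void)
{
    static const int one = 1;
    int counter = 41;
    int old = 0;

    demo_fetch_and_add(&one, &old, &counter);
    return old * 100 + counter;   /* old == 41, counter == 42 */
}
```

Adding `const` to an input parameter is source-compatible for existing callers, since a pointer to non-const data converts implicitly to a pointer to const.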
Re: [OMPI devel] OpenMPI 1.8 Bug Report
Oh, I also noticed it yesterday and was about to report it.

And one more, the base parameter of MPI_Win_detach.

Regards,
Takahiro Kawashima
Re: [OMPI devel] OpenMPI 1.8 Bug Report
IIRC, the MPI_Win_detach discrepancy with the standard is intentional in Fortran 2008; there is a comment in the source code that explains this.
Re: [OMPI devel] OpenMPI 1.8 Bug Report
Gilles,

> there is a comment in the source code to explain this.

Could you point out where the comment is?

I can find a comment about MPI_Buffer_detach in the ompi/mpi/fortran/mpif-h/buffer_detach_f.c and ompi/mpi/fortran/use-mpi-f08/buffer_detach.c files.
But the situation is different from MPI_Buffer_detach.

The declaration of MPI_Win_detach has not changed since the one-sided code was merged into the trunk at commit 49d938de (svn r30816).

Regards,
Takahiro Kawashima
Re: [OMPI devel] OpenMPI 1.8 Bug Report
Kawashima-san,

You are right, I mixed up MPI_Buffer_detach and MPI_Win_detach. Sorry for the confusion.

Cheers,

Gilles
[OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
Hi,

For some reason that is currently still beyond me, I can't bind to INADDR_ANY for more than 74 ports on a Cray compute node without getting EADDRINUSE.
This impacts my use of the oob_tcp_listener.c:create_listen() code on that machine (by means of orte-submit).

I've implemented a proof of concept that gets the address from a hardcoded interface and uses that for the binding, and then everything is hunky dory.

Although I'm also interested in the root cause, that is likely outside of my control, so I wonder whether the hack can be turned into something more appropriate.

So some questions:

- I can't help but think that somewhere in the codebase there is probably some portable code to extract addresses from an interface.
- Is there already some MCA parameter that can be (re)used to specify the interface to use for this kind of purpose?
- (And why is the "client" listening on a socket in the first place?)

Thanks!

Mark
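For readers who want to see what such a proof of concept might look like, here is a minimal sketch. It is not the patch described above and not code from the Open MPI tree; it assumes a platform that provides getifaddrs() (Linux, the BSDs, macOS), handles IPv4 only, and all `demo_` names are hypothetical. It looks up the first IPv4 address of a named interface and binds a listener to that address instead of INADDR_ANY.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the first IPv4 address configured on ifname into *out.
 * Returns 0 on success, -1 if the interface has no IPv4 address. */
int demo_ipv4_of_interface(const char *ifname, struct in_addr *out)
{
    struct ifaddrs *list = NULL;
    int rc = -1;

    if (getifaddrs(&list) != 0) {
        return -1;
    }
    for (struct ifaddrs *ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET) {
            continue;
        }
        if (strcmp(ifa->ifa_name, ifname) != 0) {
            continue;
        }
        *out = ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr;
        rc = 0;
        break;
    }
    freeifaddrs(list);
    return rc;
}

/* Create a TCP listener bound to one interface's address and an ephemeral
 * port. Returns the socket descriptor, or -1 on any failure. */
int demo_listen_on_interface(const char *ifname)
{
    struct sockaddr_in sin;
    int fd;

    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_port = htons(0);            /* let the kernel pick a port */
    if (demo_ipv4_of_interface(ifname, &sin.sin_addr) != 0) {
        return -1;
    }
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (bind(fd, (struct sockaddr *)&sin, sizeof sin) != 0 ||
        listen(fd, SOMAXCONN) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would pass the interface name (for instance one taken from a configuration parameter) and close the returned descriptor when done.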
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
Hi Mark,

Just to be clear: you are saying that orte-submit is creating a listener? If so, I can correct that as it doesn’t need to do so.
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
Hi Mark,

I think you're hitting an RSIP port limit. If you bind to ipogif0 then you should have much better luck, unless you're trying to have Open MPI span outside the Cray HPN.

You can use the oob mca parameter:

oob-tcp-if-include ipogif0

You may want to put that in your .openmpi/mca-params.conf file if you have one installed, but don't forget: if your home directory is accessible from different machines, some of which may not be Cray XE/XC, then you probably don't want to do that. That messed me up with runs on the carver system at NERSC for a while.

Howard
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
> On 27 Aug 2015, at 17:44 , Ralph Castain wrote:
> Just to be clear: you are saying that orte-submit is creating a listener? If
> so, I can correct that as it doesn’t need to do so.

Yes, I think it does indeed. At least it's hitting that code path that looks suspiciously like a listener! :)
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
Okay, let me take a look
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
Hi Howard,

> On 27 Aug 2015, at 17:48 , Howard Pritchard wrote:
> I think you're hitting an RSIP port limit.

That sounds like it indeed.

> If you bind to ipogif0 then you should have much better luck, unless
> you're trying to have open mpi span outside the cray HPN.

Now you've got me wondering. I actually played with both oob-tcp-if-include and -exclude, but possibly not in the right context, I realize now.
Let me undo my changes and try with only the configuration changes.

Thanks!

Mark
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
Hi Howard, > On 27 Aug 2015, at 17:59 , Mark Santcroos wrote: >> If you bind to ipogif0 then you should have much better luck, unless >> you're trying to have open mpi span outside the cray HPN. > > > Now you get me wondering. I actually played with both oob-tcp-if-include and > -exclude , but possibly not in the right context I realize now. > Let me undo my changes and try with only the configuration changes. That doesn't seem to work. But by looking at the code at https://github.com/open-mpi/ompi/blob/master/orte/mca/oob/tcp/oob_tcp_listener.c#L279 I also think that it still binds to all interfaces/addresses there regardless of the interfaces one configures with oob_tcp_if_include. Gr, Mark
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
Hmmm…well, that isn’t right either :-)

I’ll fix this stuff tonight
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
> On 27 Aug 2015, at 17:58 , Ralph Castain wrote: > Okay, let me take a look Thanks Ralph, please let me know if I can be of any assistance!
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
I committed the change that prevents orte-submit from binding a listener - seems to work fine for me, so please let me know how it works for you.

The other issue - binding to all interfaces instead of only the ones specified - is a little more troublesome. If we don’t bind to all interfaces, then we have to consume a socket for each interface we are going to bind to - which means we trade-off binding one socket to all interfaces for consuming one socket per interface we are using.

It seems to me that binding to all interfaces doesn’t hurt us given that we will only attempt to connect on the specified interfaces, whereas consuming even more file descriptors can be a problem, but maybe I’m not seeing something.

Anyone have an opinion here?

Ralph
Re: [OMPI devel] bind to interface / address oob_tcp_listener.c:create_listen()
Ralph,

what about:
- if only one interface is specified (e.g. *_if_include eth0), then bind to that interface
- otherwise, bind to all interfaces

Mark,

would that solve your issue?

Cheers,

Gilles
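The rule proposed above is small enough to sketch as a helper. This is only an illustration of the selection logic, not code from oob_tcp_listener.c; the function name is hypothetical, it is IPv4 only, and it assumes the include list has already been resolved to addresses in network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>

/* Decide which address a single listening socket should bind to.
 * included: IPv4 addresses (network byte order) of the interfaces named in
 *           the include list; may be NULL when count is 0.
 * Returns the one included address when exactly one was given, otherwise
 * INADDR_ANY, so that only one socket is consumed in either case. */
in_addr_t demo_choose_bind_addr(const in_addr_t *included, size_t count)
{
    if (included != NULL && count == 1) {
        return included[0];
    }
    return htonl(INADDR_ANY);
}
```

The design keeps the file-descriptor cost at one socket regardless of how many interfaces are listed, which addresses the concern about consuming one socket per interface, while still letting a single-interface configuration avoid binding to INADDR_ANY.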
Re: [OMPI devel] OpenMPI 1.8 Bug Report
Thanks Michael and Kawashima-san,

I made PR #838 to fix this; it is currently available at https://github.com/open-mpi/ompi/pull/838

Cheers,

Gilles