Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid
On 7/23/07, Jeff Squyres wrote:
> Does anyone have any opinions on this? If not, I'll go implement
> option #1.

Sorry, Jeff... just reading this. I think your option #1 is the better one. However, I want to warn you about two issues:

* On my Linux FC6 box, malloc(0) returns a different pointer for each call. In fact, I believe this is a requirement for malloc. In the case of MPI_Alloc_mem this could be relaxed, but it could cause problems (suppose some code builds a hash table using pointers as keys, or even a std::map). Just a warning.

* malloc(0) returns an aligned pointer. Here I really think MPI_Alloc_mem should return a pointer with the same alignment a malloc(1) would return, so I am not sure your global char[1] is OK.

As a reference, I can mention the approach used in the Python memory allocator to ensure portability across platforms: it always allocates at least 1 byte. This is not so important in an environment like Python, but perhaps this approach is wrong for an MPI implementation.

Regards,

-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
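For readers following along, the behavior Lisandro describes is easy to observe with a minimal stand-alone C program (output is platform-dependent; on glibc each malloc(0) typically yields a distinct, aligned, freeable pointer, while other platforms may return NULL):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* C says malloc(0) returns either NULL or a unique pointer that
         * can be passed to free(); which one is implementation-defined. */
        void *a = malloc(0);
        void *b = malloc(0);

        printf("malloc(0) -> %p, %p (%s)\n", a, b,
               (a != NULL && a != b) ? "distinct non-NULL" : "NULL or shared");

        /* Both results are legal arguments to free(), including NULL. */
        free(a);
        free(b);
        return 0;
    }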
Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid
On Tue, Jul 24, 2007 at 11:20:11AM -0300, Lisandro Dalcin wrote:
> On 7/23/07, Jeff Squyres wrote:
> > Does anyone have any opinions on this? If not, I'll go implement
> > option #1.
>
> Sorry, Jeff... just reading this. I think your option #1 is the
> better one. However, I want to warn you about two issues:
>
> * On my Linux FC6 box, malloc(0) returns a different pointer for each
> call. In fact, I believe this is a requirement for malloc. In the case

man malloc tells me this: "If size was equal to 0, either NULL or a pointer suitable to be passed to free() is returned". So maybe we should just return NULL and be done with it?

--
			Gleb.
Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid
I agree with Gleb. Calling malloc with size 0 is just bad practice. As the returned memory is not supposed to be suitable for any use [a fact that we cannot verify, as there is at least one byte], why return anything other than NULL? Returning NULL will make the application segfault, which is a good hint for the user that something wasn't right somewhere. Moreover, returning some special memory would force us to check in most MPI functions that this particular pointer is not used [similar to MPI_BOTTOM], which is a huge amount of code.

  george.

On Jul 24, 2007, at 10:28 AM, Gleb Natapov wrote:

> On Tue, Jul 24, 2007 at 11:20:11AM -0300, Lisandro Dalcin wrote:
> > Sorry, Jeff... just reading this. I think your option #1 is the
> > better one. However, I want to warn you about two issues:
> >
> > * On my Linux FC6 box, malloc(0) returns a different pointer for each
> > call. In fact, I believe this is a requirement for malloc. In the case
>
> man malloc tells me this: "If size was equal to 0, either NULL or a
> pointer suitable to be passed to free() is returned". So maybe we
> should just return NULL and be done with it?
>
> --
> 			Gleb.
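To illustrate George's "huge amount of code" point: if a special 0-byte sentinel existed instead of NULL, every buffer-taking MPI entry point would need a guard along these lines (a hypothetical sketch; zero_sized_marker and check_buffer are invented names, not Open MPI code):

    #include <mpi.h>

    /* Hypothetical: a distinguished address handed out for 0-byte
     * allocations, which must never be used as a real buffer
     * (compare the existing MPI_BOTTOM checks). */
    static const char zero_sized_marker[1];
    #define ZERO_SIZED_SENTINEL ((const void *) zero_sized_marker)

    static int check_buffer(const void *buf, int count)
    {
        /* Reject use of the 0-byte allocation as an actual data buffer. */
        if (count > 0 && ZERO_SIZED_SENTINEL == buf) {
            return MPI_ERR_BUFFER;
        }
        return MPI_SUCCESS;
    }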
Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid
On Jul 24, 2007, at 8:28 AM, Gleb Natapov wrote:

> man malloc tells me this: "If size was equal to 0, either NULL or a
> pointer suitable to be passed to free() is returned". So maybe we
> should just return NULL and be done with it?

Which is also what POSIX says:

  http://www.opengroup.org/onlinepubs/009695399/functions/malloc.html

I vote with Gleb -- return NULL, don't set errno, and be done with it. The way I read the advice to implementors, this would be a legal response to a 0-byte request.

Brian
Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid
On Tue, Jul 24, 2007 at 08:41:27AM -0600, Brian Barrett wrote:
> > man malloc tells me this: "If size was equal to 0, either NULL or a
> > pointer suitable to be passed to free() is returned". So maybe we
> > should just return NULL and be done with it?
>
> Which is also what POSIX says:
>
>    http://www.opengroup.org/onlinepubs/009695399/functions/malloc.html
>
> I vote with Gleb -- return NULL, don't set errno, and be done with it.

I'd like to second that. Just in case this is a poll ;)

--
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de
Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid
On Jul 24, 2007, at 11:01 AM, Adrian Knoth wrote:

> > Which is also what POSIX says:
> >
> >    http://www.opengroup.org/onlinepubs/009695399/functions/malloc.html
> >
> > I vote with Gleb -- return NULL, don't set errno, and be done with it.
>
> I'd like to second that. Just in case this is a poll ;)

Sounds like a pretty strong argument to me. This is easy to implement; I'll do it.

Per Lisandro's comments: I think that if you need a random/valid value for an STL map (or similar), malloc(0) is not a good idea to use as a key.

-- 
Jeff Squyres
Cisco Systems
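A minimal sketch of the semantics the thread converged on, written as a stand-alone helper rather than the actual ompi/mpi/c/alloc_mem.c (which also handles MPI_Info, error handlers, and the mpool framework):

    #include <stdlib.h>

    /* Hypothetical sketch of the agreed-upon behavior, with plain
     * integers standing in for the MPI error classes. */
    static int alloc_mem_sketch(long size, void **baseptr)
    {
        if (NULL == baseptr || size < 0) {
            return -1;               /* stand-in for MPI_ERR_ARG */
        }
        if (0 == size) {
            *baseptr = NULL;         /* 0-byte request: return NULL, succeed */
            return 0;                /* stand-in for MPI_SUCCESS */
        }
        *baseptr = malloc((size_t) size);
        return (NULL == *baseptr) ? -2 /* MPI_ERR_NO_MEM */ : 0;
    }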
[OMPI devel] Hostfiles - yet again
Yo all

As you know, I am working on revamping the hostfile functionality to make it work better with managed environments (at the moment, the two are mutually exclusive). The issue we need to review is how we want the interaction to work, both for the initial launch and for comm_spawn. In talking with Jeff, we boiled it down to two options that I have flow-charted (see attached):

Option 1: in this mode, we read any allocated nodes provided by a resource manager (e.g., SLURM). These nodes establish a base pool of nodes that can be used by both the initial launch and any dynamic comm_spawn requests. The hostfile and any -host info is then used to select nodes from within that pool for use with the specific launch. The initial launch would use the -hostfile or -host command line option to provide that info; comm_spawn would use the MPI_Info fields to provide similar info.

This mode has the advantage of allowing a user to obtain a large allocation, then designate hosts within the pool for use by an initial application, and separately designate (via another hostfile or -host spec) another set of those hosts from the pool to support a comm_spawn'd child job.

If no resource-managed nodes are found, then the hostfile and -host options would provide the list of hosts to be used. Again, comm_spawn'd jobs would be able to specify their own hostfile and -host nodes.

The negative to this option is complexity: in the absence of a managed allocation, I either have to deal with hostfile/dash-host allocations in the RAS and then again in RMAPS, or I have "allocation-like" functionality happening in RMAPS.

Option 2: in this mode, we read any allocated nodes provided by a resource manager, and then filter those using the command line hostfile and -host options to establish our base pool. Any spawn commands (both the initial one and comm_spawn'd child jobs) would utilize this filtered pool of nodes. Thus, comm_spawn is restricted to using hosts from that initial pool.

We could possibly extend this option by only using the hostfile in our initial filter. In other words, let the hostfile downselect the resource manager's allocation for the initial launch. Any -host options on the command line would apply only to the hosts used to launch the initial application. Any comm_spawn would use the hostfile-filtered pool of hosts.

The advantage here is simplicity. The disadvantage lies in flexibility for supporting dynamic operations.

The major difference between these options really only impacts the initial pool of hosts to be used for launches, both the initial one and any subsequent comm_spawns. Barring any commentary, I will implement option 1, as this provides the maximum flexibility.

Any thoughts? Other options we should consider?

Thanks
Ralph

[Attachment: hostfile.pdf]
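For concreteness, here is how option 1 might look from the user's side (node names are hypothetical; the hostfile "slots=" syntax and the --hostfile/-host options are standard Open MPI usage):

    # Suppose SLURM allocated node01 through node08 (the base pool).
    # A hostfile then selects a subset of that pool for the initial launch:
    #
    #   $ cat myhosts
    #   node01 slots=4
    #   node02 slots=4
    #
    #   $ mpirun --hostfile myhosts -np 8 ./my_app
    #
    # Under option 1, a later MPI_Comm_spawn could still target, say,
    # node05 (inside the SLURM pool but not in myhosts) via MPI_Info.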
Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid
> Per Lisandro's comments: I think that if you need a random/valid
> value for an STL map (or similar), malloc(0) is not a good idea to
> use as a key.

OK, regarding the comments in this thread, you are completely right. I am fine with returning NULL.

BTW, shouldn't this issue be commented on in the standard? Perhaps in the errata document? I think there is no strong need to make it implementation-dependent. MPI-2 could mandate/suggest that if size=0 the returned pointer is NULL, but then MPI_Free_mem with a NULL pointer should succeed.

Now a question: what about Fortran?

-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
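Lisandro's point about MPI_Free_mem is worth making concrete: under the NULL-on-zero convention, this common caller pattern only stays correct if MPI_Free_mem(NULL) is a no-op, just as free(NULL) is in C (a sketch, assuming the semantics proposed in this thread):

    #include <stddef.h>
    #include <mpi.h>

    /* Sketch: a caller that sizes its buffer from runtime data.
     * With NULL-on-zero, n == 0 yields buf == NULL, so the paired
     * MPI_Free_mem(NULL) must succeed for this pattern to work. */
    static void roundtrip(MPI_Aint n)
    {
        void *buf = NULL;
        MPI_Alloc_mem(n, MPI_INFO_NULL, &buf);  /* may legally set buf = NULL */
        /* ... use buf for n bytes (nothing to do when n == 0) ... */
        MPI_Free_mem(buf);                      /* must succeed even if NULL */
    }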
Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid
On Jul 24, 2007, at 12:02 PM, Lisandro Dalcin wrote:

> > Per Lisandro's comments: I think that if you need a random/valid
> > value for an STL map (or similar), malloc(0) is not a good idea to
> > use as a key.
>
> OK, regarding the comments in this thread, you are completely right.
> I am fine with returning NULL.
>
> BTW, shouldn't this issue be commented on in the standard? Perhaps in
> the errata document? I think there is no strong need to make it
> implementation-dependent. MPI-2 could mandate/suggest that if size=0
> the returned pointer is NULL, but then MPI_Free_mem with a NULL
> pointer should succeed.

Good point. Do you want to bring it up on the mpi-21 list?

> Now a question: what about Fortran?

Hmm. Good question. I am not a Fortran expert, but my $0.02 would be that if we can return NULL in C, and since there is no equivalent to NULL in Fortran, then the result should be disallowed -- if you do it, the results are undefined.

-- 
Jeff Squyres
Cisco Systems
[OMPI devel] Hostfile - oh yes, again!
Yo all

More thoughts on hostfile usage - I'm sure you are all sitting on pins and needles awaiting more discussion on this exciting topic!

I'm continuing to try to work through the use-cases here so we can get this fixed. It continues to be an issue for users on the list, as well as for our own developers. The problem is that we use "hostfile" and "-host" for dual purposes, which leaves an opening for confusion over what should happen. Let's consider two major use-cases.

Use-case 1: hostfile and/or -host, no managed environment

I believe there is an expected and consistent behavior for the case where we are not in a managed environment, but the user specifies a hostfile and/or -host. In these cases, we use the hostfile (if provided) to completely describe the available hosts, and any -host is used to specify which hosts in that hostfile are to be used for the initial application.

At issue, however, is what happens with comm_spawn: is the child job restricted to the -host list, or is it free to use any of the hosts in the hostfile? I have heard it both ways from users, so I believe we are free to decide here. Does anyone have an opinion? Do we need an option to indicate that all child jobs are restricted to the specified -host list?

Use-case 2: managed environment, hostfile and/or -host provided

You will find a lengthy discourse in ticket #1018 about how to deal with this use-case - it is messy, with multiple definitions running around. I believe we have hit upon a reasonable path forward in that discussion regarding how to parse a node list from this use-case. However, it left open the question of who has access to the resulting node list. As I tried to indicate in my prior note, the question again revolves around comm_spawn: does the child job have access to all nodes in the original allocation; those nodes in the original allocation that are also listed in a hostfile; those nodes in the original allocation that are also in the -host list; or...?

Obviously, as someone primarily focused on the RTE, I couldn't possibly care less which of these modes you select. However, I *do* need to know how you want Open MPI to operate so I can build the system to meet those requirements.

I hope this - in combination with the prior note - will help you to understand the question. Any direction would be appreciated, as we are kind of stuck until I know how you want the system to behave.

Oh yeah - in case you were wondering, prior MPIs like LA-MPI and LAM-MPI avoided these issues (e.g., by ignoring hostfiles in use-case 2). So we are kind of charting new territory here - I think our users will be fine either way if we just tell them "this is how it works".

Thanks
Ralph
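To make the comm_spawn side of the question concrete, this is the kind of call whose behavior the policy has to define (a sketch; "node03" and child_app are hypothetical, while "host" is the standard MPI_Comm_spawn info key):

    #include <mpi.h>

    /* Sketch: a parent spawning children with an explicit host request.
     * The open question in this thread is which pool "node03" is checked
     * against: the full RM allocation, the hostfile subset, or the -host
     * list used for the initial launch. */
    static void spawn_on_host(void)
    {
        MPI_Comm children;
        MPI_Info info;

        MPI_Info_create(&info);
        MPI_Info_set(info, "host", "node03");   /* standard spawn info key */

        MPI_Comm_spawn("./child_app", MPI_ARGV_NULL, 2, info,
                       0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);

        MPI_Info_free(&info);
    }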
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#433142: openmpi: FTBFS on GNU/kFreeBSD
On Sat, Jul 14, 2007 at 03:55:12PM -0500, Dirk Eddelbuettel wrote:
> | the current version fails to build on GNU/kFreeBSD.
> |
> | It needs small fixups for the munmap hackery and stacktrace.
> | It also needs to exclude Linux-specific build-depends.
> | Please find attached a patch with that.
>
> Thanks for that patch.
>
> | It would be nice if you can ask upstream
> | to include the changes to opal/util/stacktrace.c and
> | opal/mca/memory/ptmalloc2/opal_ptmalloc2_munmap.c .

I've neither seen a ticket nor any discussion within the last few days. Did you get any response?

AFAIK, kFreeBSD isn't a major target for OMPI, but if these patches don't break anything, I don't mind including them. I'll give them a run inside our virtual testing environment, but I'd feel better with additional feedback from MTT.

HTH

PS: https://svn.open-mpi.org/trac/ompi/ticket/1105

--
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de