Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Lisandro Dalcin

On 7/23/07, Jeff Squyres  wrote:

Does anyone have any opinions on this?  If not, I'll go implement
option #1.


Sorry, Jeff... just reading this. I think your option #1 is the
better one. However, I want to warn you about two issues:

* On my Linux FC6 box, malloc(0) returns a different pointer for each
call. In fact, I believe this is a requirement for malloc; in the case
of MPI_Alloc_mem this could be relaxed, but it could cause problems
(suppose some code builds a hash table using pointers as keys, or
even a std::map). Just a warning.

* malloc(0) returns an aligned pointer. Here I really think
MPI_Alloc_mem should return a pointer with the same alignment a
malloc(1) would return, so I am not sure your global char[1] is OK.

As a reference, I can mention the approach used in the Python memory
allocator to ensure portability across platforms: it always allocates
at least 1 byte. This is not so important in an environment like Python,
but perhaps this approach is wrong for an MPI implementation.
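For reference, both behaviors can be checked with a small standalone
program (a sketch; the helper names are mine, and the "distinct
pointers" result is what glibc happens to do, not a guarantee of the C
standard, which also permits NULL):

```c
#include <stdlib.h>
#include <stdint.h>

/* Returns 1 if two malloc(0) calls yield distinct non-NULL pointers
 * (typical glibc behavior); 0 if either call returned NULL, which is
 * also standard-conforming. */
int malloc0_distinct(void) {
    void *a = malloc(0);
    void *b = malloc(0);
    int distinct = (a != NULL && b != NULL && a != b);
    free(a);   /* whatever malloc(0) returned must be safe to free() */
    free(b);
    return distinct;
}

/* Returns the alignment (largest power of two dividing the address)
 * of a malloc(1) result, for comparison with a malloc(0) result. */
size_t malloc1_alignment(void) {
    void *p = malloc(1);
    uintptr_t addr = (uintptr_t)p;
    size_t align = 1;
    while (addr % (align * 2) == 0)
        align *= 2;
    free(p);
    return align;
}
```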


Regards,

--
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594



Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Gleb Natapov
On Tue, Jul 24, 2007 at 11:20:11AM -0300, Lisandro Dalcin wrote:
> On 7/23/07, Jeff Squyres  wrote:
> > Does anyone have any opinions on this?  If not, I'll go implement
> > option #1.
> 
> Sorry, Jeff... just reading this. I think your option #1 is the
> better one. However, I want to warn you about two issues:
> 
> * On my Linux FC6 box, malloc(0) returns a different pointer for each
> call. In fact, I believe this is a requirement for malloc; in the case
man malloc tells me this:
"If size was equal to 0, either NULL or a pointer suitable to be passed to
free() is returned". So maybe we should just return NULL and be done with it?
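As a sketch (the function name is hypothetical, not the actual Open MPI
internal, and the real code would use the registered-memory allocator
rather than plain malloc), the suggestion amounts to:

```c
#include <stdlib.h>

/* Hypothetical sketch of the proposed MPI_Alloc_mem policy:
 * a zero-byte request yields NULL, which free() (and a matching
 * MPI_Free_mem) can accept harmlessly. */
void *alloc_mem_sketch(size_t size) {
    if (size == 0)
        return NULL;       /* the option discussed in this thread */
    return malloc(size);
}
```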

--
Gleb.


Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread George Bosilca
I agree with Gleb. Calling malloc with size 0 is just bad practice.
As the returned memory is not supposed to be suitable for any use
[a fact that we cannot verify, as there is at least one byte], why
return anything other than NULL? Returning NULL will make the
application segfault, which is a good hint for the user that
something wasn't right somewhere. Moreover, returning some special
memory would force us to check in most MPI functions that this
particular pointer is not used [similar to MPI_BOTTOM], which is a
huge amount of code.


  george.

On Jul 24, 2007, at 10:28 AM, Gleb Natapov wrote:


> On Tue, Jul 24, 2007 at 11:20:11AM -0300, Lisandro Dalcin wrote:
> > On 7/23/07, Jeff Squyres  wrote:
> > > Does anyone have any opinions on this?  If not, I'll go implement
> > > option #1.
> >
> > Sorry, Jeff... just reading this. I think your option #1 is the
> > better one. However, I want to warn you about two issues:
> >
> > * On my Linux FC6 box, malloc(0) returns a different pointer for each
> > call. In fact, I believe this is a requirement for malloc; in the case
>
> man malloc tells me this:
> "If size was equal to 0, either NULL or a pointer suitable to be passed to
> free() is returned". So maybe we should just return NULL and be done with it?


--
Gleb.
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Brian Barrett

On Jul 24, 2007, at 8:28 AM, Gleb Natapov wrote:


> On Tue, Jul 24, 2007 at 11:20:11AM -0300, Lisandro Dalcin wrote:
> > On 7/23/07, Jeff Squyres  wrote:
> > > Does anyone have any opinions on this?  If not, I'll go implement
> > > option #1.
> >
> > Sorry, Jeff... just reading this. I think your option #1 is the
> > better one. However, I want to warn you about two issues:
> >
> > * On my Linux FC6 box, malloc(0) returns a different pointer for each
> > call. In fact, I believe this is a requirement for malloc; in the case
>
> man malloc tells me this:
> "If size was equal to 0, either NULL or a pointer suitable to be passed to
> free() is returned". So maybe we should just return NULL and be done with it?


Which is also what POSIX says:

  http://www.opengroup.org/onlinepubs/009695399/functions/malloc.html

I vote with Gleb -- return NULL, don't set errno, and be done with
it.  The way I read the advice to implementors, this would be a legal
response to a 0-byte request.


Brian


Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Adrian Knoth
On Tue, Jul 24, 2007 at 08:41:27AM -0600, Brian Barrett wrote:

> > man malloc tells me this:
> > "If size was equal to 0, either NULL or a pointer suitable to be passed to
> > free() is returned". So maybe we should just return NULL and be done with it?
> 
> Which is also what POSIX says:
> 
>http://www.opengroup.org/onlinepubs/009695399/functions/malloc.html
> 
> I vote with Gleb -- return NULL, don't set errno, and be done with it.

I'd like to second that, just in case this is a poll ;)


-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de


Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Jeff Squyres

On Jul 24, 2007, at 11:01 AM, Adrian Knoth wrote:


> > Which is also what POSIX says:
> >
> >   http://www.opengroup.org/onlinepubs/009695399/functions/malloc.html
> >
> > I vote with Gleb -- return NULL, don't set errno, and be done with it.
>
> I'd like to second that, just in case this is a poll ;)


Sounds like a pretty strong argument to me.  This is easy to  
implement; I'll do it.


Per Lisandro's comments: I think that if you need a random/valid  
value for an STL map (or similar), malloc(0) is not a good idea to  
use as a key.


--
Jeff Squyres
Cisco Systems



[OMPI devel] Hostfiles - yet again

2007-07-24 Thread Ralph H Castain
Yo all

As you know, I am working on revamping the hostfile functionality to make it
work better with managed environments (at the moment, the two are
exclusive). The issue that we need to review is how we want the interaction
to work, both for the initial launch and for comm_spawn.

In talking with Jeff, we boiled it down to two options that I have
flow-charted (see attached):

Option 1: in this mode, we read any allocated nodes provided by a resource
manager (e.g., SLURM). These nodes establish a base pool of nodes that can
be used by both the initial launch and any dynamic comm_spawn requests. The
hostfile and any -host info is then used to select nodes from within that
pool for use with the specific launch. The initial launch would use the
-hostfile or -host command line option to provide that info - comm_spawn
would use the MPI_Info fields to provide similar info.

This mode has the advantage of allowing a user to obtain a large allocation,
and then designate hosts within the pool for use by an initial application,
and separately designate (via another hostfile or -host spec) another set of
those hosts from the pool to support a comm_spawn'd child job.

If no resource managed nodes are found, then the hostfile and -host options
would provide the list of hosts to be used. Again, comm_spawn'd jobs would
be able to specify their own hostfile and -host nodes.

The negative to this option is complexity - in the absence of a managed
allocation, I either have to deal with hostfile/dash-host allocations in the
RAS and then again in RMAPS, or I have "allocation-like" functionality
happening in RMAPS.


Option 2: in this mode, we read any allocated nodes provided by a resource
manager, and then filter those using the command line hostfile and -host
options to establish our base pool. Any spawn commands (both the initial one
and comm_spawn'd child jobs) would utilize this filtered pool of nodes.
Thus, comm_spawn is restricted to using hosts from that initial pool.

We could possibly extend this option by only using the hostfile in our
initial filter. In other words, let the hostfile downselect the resource
manager's allocation for the initial launch. Any -host options on the
command line would only apply to the hosts used to launch the initial
application. Any comm_spawn would use the hostfile-filtered pool of hosts.

The advantage here is simplicity. The disadvantage lies in flexibility for
supporting dynamic operations.
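The filtering step in option 2 is essentially a set intersection
between the managed allocation and the hostfile list; a toy sketch
(the names are mine, not actual ORTE/RMAPS code):

```c
#include <string.h>

/* Toy sketch of option 2: keep only those allocated nodes that also
 * appear in the hostfile list, preserving allocation order. Writes
 * matching names into out[] and returns the filtered pool size. */
int filter_pool(const char *alloc[], int num_alloc,
                const char *hostfile[], int num_hostfile,
                const char *out[]) {
    int n = 0;
    for (int i = 0; i < num_alloc; i++) {
        for (int j = 0; j < num_hostfile; j++) {
            if (strcmp(alloc[i], hostfile[j]) == 0) {
                out[n++] = alloc[i];
                break;
            }
        }
    }
    return n;
}
```

Under option 2 this filtered pool would then be the only set of hosts
visible to both the initial launch and any comm_spawn'd children.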


The major difference between these options really only impacts the initial
pool of hosts to be used for launches, both the initial one and any
subsequent comm_spawns. Barring any commentary, I will implement option 1 as
this provides the maximum flexibility.

Any thoughts? Other options we should consider?

Thanks
Ralph



hostfile.pdf
Description: Binary data


Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Lisandro Dalcin

> Per Lisandro's comments: I think that if you need a random/valid
> value for an STL map (or similar), malloc(0) is not a good idea to
> use as a key.


OK, given the comments in this thread, you are completely right. I am
fine with returning NULL.

BTW, shouldn't this issue be addressed in the standard? Perhaps in the
errata document? I think there is no strong need to make it
implementation dependent.

MPI-2 could mandate/suggest that if size=0, the returned pointer is
NULL, but then MPI_Free_mem with a NULL pointer should succeed.
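A sketch of that symmetry (the names are hypothetical; note that
free(NULL) is already defined as a no-op in C, so the deallocation
side costs nothing):

```c
#include <stdlib.h>

/* Hypothetical sketch of the matching MPI_Free_mem side: if
 * MPI_Alloc_mem(0) can return NULL, then freeing NULL must be a
 * successful no-op, mirroring free(NULL). Returns 0 for success
 * (an MPI_SUCCESS-like result). */
int free_mem_sketch(void *ptr) {
    free(ptr);   /* free(NULL) is defined to do nothing */
    return 0;
}
```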

Now a question: what about Fortran?

--
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594



Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Jeff Squyres

On Jul 24, 2007, at 12:02 PM, Lisandro Dalcin wrote:


> > Per Lisandro's comments: I think that if you need a random/valid
> > value for an STL map (or similar), malloc(0) is not a good idea to
> > use as a key.
>
> OK, given the comments in this thread, you are completely right. I am
> fine with returning NULL.
>
> BTW, shouldn't this issue be addressed in the standard? Perhaps in the
> errata document? I think there is no strong need to make it
> implementation dependent.
>
> MPI-2 could mandate/suggest that if size=0, the returned pointer is
> NULL, but then MPI_Free_mem with a NULL pointer should succeed.


Good point.  Do you want to bring it up on the mpi-21 list?


> Now a question: what about Fortran?


Hmm.  Good question.  I am not a Fortran expert, but my $0.02 would  
be that if we can return NULL in C, and since there is no equivalent  
to NULL in Fortran, then the result should be disallowed -- if you do  
it, the results are undefined.


--
Jeff Squyres
Cisco Systems



[OMPI devel] Hostfile - oh yes, again!

2007-07-24 Thread Ralph Castain
Yo all

More thoughts on hostfile usage - I'm sure you are all sitting on
pins-and-needles awaiting more discussion on this exciting topic!

I'm continuing to try and work through the use-cases here so we can get this
fixed. It continues to be an issue for users on the list, as well as our own
developers. The problem is that we use "hostfile" and "-host" for dual
purposes, which means there is an opening for confusion over what should
happen.

Let's consider two major use-cases.

Use-case 1: hostfile and/or -host, no managed environment
I believe there is an expected and consistent behavior for the case where we
are not in a managed environment, but the user specifies a hostfile and/or
-host. In these cases, we use the hostfile (if provided) to completely
describe the available hosts, and any -host is used to specify which hosts
in that hostfile are to be used for the initial application.

At issue, however, is what happens with comm_spawn - is the child job
restricted to the -host list, or is it free to use any of the hosts in the
hostfile? I have heard it both ways from users, so I believe we are free to
decide here. Does anyone have an opinion? Do we need an option to indicate
that all child jobs are restricted to the specified -host list?



Use-case 2: managed environment, hostfile and/or -host provided
You will find a lengthy discourse in Ticket #1018 about how to deal with
this use-case - it is messy, with multiple definitions running around. I
believe we have hit upon a reasonable path forward in that discussion
regarding how to parse a node list from this use-case.

However, it left open the question of who has access to the resulting node
list. As I tried to indicate in my prior note, the question revolves again
around comm_spawn: does the child job have access to all nodes in the
original allocation; those nodes in the original allocation that are also
listed in a hostfile; those nodes in the original allocation that are also
in the -host list; or...?


Obviously, as someone primarily focused on the RTE, I couldn't possibly care
less which of these modes you select. However, I *do* need to know how you
want Open MPI to operate so I can build the system to meet those
requirements.

I hope this - in combination with the prior note - will help you to
understand the question. Any direction would be appreciated as we are kinda
stuck until I know how you want the system to behave.

Oh yeah - in case you were wondering, prior MPIs like LA-MPI and LAM/MPI
avoided these issues (e.g., by ignoring hostfiles in use-case 2). So we are
kind of charting new territory here - I think our users will be fine either
way if we just tell them "this is how it works".

Thanks
Ralph




Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#433142: openmpi: FTBFS on GNU/kFreeBSD

2007-07-24 Thread Adrian Knoth
On Sat, Jul 14, 2007 at 03:55:12PM -0500, Dirk Eddelbuettel wrote:

> | the current version fails to build on GNU/kFreeBSD.
> | 
> | It needs small fixups for munmap hackery and stacktrace.
> | It also needs to exclude linux specific build-depends.
> | Please find attached patch with that.
> 
> Thanks for that patch.

> | It would be nice if you can ask upstream
> | to include changes to opal/util/stacktrace.c and
> | opal/mca/memory/ptmalloc2/opal_ptmalloc2_munmap.c .

I've neither seen a ticket nor any discussion within the last days. Did
you get any response?

AFAIK, kFreeBSD isn't a major target for OMPI, but if these patches
don't break anything, I don't mind including them.


I'll give them a run inside our virtual testing environment, but I'd
feel better with additional feedback from MTT.


HTH


PS: https://svn.open-mpi.org/trac/ompi/ticket/1105

-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de