Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-04 Thread N.M. Maclaren

On May 4 2010, Terry Dontje wrote:

Ralph Castain wrote:


Is a configure-time test good enough?  For example, are all Linuxes
the same in this regard?  That is, if you built OMPI on RH and it
configured in the new SysV SM, will those bits actually run correctly
on other Linux systems?  I think Jeff hinted at this when
suggesting this may need to be a runtime test.


I don't think we have ever enforced that requirement, nor am I sure
the current code would meet it. We have a number of components that
test for the ability to build, but don't check again at run-time.


Generally, the project has followed the philosophy of "build on the 
system you intend to run on".


There is at least one binary distribution that is built on one Linux
and can be installed on several others.  That is the reason I
bring up the above.  The community can take the stance that that one
distribution does not matter for this case, or that it needs to handle
it on its own.  In the grand scheme of things it might not matter, but
I wanted to at least stand up and be heard.


There is a gradation involved.  Building on one distribution and using
on another is one thing.  But the same distribution can use differently
built kernels, and the same system can be reconfigured (including both
package updates and parameter changes).  It is highly undesirable to
use volatile parameters in a non-volatile context.

A lot of applications need rebuilding when the administrator updates
packages or makes configuration changes; that's not good and should be
avoided if at all possible.  Given the way that systems are currently
configured, and the design of the autoconfigure mechanism, it's probably
not wholly avoidable.  But it's still a very nasty gotcha.


Regards,
Nick Maclaren.





Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-04 Thread Ralph Castain

On May 4, 2010, at 7:56 AM, Terry Dontje wrote:

> Ralph Castain wrote:
>> 
>> 
>> On May 4, 2010, at 3:45 AM, Terry Dontje wrote:
>> 
>>> Is a configure-time test good enough?  For example, are all Linuxes the
>>> same in this regard?  That is, if you built OMPI on RH and it configured in
>>> the new SysV SM, will those bits actually run correctly on other Linux
>>> systems?  I think Jeff hinted at this when suggesting this
>>> may need to be a runtime test.
>>> 
>> 
>> I don't think we have ever enforced that requirement, nor am I sure the
>> current code would meet it. We have a number of components that test for
>> the ability to build, but don't check again at run-time.
>> 
>> Generally, the project has followed the philosophy of "build on the system 
>> you intend to run on".
>> 
> There is at least one binary distribution that is built on one Linux and
> can be installed on several others.  That is the reason I bring up the
> above.  The community can take the stance that that one distribution does
> not matter for this case, or that it needs to handle it on its own.  In the
> grand scheme of things it might not matter, but I wanted to at least stand up and be heard.

No problem - I would simply suggest that they not --enable-sysv or whatever Sam
calls it. They don't -have- to support that mode; it's just an option.

Or Sam could include a --enable-runtime-sysv-check so they can offer it if they
want, but recognize that it may significantly slow down process launch.
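
For reference, the runtime version of such a check need not be elaborate.
Below is a hypothetical sketch in C of what a --enable-runtime-sysv-check
probe could look like; the function name and the idea of calling it from the
component-selection path are illustrative assumptions, not the actual Open
MPI hooks, and only standard SysV calls are used:

#include <stdbool.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical runtime probe: create a segment, mark it for removal,
 * then check whether a *different* process can still attach to it -
 * the behavior the sysv component relies on. */
static bool sysv_rmid_attach_works(void)
{
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return false;
    void *addr = shmat(shmid, NULL, 0);
    if (addr == (void *)-1) {
        (void)shmctl(shmid, IPC_RMID, NULL);
        return false;
    }
    /* Mark for removal *before* forking, so the child's attach below
     * is already "post-IPC_RMID" and no extra synchronization is needed. */
    if (shmctl(shmid, IPC_RMID, NULL) != 0) {
        (void)shmdt(addr);
        return false;
    }
    pid_t pid = fork();
    if (pid == 0) {
        void *caddr = shmat(shmid, NULL, 0);  /* child: attach post-RMID */
        _exit(caddr == (void *)-1 ? 1 : 0);
    }
    int status = 1;
    if (pid > 0)
        (void)waitpid(pid, &status, 0);
    (void)shmdt(addr);
    return pid > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

Each probe costs a shmget/fork/waitpid round trip, which is presumably where
the process-launch slowdown mentioned above would come from if every process
ran it rather than, say, only one process per node.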


Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-04 Thread Terry Dontje

Ralph Castain wrote:


On May 4, 2010, at 3:45 AM, Terry Dontje wrote:

Is a configure-time test good enough?  For example, are all Linuxes
the same in this regard?  That is, if you built OMPI on RH and it
configured in the new SysV SM, will those bits actually run correctly
on other Linux systems?  I think Jeff hinted at this when
suggesting this may need to be a runtime test.



I don't think we have ever enforced that requirement, nor am I sure
the current code would meet it. We have a number of components that
test for the ability to build, but don't check again at run-time.


Generally, the project has followed the philosophy of "build on the 
system you intend to run on".


There is at least one binary distribution that is built on one Linux
and can be installed on several others.  That is the reason I
bring up the above.  The community can take the stance that that one
distribution does not matter for this case, or that it needs to handle
it on its own.  In the grand scheme of things it might not matter, but
I wanted to at least stand up and be heard.


--td


Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-04 Thread Ralph Castain

On May 4, 2010, at 3:45 AM, Terry Dontje wrote:

> Is a configure-time test good enough?  For example, are all Linuxes the same
> in this regard?  That is, if you built OMPI on RH and it configured in the new
> SysV SM, will those bits actually run correctly on other Linux systems?  I
> think Jeff hinted at this when suggesting this may need to be a runtime test.
> 

I don't think we have ever enforced that requirement, nor am I sure the current
code would meet it. We have a number of components that test for the ability to
build, but don't check again at run-time.

Generally, the project has followed the philosophy of "build on the system you 
intend to run on".


Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-04 Thread N.M. Maclaren

On May 4 2010, Terry Dontje wrote:


Is a configure-time test good enough?  For example, are all Linuxes the
same in this regard?  That is, if you built OMPI on RH and it configured
in the new SysV SM, will those bits actually run correctly on other
Linux systems?  I think Jeff hinted at this when suggesting
this may need to be a runtime test.


A very good question.  It is clearly NOT good enough for the affinity
problems I mentioned, because those are changeable by system configuration,
but this is more basic.  I don't remember seeing any parameters that change
the behaviour of System V shared memory, as distinct from its constants.

Five years ago, I would have guessed "no", because this is exactly the
sort of area where the single-CPU and multi-CPU kernels differed, but
I believe there is less of that sort of thing nowadays.  However,
it's still worth watching out for.

Regards,
Nick Maclaren.




Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-04 Thread Terry Dontje
Is a configure-time test good enough?  For example, are all Linuxes the
same in this regard?  That is, if you built OMPI on RH and it configured
in the new SysV SM, will those bits actually run correctly on other
Linux systems?  I think Jeff hinted at this when suggesting
this may need to be a runtime test.


--td


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com



Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-03 Thread Samuel K. Gutierrez

Hi All,

New configure-time test added - thanks for the suggestion, Jeff.
Update and give it a whirl.

Ethan - could you please try again?  This time, I'm hoping sysv
support will be disabled ;-).

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory





Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-03 Thread N.M. Maclaren

On May 3 2010, Jeff Squyres wrote:

Write a small C program that does something like the following (this is
off the top of my head):

fork a child
child goes to sleep immediately
sysv alloc a segment
attach to it
ipc rm it
parent wakes up child
child tries to attach to segment

If that succeeds, then all is good.  If not, then don't use this stuff.


Not quite.  You haven't allowed for the ipc rm being accepted for
immediate effect but the removal not actually happening immediately -
while I haven't used that facility, I have seen such effects with quite
a few shared facilities.  That can happen when the facility is partially
managed by a daemon or kernel thread that is otherwise engaged at the
time of the ipc rm, and the daemon needs to be called for allocation
and deallocation but not for attaching to an existing segment.  Is that
ever done for this facility?  Dunno.  Does POSIX forbid it?  Not that
I can see.

To reduce the risk, I would put in a sleep after the ipc rm, for at least
a few seconds, but that will merely reduce the probability of the race
condition, not remove it.  I don't have a good solution for it in
general :-(
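
To make the shape of that concrete, here is a hedged single-process sketch
of the test with the suggested grace period folded in (illustrative only:
a real probe would still fork, as in Jeff's recipe, so that the second
attach comes from a different process, and the sleep merely shortens the
window rather than closing it):

#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create, attach, mark for removal, then wait a few seconds so any
 * deferred removal by a daemon/kernel thread has a chance to land
 * before we test whether a fresh attach still succeeds. */
int main(void)
{
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return 77;                      /* could not even set up the test */
    void *a = shmat(shmid, NULL, 0);
    if (a == (void *)-1)
        return 77;
    if (shmctl(shmid, IPC_RMID, NULL) != 0)
        return 77;
    sleep(5);                           /* the grace period suggested above */
    void *b = shmat(shmid, NULL, 0);    /* fresh attach after IPC_RMID */
    if (b == (void *)-1)
        return 1;                       /* behavior absent - or we lost the race */
    (void)shmdt(b);
    (void)shmdt(a);
    return 0;
}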

Regards,
Nick Maclaren.




Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-03 Thread Samuel K. Gutierrez

Hi Jeff,

Sounds like a plan :-).

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory





Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-03 Thread Jeff Squyres
It might well be that you need a configure test to determine whether this 
behavior occurs or not.  Heck, it may even need to be a run-time test!  Hrm.

Write a small C program that does something like the following (this is off the 
top of my head):

fork a child
child goes to sleep immediately
sysv alloc a segment
attach to it
ipc rm it
parent wakes up child
child tries to attach to segment

If that succeeds, then all is good.  If not, then don't use this stuff.  
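
Spelled out, that recipe might look something like the sketch below (an
illustration only, not the actual configure test; it blocks the child on a
pipe instead of literally sleeping, so the parent's "wake up" step is a
single race-free write).  Exit status 0 means the attach-after-IPC_RMID
behavior is present:

#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return 77;                            /* could not set up the test */

    pid_t pid = fork();                       /* fork a child */
    if (pid < 0)
        return 77;

    if (pid == 0) {                           /* child "sleeps": blocks on the pipe */
        int shmid;
        close(fds[1]);
        if (read(fds[0], &shmid, sizeof(shmid)) != sizeof(shmid))
            _exit(2);
        void *addr = shmat(shmid, NULL, 0);   /* child tries to attach */
        _exit(addr == (void *)-1 ? 1 : 0);
    }

    close(fds[0]);
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);  /* sysv alloc */
    if (shmid < 0)
        return 1;
    void *addr = shmat(shmid, NULL, 0);                       /* attach to it */
    if (addr == (void *)-1)
        return 1;
    if (shmctl(shmid, IPC_RMID, NULL) != 0)                   /* ipc rm it */
        return 1;

    if (write(fds[1], &shmid, sizeof(shmid)) != sizeof(shmid))/* wake the child */
        return 1;

    int status = 1;
    (void)waitpid(pid, &status, 0);
    (void)shmdt(addr);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}

On Linux the child's attach is expected to succeed; on systems that refuse
to attach a segment already marked for removal, it fails and the sysv code
can be disabled there.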


On May 3, 2010, at 10:55 AM, Samuel K. Gutierrez wrote:

> Hi all,
> 
> Does anyone know of a relatively portable solution for querying a 
> given system for the shmctl behavior that I am relying on, or is this 
> going to be a nightmare?  Because, if I am reading this thread 
> correctly, the presence of shmget and Linux is not sufficient for 
> determining an adequate level of sysv support.
> 
> Thanks!
> 
> --
> Samuel K. Gutierrez
> Los Alamos National Laboratory
> 
> On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote:
> 
> > On May 2 2010, Ashley Pittman wrote:
> >> On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote:
> >>
> >> As to performance there should be no difference in use between SysV
> >> shared memory and file-backed shared memory; the instructions
> >> issued and the MMU flags for the page should both be the same, so
> >> the performance should be identical.
> >
> > Not necessarily, and possibly not so even for far-future Linuces.
> > On at least one system I used, the poxious kernel wrote the complete
> > file to disk before returning - all right, it did that for System V
> > shared memory, too, just to a 'hidden' file!  But, if I recall, on
> > another it did that only for file-backed shared memory - however, it's
> > a decade ago now and I may be misremembering.
> >
> > Of course, that's a serious issue mainly for large segments.  I was
> > using multi-GB ones.  I don't know how big the ones you need are.
> >
> >> The one area you do need to keep an eye on for performance is on 
> >> numa machines where it's important which process on a node touches 
> >> each page first, you can end up using different areas (pages, not 
> >> regions) for communicating in different directions between the same 
> >> pair of processes. I don't believe this is any different to mmap 
> >> backed shared memory though.
> >
> > On some systems it may be, but in bizarre, inconsistent, undocumented
> > and unpredictable ways :-(  Also, there are usually several system 
> > (and
> > sometimes user) configuration options that change the behaviour, so 
> > you
> > have to allow for that.  My experience of trying to use those is that
> > different uses have incompatible requirements, and most of the 
> > critical
> > configuration parameters apply to ALL uses!
> >
> > In my view, the configuration variability is the number one nightmare
> > for trying to write portable code that uses any form of shared memory.
> > ARMCI seem to agree.
> >
> >>> Because of this, sysv support may be limited to Linux systems - 
> >>> that is,
> >>> until we can get a better sense of which systems provide the shmctl
> >>> IPC_RMID behavior that I am relying on.
> >
> > And, I suggest, whether they have an evil gotcha on one of the areas 
> > that
> > Ashley Pittman noted.
> >
> >
> > Regards,
> > Nick Maclaren.
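
An aside on the first-touch point quoted above: the idiom Ashley describes
amounts to "each process faults in the pages it will write".  A minimal
hypothetical sketch (the two-rank halving and the function name are made-up
illustrations), equally applicable to SysV and mmap-backed segments:

#include <stddef.h>
#include <string.h>

/* After both processes attach the same segment (shmat or mmap), each
 * one zeroes only the half it will produce into.  Under a first-touch
 * NUMA policy, those pages are then physically allocated near the
 * process that touched them. */
static void place_my_half(void *base, size_t seg_size, int my_rank)
{
    size_t half = seg_size / 2;
    char *mine = (char *)base + (size_t)my_rank * half;  /* rank 0 or 1 */
    memset(mine, 0, half);            /* first touch = local placement */
}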


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/