You are right, Ralph. There is no surprising behavior. I had forgotten that I
had been testing --mca orte_tmpdir_base /dev/shm to see if it worked (and
obviously it doesn't). Before that, without any MCA options, OpenMPI had tried
/tmp and warned me that /tmp was NFS-mounted, so I had been exploring
alternatives.

I accept your point - I need "a good local directory - anything you have 
permission to write in will work fine".  How would one do this on a stateless 
node?  And can I beat the vendor over the head for not knowing how to set up 
the node image so that OpenMPI could function properly?

Thanks


-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Ralph Castain
Sent: Thursday, November 03, 2011 11:33 AM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Shared-memory problems

I'm afraid this isn't correct. You definitely don't want the session directory 
in /dev/shm as this will almost always cause problems.

We look through a progression of settings (an MCA param, then several envars)
to find where to put the session directory:

1. the MCA param orte_tmpdir_base

2. the envar OMPI_PREFIX_ENV

3. the envar TMPDIR

4. the envar TEMP

5. the envar TMP

Check all those to see if one is set to /dev/shm. If so, you have a problem to 
resolve. For performance reasons, you probably don't want the session directory 
sitting on a network mounted location. What you need is a good local directory 
- anything you have permission to write in will work fine. Just set one of the 
above to point to it.
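
The lookup order above can be mirrored in a short script to see which setting
would win on a given node. This is only a sketch, not Open MPI's actual logic:
the function names are hypothetical, it assumes the MCA param was exported as
the environment variable OMPI_MCA_orte_tmpdir_base (a value given on the mpirun
command line or in a parameter file is invisible to it), and it assumes /tmp is
the fallback when nothing is set. It does not detect network mounts.

```python
import os

# Order taken from the list above. The first entry assumes the MCA param
# was exported as an environment variable (OMPI_MCA_<param name>).
CANDIDATES = [
    "OMPI_MCA_orte_tmpdir_base",
    "OMPI_PREFIX_ENV",
    "TMPDIR",
    "TEMP",
    "TMP",
]


def find_session_dir_base(env=None, fallback="/tmp"):
    """Return (source, path) for the first candidate that is set and non-empty."""
    env = os.environ if env is None else env
    for name in CANDIDATES:
        value = env.get(name)
        if value:
            return name, value
    # Assumption: /tmp is used when none of the settings is present.
    return "fallback", fallback


def is_under(path, parent):
    """True if path is parent itself or lies below it (purely lexical check)."""
    path = os.path.normpath(path)
    parent = os.path.normpath(parent)
    return path == parent or path.startswith(parent.rstrip("/") + "/")


if __name__ == "__main__":
    source, path = find_session_dir_base()
    print(f"session directory base: {path} (from {source})")
    if is_under(path, "/dev/shm"):
        print("warning: this points into /dev/shm")
    if not os.access(path, os.W_OK):
        print("warning: no write permission here")
```

Running it inside the same environment the job sees (for example from the batch
script, before mpirun) shows what the compute node would pick up.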


On Nov 3, 2011, at 10:04 AM, Durga Choudhury wrote:

> Since /tmp is mounted across a network and /dev/shm is (always) local,
> /dev/shm seems to be the right place for shared memory transactions.
> If you create temporary files using mktemp, are they being created in
> /dev/shm or /tmp?
> 
> 
> On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu <bcoste...@gmail.com> wrote:
>> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L <edwin.l.blo...@lmco.com> 
>> wrote:
>>> -    /dev/shm is 12 GB and has 755 permissions
>>> ...
>>> % ls -l output:
>>> 
>>> drwxr-xr-x  2 root root         40 Oct 28 09:14 shm
>> 
>> This is your problem: it should be something like drwxrwxrwt. It might
>> depend on the distribution; e.g., the following reports show this to be a bug:
>> 
>> https://bugzilla.redhat.com/show_bug.cgi?id=533897
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317329
>> 
>> and surely you can find some more on the subject with your favorite
>> search engine. Another source could be a paranoid sysadmin who has
>> changed the default (most likely correct) setting the distribution
>> came with - not only OpenMPI but any application using shmem would be
>> affected.
>> 
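
The permission problem described above can be checked by looking at the mode
bits directly. A minimal sketch, assuming the expected mode is drwxrwxrwt
(world-writable with the sticky bit set); the helper name describe_dir_mode is
made up for illustration.

```python
import os
import stat


def describe_dir_mode(mode):
    """Return (mode string, ok); ok means world-writable with the sticky bit set."""
    world_writable = bool(mode & stat.S_IWOTH)
    sticky = bool(mode & stat.S_ISVTX)
    return stat.filemode(mode), world_writable and sticky


if __name__ == "__main__":
    path = "/dev/shm"
    if os.path.isdir(path):
        text, ok = describe_dir_mode(os.stat(path).st_mode)
        verdict = "looks fine" if ok else "not world-writable with sticky bit"
        print(f"{path}: {text} ({verdict})")
    else:
        print(f"{path} does not exist on this system")
```
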
>> Cheers,
>> Bogdan
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
> 
