[OMPI users] Graph 500 execution time was increased by up to 11-fold over MPI

2011-11-03 Thread bhimesh akula
Hi,

We are done testing the OFED stack using Open MPI, but we still want to
check the OFED stack with Graph 500. I have heard about the Graph 500
benchmark but can't find enough information on the net about how to use it.
I am posting here in the hope that the MPI people know about it.


Can you please suggest anything on this, like any sites, forums, etc.?

Thanks & regards,

Punya Bhimesh


[OMPI users] Shared-memory problems

2011-11-03 Thread Blosch, Edwin L
Can anyone guess what the problem is here?  I was under the impression that 
OpenMPI (1.4.4) would look for /tmp and would create its shared-memory backing 
file there, i.e. if you don't set orte_tmpdir_base to anything.

Well, there IS a /tmp and yet it appears that OpenMPI has chosen to use 
/dev/shm.  Why?

And, next question, why doesn't it work?  Here are the oddities of this cluster:

-the cluster is 'diskless'

-/tmp is an NFS mount

-/dev/shm is 12 GB and has 755 permissions

FilesystemSize  Used Avail Use% Mounted on
tmpfs  12G  164K   12G   1% /dev/shm

% ls -l output:
drwxr-xr-x  2 root root 40 Oct 28 09:14 shm


The error message below suggests that OpenMPI (1.4.4) has somehow 
auto-magically decided to use /dev/shm and is failing to use it, for some 
reason.

Thanks for whatever help you can offer,

Ed


e8315:02942] opal_os_dirpath_create: Error: Unable to create the sub-directory 
(/dev/shm/openmpi-sessions-estenfte@e8315_0) of 
(/dev/shm/openmpi-sessions-estenfte@e8315_0/8474/0/1), mkdir failed [1]
[e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file util/session_dir.c at 
line 106
[e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file util/session_dir.c at 
line 399
[e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file 
base/ess_base_std_orted.c at line 206
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_session_dir failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--
[e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file ess_env_module.c at 
line 136
[e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at 
line 132
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--
[e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file orted/orted_main.c at 
line 325





Re: [OMPI users] Shared-memory problems

2011-11-03 Thread Ralph Castain

On Nov 3, 2011, at 8:54 AM, Blosch, Edwin L wrote:

> Can anyone guess what the problem is here?  I was under the impression that 
> OpenMPI (1.4.4) would look for /tmp and would create its shared-memory 
> backing file there, i.e. if you don’t set orte_tmpdir_base to anything.

That is correct

>  
> Well, there IS a /tmp and yet it appears that OpenMPI has chosen to use 
> /dev/shm.  Why?

Looks like a bug to me - it shouldn't be doing that. Will have to take a look - 
first I've heard of that behavior.


>  
> And, next question, why doesn’t it work?  Here are the oddities of this 
> cluster:
> -the cluster is ‘diskless’
> -/tmp is an NFS mount
> -/dev/shm is 12 GB and has 755 permissions
>  
> FilesystemSize  Used Avail Use% Mounted on
> tmpfs  12G  164K   12G   1% /dev/shm
>  
> % ls –l output:
> drwxr-xr-x  2 root root 40 Oct 28 09:14 shm
>  
>  
> The error message below suggests that OpenMPI (1.4.4) has somehow 
> auto-magically decided to use /dev/shm and is failing to be able to use it, 
> for some reason.
>  
> Thanks for whatever help you can offer,
>  
> Ed
>  
>  
> e8315:02942] opal_os_dirpath_create: Error: Unable to create the 
> sub-directory (/dev/shm/openmpi-sessions-estenfte@e8315_0) of 
> (/dev/shm/openmpi-sessions-estenfte@e8315_0/8474/0/1), mkdir failed [1]
> [e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file util/session_dir.c 
> at line 106
> [e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file util/session_dir.c 
> at line 399
> [e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file 
> base/ess_base_std_orted.c at line 206
> --
> It looks like orte_init failed for some reason; your parallel process is
> 
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>  
>   orte_session_dir failed
>   --> Returned value Error (-1) instead of ORTE_SUCCESS
> --
> [e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file ess_env_module.c at 
> line 136
> [e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file runtime/orte_init.c 
> at line 132
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>  
>   orte_ess_set_name failed
>   --> Returned value Error (-1) instead of ORTE_SUCCESS
> --
> [e8315:02942] [[8474,0],1] ORTE_ERROR_LOG: Error in file orted/orted_main.c 
> at line 325
>  
>  
>  
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Shared-memory problems

2011-11-03 Thread Bogdan Costescu
On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L  wrote:
> -    /dev/shm is 12 GB and has 755 permissions
> ...
> % ls –l output:
>
> drwxr-xr-x  2 root root 40 Oct 28 09:14 shm

This is your problem: it should be something like drwxrwxrwt. It might
depend on the distribution; for example, the following reports show this to
be a bug:

https://bugzilla.redhat.com/show_bug.cgi?id=533897
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317329

and surely you can find some more on the subject with your favorite
search engine. Another source could be a paranoid sysadmin who has
changed the default (most likely correct) setting the distribution
came with - not only OpenMPI but any application using shmem would be
affected.

Cheers,
Bogdan



Re: [OMPI users] Shared-memory problems

2011-11-03 Thread Durga Choudhury
Since /tmp is mounted across a network and /dev/shm is (always) local,
/dev/shm seems to be the right place for shared-memory transactions.
If you create temporary files using mktemp, are they created in
/dev/shm or /tmp?
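The mktemp question is easy to check directly: mktemp honors the TMPDIR environment variable and otherwise falls back to /tmp, so where its files land depends on the environment rather than on /dev/shm. A quick local sketch (not tied to this cluster):

```shell
# mktemp creates files under $TMPDIR when it is set, falling back to /tmp --
# roughly the same convention Open MPI follows for its session directory.
base=$(mktemp -d)             # stand-in for a node-local scratch directory
f=$(TMPDIR="$base" mktemp)    # lands under $base, not /tmp or /dev/shm
case "$f" in
  "$base"/*) echo "created under TMPDIR" ;;
esac
rm -f "$f"; rmdir "$base"
```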


On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu  wrote:
> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L  wrote:
>> -    /dev/shm is 12 GB and has 755 permissions
>> ...
>> % ls –l output:
>>
>> drwxr-xr-x  2 root root 40 Oct 28 09:14 shm
>
> This is your problem: it should be something like drwxrwxrwt. It might
> depend on the distribution, f.e. the following show this to be a bug:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=533897
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317329
>
> and surely you can find some more on the subject with your favorite
> search engine. Another source could be a paranoid sysadmin who has
> changed the default (most likely correct) setting the distribution
> came with - not only OpenMPI but any application using shmem would be
> affected..
>
> Cheers,
> Bogdan
>



[OMPI users] problem with mpirun

2011-11-03 Thread amine mrabet
Hey,
I use mpirun to run a program that uses MPI. This program worked well on the
university computer, but on mine I get an error.

I run with:

amine@dellam:~/Bureau$ mpirun -np 2 pl

and I get this error:
libibverbs: Fatal: couldn't read uverbs ABI version.
--
[0,0,0]: OpenIB on host dellam was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.





any help?!
-- 
amine mrabet


Re: [OMPI users] Shared-memory problems

2011-11-03 Thread Ralph Castain
I'm afraid this isn't correct. You definitely don't want the session directory 
in /dev/shm as this will almost always cause problems.

We look thru a progression of envars to find where to put the session directory:

1. the MCA param orte_tmpdir_base

2. the envar OMPI_PREFIX_ENV

3. the envar TMPDIR

4. the envar TEMP

5. the envar TMP

Check all those to see if one is set to /dev/shm. If so, you have a problem to 
resolve. For performance reasons, you probably don't want the session directory 
sitting on a network mounted location. What you need is a good local directory 
- anything you have permission to write in will work fine. Just set one of the 
above to point to it.
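The lookup order above can be sketched as a small shell function. This is illustrative only, not Open MPI's actual code, and ORTE_TMPDIR_BASE here is a made-up stand-in for the MCA param, which is really set via "-mca orte_tmpdir_base" on the command line or the OMPI_MCA_orte_tmpdir_base envar:

```shell
# First non-empty value wins, in the order Ralph lists; /tmp is the fallback.
session_base() {
  for v in "${ORTE_TMPDIR_BASE:-}" "${OMPI_PREFIX_ENV:-}" \
           "${TMPDIR:-}" "${TEMP:-}" "${TMP:-}"; do
    if [ -n "$v" ]; then echo "$v"; return; fi
  done
  echo /tmp
}

unset ORTE_TMPDIR_BASE OMPI_PREFIX_ENV TMPDIR TEMP TMP
session_base            # prints /tmp
TMPDIR=/local/scratch
session_base            # prints /local/scratch
```

Checking each of these in your job environment shows which one is steering the session directory to /dev/shm.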


On Nov 3, 2011, at 10:04 AM, Durga Choudhury wrote:

> Since /tmp is mounted across a network and /dev/shm is (always) local,
> /dev/shm seems to be the right place for shared memory transactions.
> If you create temporary files using mktemp is it being created in
> /dev/shm or /tmp?
> 
> 
> On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu  wrote:
>> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L  
>> wrote:
>>> -/dev/shm is 12 GB and has 755 permissions
>>> ...
>>> % ls –l output:
>>> 
>>> drwxr-xr-x  2 root root 40 Oct 28 09:14 shm
>> 
>> This is your problem: it should be something like drwxrwxrwt. It might
>> depend on the distribution, f.e. the following show this to be a bug:
>> 
>> https://bugzilla.redhat.com/show_bug.cgi?id=533897
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317329
>> 
>> and surely you can find some more on the subject with your favorite
>> search engine. Another source could be a paranoid sysadmin who has
>> changed the default (most likely correct) setting the distribution
>> came with - not only OpenMPI but any application using shmem would be
>> affected..
>> 
>> Cheers,
>> Bogdan
>> 




Re: [OMPI users] How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Jeff Squyres
On Nov 1, 2011, at 7:31 PM, Blosch, Edwin L wrote:

> I’m getting this message below which is observing correctly that /tmp is 
> NFS-mounted.   But there is no other directory which has user or group write 
> permissions.  So I think I’m kind of stuck, and it sounds like a serious 
> issue.

That does kinda suck.  :-\

> Before I ask the administrators to change their image, i.e. mount this 
> partition under /work instead of /tmp, I’d like to ask if anyone is using 
> OpenMPI on a state-less cluster, and are there any gotchas with regards to 
> performance of OpenMPI, i.e. like handling of /tmp, that one would need to 
> know?

I don't have much empirical information here -- I know that some people have 
done this (make /tmp be NFS-mounted).  I think there are at least some issues 
with this, though -- many applications believe that a sufficient condition for 
uniqueness in /tmp is to simply append your PID to a filename.  But this may no 
longer be true if /tmp is shared across multiple OS instances.

I don't have a specific case where this is problematic, but it's not a large 
stretch to imagine that this could happen in practice with random applications 
that make temp files in /tmp.
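Jeff's point about PID-based uniqueness can be seen in two lines of shell (a generic sketch of the failure mode, not anything specific to Open MPI):

```shell
# "prefix.$$" yields the same path on any host whose process has that PID,
# so two nodes sharing an NFS /tmp can easily collide. mktemp adds a random
# suffix and creates the file exclusively instead.
echo "/tmp/myapp.$$"              # deterministic given the PID: collision-prone
f=$(mktemp /tmp/myapp.XXXXXX)     # random suffix, created with O_EXCL semantics
rm -f "$f"
```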

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] problem with mpirun

2011-11-03 Thread Ralph Castain
Couple of things:

1. Check the configure cmd line you gave - OMPI thinks your local computer 
should have openib support, which isn't correct.

2. did you recompile your app on your local computer, using the version of OMPI 
built/installed there?


On Nov 3, 2011, at 10:10 AM, amine mrabet wrote:

> hey ,
> i use mpirun tu run program  with using mpi this program worked well in 
> university computer 
> 
> but with mine i have this error
>  i run with 
> 
> amine@dellam:~/Bureau$ mpirun  -np 2 pl
> and i have this error 
> 
> libibverbs: Fatal: couldn't read uverbs ABI version.
> --
> [0,0,0]: OpenIB on host dellam was unable to find any HCAs.
> Another transport will be used instead, although this may result in 
> lower performance.
> 
> 
> 
> 
> 
> any help?!
> -- 
> amine mrabet 




Re: [OMPI users] problem with mpirun

2011-11-03 Thread amine mrabet
I use Open MPI on my computer.

2011/11/3 Ralph Castain 

> Couple of things:
>
> 1. Check the configure cmd line you gave - OMPI thinks your local computer
> should have an openib support that isn't correct.
>
> 2. did you recompile your app on your local computer, using the version of
> OMPI built/installed there?
>
>
> On Nov 3, 2011, at 10:10 AM, amine mrabet wrote:
>
> > hey ,
> > i use mpirun tu run program  with using mpi this program worked well in
> university computer
> >
> > but with mine i have this error
> >  i run with
> >
> > amine@dellam:~/Bureau$ mpirun  -np 2 pl
> > and i have this error
> >
> > libibverbs: Fatal: couldn't read uverbs ABI version.
> >
> --
> > [0,0,0]: OpenIB on host dellam was unable to find any HCAs.
> > Another transport will be used instead, although this may result in
> > lower performance.
> >
> >
> >
> >
> >
> > any help?!
> > --
> > amine mrabet



-- 
amine mrabet


Re: [OMPI users] EXTERNAL: Re: Shared-memory problems

2011-11-03 Thread Blosch, Edwin L
In /tmp.

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Durga Choudhury
Sent: Thursday, November 03, 2011 11:04 AM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Shared-memory problems

Since /tmp is mounted across a network and /dev/shm is (always) local,
/dev/shm seems to be the right place for shared memory transactions.
If you create temporary files using mktemp is it being created in
/dev/shm or /tmp?


On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu  wrote:
> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L  wrote:
>> -    /dev/shm is 12 GB and has 755 permissions
>> ...
>> % ls -l output:
>>
>> drwxr-xr-x  2 root root 40 Oct 28 09:14 shm
>
> This is your problem: it should be something like drwxrwxrwt. It might
> depend on the distribution, f.e. the following show this to be a bug:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=533897
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317329
>
> and surely you can find some more on the subject with your favorite
> search engine. Another source could be a paranoid sysadmin who has
> changed the default (most likely correct) setting the distribution
> came with - not only OpenMPI but any application using shmem would be
> affected..
>
> Cheers,
> Bogdan
>



Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Blosch, Edwin L
Cross-thread response here, as this is related to the shared-memory thread:

Yes it sucks, so that's what led me to post my original question: If /dev/shm 
isn't the right place to put the session file, and /tmp is NFS-mounted, then 
what IS the "right" way to set up a diskless cluster?  I don't think the idea 
of tmpfs sounds very appealing, after reading the discussion in FAQ #8 about 
shared-memory usage. We definitely have a job-queueing system and jobs are very 
often killed using qdel, and writing a post-script handler is way beyond the 
level of involvement or expertise we can expect from our sys admins.

Surely there's some reasonable guidance that can be offered to work around an 
issue that is so disabling.

A related question would be: How is it that HP-MPI works just fine on this 
cluster as it is configured now?  Are they doing something different for shared 
memory communications?


Thanks


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Jeff Squyres
Sent: Thursday, November 03, 2011 11:35 AM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] How to set up state-less node /tmp for 
OpenMPI usage

On Nov 1, 2011, at 7:31 PM, Blosch, Edwin L wrote:

> I'm getting this message below which is observing correctly that /tmp is 
> NFS-mounted.   But there is no other directory which has user or group write 
> permissions.  So I think I'm kind of stuck, and it sounds like a serious 
> issue.

That does kinda suck.  :-\

> Before I ask the administrators to change their image, i.e. mount this 
> partition under /work instead of /tmp, I'd like to ask if anyone is using 
> OpenMPI on a state-less cluster, and are there any gotchas with regards to 
> performance of OpenMPI, i.e. like handling of /tmp, that one would need to 
> know?

I don't have much empirical information here -- I know that some people have 
done this (make /tmp be NFS-mounted).  I think there are at least some issues 
with this, though -- many applications believe that a sufficient condition for 
uniqueness in /tmp is to simply append your PID to a filename.  But this may no 
longer be true if /tmp is shared across multiple OS instances.

I don't have a specific case where this is problematic, but it's not a large 
stretch to imagine that this could happen in practice with random applications 
that make temp files in /tmp.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Eugene Loh
I've not been following closely.  Why must one use shared-memory 
communications?  How about using other BTLs in a "loopback" fashion?


Re: [OMPI users] EXTERNAL: Re: Shared-memory problems

2011-11-03 Thread Blosch, Edwin L
You are right, Ralph.  There is no surprise behavior.  I had forgotten that I 
had been testing --mca orte_tmpdir_base /dev/shm to see if it worked (and 
obviously it doesn't).  Before that, without any MCA options, OpenMPI had tried 
/tmp, and gave me the warning about /tmp being NFS mounted, and so I had been 
exploring options.

I accept your point - I need "a good local directory - anything you have 
permission to write in will work fine".  How would one do this on a stateless 
node?  And can I beat the vendor over the head for not knowing how to set up 
the node image so that OpenMPI could function properly?

Thanks


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Ralph Castain
Sent: Thursday, November 03, 2011 11:33 AM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Shared-memory problems

I'm afraid this isn't correct. You definitely don't want the session directory 
in /dev/shm as this will almost always cause problems.

We look thru a progression of envars to find where to put the session directory:

1. the MCA param orte_tmpdir_base

2. the envar OMPI_PREFIX_ENV

3. the envar TMPDIR

4. the envar TEMP

5. the envar TMP

Check all those to see if one is set to /dev/shm. If so, you have a problem to 
resolve. For performance reasons, you probably don't want the session directory 
sitting on a network mounted location. What you need is a good local directory 
- anything you have permission to write in will work fine. Just set one of the 
above to point to it.


On Nov 3, 2011, at 10:04 AM, Durga Choudhury wrote:

> Since /tmp is mounted across a network and /dev/shm is (always) local,
> /dev/shm seems to be the right place for shared memory transactions.
> If you create temporary files using mktemp is it being created in
> /dev/shm or /tmp?
> 
> 
> On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu  wrote:
>> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L  
>> wrote:
>>> -/dev/shm is 12 GB and has 755 permissions
>>> ...
>>> % ls -l output:
>>> 
>>> drwxr-xr-x  2 root root 40 Oct 28 09:14 shm
>> 
>> This is your problem: it should be something like drwxrwxrwt. It might
>> depend on the distribution, f.e. the following show this to be a bug:
>> 
>> https://bugzilla.redhat.com/show_bug.cgi?id=533897
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=317329
>> 
>> and surely you can find some more on the subject with your favorite
>> search engine. Another source could be a paranoid sysadmin who has
>> changed the default (most likely correct) setting the distribution
>> came with - not only OpenMPI but any application using shmem would be
>> affected..
>> 
>> Cheers,
>> Bogdan
>> 


Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Jeff Squyres
On Nov 3, 2011, at 1:36 PM, Blosch, Edwin L wrote:

> Yes it sucks, so that's what led me to post my original question: If /dev/shm 
> isn't the right place to put the session file, and /tmp is NFS-mounted, then 
> what IS the "right" way to set up a diskless cluster?  I don't think the idea 
> of tempfs sounds very appealing, after reading the discussion in FAQ #8 about 
> shared-memory usage. We definitely have a job-queueing system and jobs are 
> very often killed using qdel, and writing a post-script handler is way beyond 
> the level of involvement or expertise we can expect from our sys admins.

In the upcoming OMPI v1.7, we revamped the shared memory setup code such that 
it'll actually use /dev/shm properly, or use some mechanism other than an 
mmap file backed in a real filesystem.  So the issue goes away.  But it doesn't 
help you yet.  :-\

> Surely there's some reasonable guidance that can be offered to work around an 
> issue that is so disabling.

Other than the shared memory file, the session directory shouldn't be large.  
So keeping it in a tmpfs should be ok.  It's just that putting the shared 
memory in a tmpfs has the potential to cost you "twice": the actual shared 
memory itself, and then taking up space in tmpfs (although I have not verified 
this myself -- perhaps Linux is smart enough to not do this?).

Are there *no* local disk on the machines at all?

> A related question would be: How is it that HP-MPI works just fine on this 
> cluster as it is configured now?  Are they doing something different for 
> shared memory communications?

They're probably either not warning you about the issue or not using mmapped 
files that are backed in a filesystem. (Warning you about the issue is actually 
a relatively new feature in OMPI; since 1.0, IIRC, OMPI has used mmap files in 
a filesystem.)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Blosch, Edwin L
I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session 
file on /tmp, which is NFS-mounted and thus not a good choice.

Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Eugene Loh
Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage

I've not been following closely.  Why must one use shared-memory 
communications?  How about using other BTLs in a "loopback" fashion?


Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Eugene Loh

Right.  Actually "--mca btl ^sm".  (Was missing "btl".)
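In context, the full command line would look something like this (the program name and process count are illustrative):

```shell
# Disable the shared-memory BTL so no backing file is created; on-node
# traffic then goes over another BTL (e.g. tcp) in loopback.
mpirun --mca btl ^sm -np 8 ./my_app
```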

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:

I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session 
file on /tmp, which is NFS-mounted and thus not a good choice.

Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Eugene Loh
Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage

I've not been following closely.  Why must one use shared-memory
communications?  How about using other BTLs in a "loopback" fashion?


Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Blosch, Edwin L
I might be missing something here. Is there a side-effect or performance loss 
if you don't use the sm btl?  Why would it exist if there is a wholly 
equivalent alternative?  What happens to traffic that is intended for another 
process on the same node?

Thanks


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Eugene Loh
Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage

Right.  Actually "--mca btl ^sm".  (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:
> I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session 
> file on /tmp, which is NFS-mounted and thus not a good choice.
>
> Are you suggesting something like --mca ^sm?
>
>
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of Eugene Loh
> Sent: Thursday, November 03, 2011 12:54 PM
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
> for OpenMPI usage
>
> I've not been following closely.  Why must one use shared-memory
> communications?  How about using other BTLs in a "loopback" fashion?


Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Ralph Castain

On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:

> I might be missing something here. Is there a side-effect or performance loss 
> if you don't use the sm btl?  Why would it exist if there is a wholly 
> equivalent alternative?  What happens to traffic that is intended for another 
> process on the same node?

There is a definite performance impact, and we wouldn't recommend doing what 
Eugene suggested if you care about performance.

The correct solution here is get your sys admin to make /tmp local. Making /tmp 
NFS mounted across multiple nodes is a major "faux pas" in the Linux world - it 
should never be done, for the reasons stated by Jeff.
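For reference, one common way to give diskless nodes a local /tmp is a small RAM-backed tmpfs mount. The /etc/fstab line below is hypothetical (the size is illustrative, and note Jeff's earlier caveat that shared-memory backing files then count against the tmpfs):

```shell
# Illustrative /etc/fstab entry: mount a RAM-backed /tmp on each node.
tmpfs   /tmp   tmpfs   defaults,size=2g   0 0
```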


> 
> Thanks
> 
> 
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of Eugene Loh
> Sent: Thursday, November 03, 2011 1:23 PM
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
> for OpenMPI usage
> 
> Right.  Actually "--mca btl ^sm".  (Was missing "btl".)
> 
> On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:
>> I don't tell OpenMPI what BTLs to use. The default uses sm and puts a 
>> session file on /tmp, which is NFS-mounted and thus not a good choice.
>> 
>> Are you suggesting something like --mca ^sm?
>> 
>> 
>> -Original Message-
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
>> Behalf Of Eugene Loh
>> Sent: Thursday, November 03, 2011 12:54 PM
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
>> for OpenMPI usage
>> 
>> I've not been following closely.  Why must one use shared-memory
>> communications?  How about using other BTLs in a "loopback" fashion?




Re: [OMPI users] problem with mpirun

2011-11-03 Thread Jeff Squyres
It sounds like you have an old version of Open MPI that is not ignoring your 
unconfigured OpenFabrics devices in your Linux install.  This is a guess 
because you didn't provide any information about your Open MPI installation.  
:-)

Try upgrading to a newer version of Open MPI.
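If upgrading isn't immediately possible, a commonly used workaround (my suggestion here, not something stated in this thread) is to tell Open MPI not to open the InfiniBand BTL at all on a machine without HCAs:

```shell
# Exclude the openib BTL; the libibverbs warning goes away and Open MPI
# falls back to TCP and shared memory for transport.
mpirun --mca btl ^openib -np 2 pl
```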


On Nov 3, 2011, at 12:52 PM, amine mrabet wrote:

> i use openmpi in my computer 
> 
> 2011/11/3 Ralph Castain 
> Couple of things:
> 
> 1. Check the configure cmd line you gave - OMPI thinks your local computer 
> should have an openib support that isn't correct.
> 
> 2. did you recompile your app on your local computer, using the version of 
> OMPI built/installed there?
> 
> 
> On Nov 3, 2011, at 10:10 AM, amine mrabet wrote:
> 
> > hey ,
> > i use mpirun tu run program  with using mpi this program worked well in 
> > university computer
> >
> > but with mine i have this error
> >  i run with
> >
> > amine@dellam:~/Bureau$ mpirun  -np 2 pl
> > and i have this error
> >
> > libibverbs: Fatal: couldn't read uverbs ABI version.
> > --
> > [0,0,0]: OpenIB on host dellam was unable to find any HCAs.
> > Another transport will be used instead, although this may result in
> > lower performance.
> >
> >
> >
> >
> >
> > any help?!
> > --
> > amine mrabet
> 
> 
> 
> -- 
> amine mrabet 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Jeff Squyres
The sm btl is definitely more performant than loopback on other devices.

On Nov 3, 2011, at 4:55 PM, Blosch, Edwin L wrote:

> I might be missing something here. Is there a side-effect or performance loss 
> if you don't use the sm btl?  Why would it exist if there is a wholly 
> equivalent alternative?  What happens to traffic that is intended for another 
> process on the same node?
> 
> Thanks
> 
> 
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of Eugene Loh
> Sent: Thursday, November 03, 2011 1:23 PM
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
> for OpenMPI usage
> 
> Right.  Actually "--mca btl ^sm".  (Was missing "btl".)
> 
> On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:
>> I don't tell OpenMPI what BTLs to use. The default uses sm and puts a 
>> session file on /tmp, which is NFS-mounted and thus not a good choice.
>> 
>> Are you suggesting something like --mca ^sm?
>> 
>> 
>> -Original Message-
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
>> Behalf Of Eugene Loh
>> Sent: Thursday, November 03, 2011 12:54 PM
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
>> for OpenMPI usage
>> 
>> I've not been following closely.  Why must one use shared-memory
>> communications?  How about using other BTLs in a "loopback" fashion?


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
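
The workaround discussed in this thread — skipping the sm BTL so no shared-memory backing file is created — is set on the mpirun command line. A minimal sketch (the executable name `./a.out` and the process count are placeholders; the snippet only constructs and prints the command rather than launching a job):

```shell
# Build the mpirun invocation that excludes the sm BTL ("^sm" means
# "every BTL except sm"); same-node traffic then falls back to another
# BTL, e.g. tcp over loopback.  ./a.out and -np 8 are placeholders.
MCA_ARGS="--mca btl ^sm"
CMD="mpirun $MCA_ARGS -np 8 ./a.out"
echo "$CMD"
```

An equivalent positive form is `--mca btl tcp,self`, which lists the allowed BTLs explicitly (`self` is needed so a process can send to itself).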




Re: [OMPI users] problem with mpirun

2011-11-03 Thread amine mrabet
Yes, I have an old version; I will install 1.4.4 and see.
Thanks!

2011/11/3 Jeff Squyres 

> It sounds like you have an old version of Open MPI that is not ignoring
> your unconfigured OpenFabrics devices in your Linux install.  This is a
> guess because you didn't provide any information about your Open MPI
> installation.  :-)
>
> Try upgrading to a newer version of Open MPI.
>
>
> On Nov 3, 2011, at 12:52 PM, amine mrabet wrote:
>
> > i use openmpi in my computer
> >
> > 2011/11/3 Ralph Castain 
> > Couple of things:
> >
> > 1. Check the configure cmd line you gave - OMPI thinks your local
> computer should have an openib support that isn't correct.
> >
> > 2. did you recompile your app on your local computer, using the version
> of OMPI built/installed there?
> >
> >
> > On Nov 3, 2011, at 10:10 AM, amine mrabet wrote:
> >
> > > hey ,
> > > i use mpirun tu run program  with using mpi this program worked well
> in university computer
> > >
> > > but with mine i have this error
> > >  i run with
> > >
> > > amine@dellam:~/Bureau$ mpirun  -np 2 pl
> > > and i have this error
> > >
> > > libibverbs: Fatal: couldn't read uverbs ABI version.
> > >
> --
> > > [0,0,0]: OpenIB on host dellam was unable to find any HCAs.
> > > Another transport will be used instead, although this may result in
> > > lower performance.
> > >
> > >
> > >
> > >
> > >
> > > any help?!
> > > --
> > > amine mrabet
> >
> >
> >
> >
> >
> > --
> > amine mrabet
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
>



-- 
amine mrabet


Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Blosch, Edwin L
Thanks for the help.  A couple of follow-up questions; maybe this starts to go 
outside OpenMPI:

What's wrong with using /dev/shm?  I think you said earlier in this thread that 
this was not a safe place.

If the NFS-mount point is moved from /tmp to /work, would a /tmp magically 
appear in the filesystem for a stateless node?  How big would it be, given that 
there is no local disk, right?  That may be something I have to ask the vendor, 
which I've tried, but they don't quite seem to get the question.

Thanks




-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Ralph Castain
Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage


On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:

> I might be missing something here. Is there a side-effect or performance loss 
> if you don't use the sm btl?  Why would it exist if there is a wholly 
> equivalent alternative?  What happens to traffic that is intended for another 
> process on the same node?

There is a definite performance impact, and we wouldn't recommend doing what 
Eugene suggested if you care about performance.

The correct solution here is to get your sys admin to make /tmp local. Making 
/tmp NFS-mounted across multiple nodes is a major "faux pas" in the Linux world 
- it should never be done, for the reasons stated by Jeff.
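
If a node-local filesystem does exist somewhere other than /tmp, an alternative to disabling sm is to point the session-directory root at it via the `orte_tmpdir_base` parameter mentioned earlier in this thread. A sketch, where `/scratch/local` is a hypothetical node-local mount point (substitute your own):

```shell
# Point Open MPI's session directory (where the shared-memory backing
# file lives) at a node-local path instead of the NFS-mounted /tmp.
# MCA parameters can be set via OMPI_MCA_-prefixed environment variables.
export OMPI_MCA_orte_tmpdir_base=/scratch/local

# Equivalent on the mpirun command line (not executed here):
#   mpirun --mca orte_tmpdir_base /scratch/local -np 8 ./a.out
echo "$OMPI_MCA_orte_tmpdir_base"
```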


> 
> Thanks
> 
> 
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of Eugene Loh
> Sent: Thursday, November 03, 2011 1:23 PM
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
> for OpenMPI usage
> 
> Right.  Actually "--mca btl ^sm".  (Was missing "btl".)
> 
> On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:
>> I don't tell OpenMPI what BTLs to use. The default uses sm and puts a 
>> session file on /tmp, which is NFS-mounted and thus not a good choice.
>> 
>> Are you suggesting something like --mca ^sm?
>> 
>> 
>> -Original Message-
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
>> Behalf Of Eugene Loh
>> Sent: Thursday, November 03, 2011 12:54 PM
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
>> for OpenMPI usage
>> 
>> I've not been following closely.  Why must one use shared-memory
>> communications?  How about using other BTLs in a "loopback" fashion?




Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread David Turner

I'm not a systems guy, but I'll pitch in anyway.  On our cluster,
all the compute nodes are completely diskless.  The root file system,
including /tmp, resides in memory (ramdisk).  OpenMPI puts these
session directories therein.  All our jobs run through a batch
system (torque).  At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user).  This epilogue had to be written
by our systems staff.

I believe this is a fairly common configuration for diskless
clusters.
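
A sketch of the cleanup such an epilogue performs, under the assumption that Torque passes the job owner's username as the script's second argument (verify against your Torque version's prologue/epilogue documentation). The helper names are hypothetical, and this is illustrative rather than a tested production epilogue:

```shell
#!/bin/sh
# Remove every file under directory $2 owned by user $1.  find's
# -delete implies -depth, so session-directory contents are removed
# before the directories themselves.
cleanup_tmp() {
    find "$2" -user "$1" -delete 2>/dev/null
}

# Kill any orphan processes still owned by user $1 on this node.
kill_orphans() {
    pkill -9 -u "$1" 2>/dev/null || true
}

# In a real Torque epilogue, something along the lines of:
#   cleanup_tmp "$2" /tmp
#   kill_orphans "$2"
```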

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:

Thanks for the help.  A couple follow-up-questions, maybe this starts to go 
outside OpenMPI:

What's wrong with using /dev/shm?  I think you said earlier in this thread that 
this was not a safe place.

If the NFS-mount point is moved from /tmp to /work, would a /tmp magically 
appear in the filesystem for a stateless node?  How big would it be, given that 
there is no local disk, right?  That may be something I have to ask the vendor, 
which I've tried, but they don't quite seem to get the question.

Thanks




-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Ralph Castain
Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage


On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:


I might be missing something here. Is there a side-effect or performance loss 
if you don't use the sm btl?  Why would it exist if there is a wholly 
equivalent alternative?  What happens to traffic that is intended for another 
process on the same node?


There is a definite performance impact, and we wouldn't recommend doing what 
Eugene suggested if you care about performance.

The correct solution here is get your sys admin to make /tmp local. Making /tmp NFS 
mounted across multiple nodes is a major "faux pas" in the Linux world - it 
should never be done, for the reasons stated by Jeff.




Thanks


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Eugene Loh
Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage

Right.  Actually "--mca btl ^sm".  (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:

I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session 
file on /tmp, which is NFS-mounted and thus not a good choice.

Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Eugene Loh
Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage

I've not been following closely.  Why must one use shared-memory
communications?  How about using other BTLs in a "loopback" fashion?







--
Best regards,

David Turner
User Services Group      email: dptur...@lbl.gov
NERSC Division           phone: (510) 486-4027
Lawrence Berkeley Lab    fax:   (510) 486-4316


Re: [OMPI users] problem with mpirun

2011-11-03 Thread amine mrabet
I installed the latest version of Open MPI; now I have this error:

It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

:)


2011/11/3 amine mrabet 

> yes i have old version i will instal  1.4.4 and see
> merci
>
>
> 2011/11/3 Jeff Squyres 
>
>> It sounds like you have an old version of Open MPI that is not ignoring
>> your unconfigured OpenFabrics devices in your Linux install.  This is a
>> guess because you didn't provide any information about your Open MPI
>> installation.  :-)
>>
>> Try upgrading to a newer version of Open MPI.
>>
>>
>> On Nov 3, 2011, at 12:52 PM, amine mrabet wrote:
>>
>> > i use openmpi in my computer
>> >
>> > 2011/11/3 Ralph Castain 
>> > Couple of things:
>> >
>> > 1. Check the configure cmd line you gave - OMPI thinks your local
>> computer should have an openib support that isn't correct.
>> >
>> > 2. did you recompile your app on your local computer, using the version
>> of OMPI built/installed there?
>> >
>> >
>> > On Nov 3, 2011, at 10:10 AM, amine mrabet wrote:
>> >
>> > > hey ,
>> > > i use mpirun tu run program  with using mpi this program worked well
>> in university computer
>> > >
>> > > but with mine i have this error
>> > >  i run with
>> > >
>> > > amine@dellam:~/Bureau$ mpirun  -np 2 pl
>> > > and i have this error
>> > >
>> > > libibverbs: Fatal: couldn't read uverbs ABI version.
>> > >
>> --
>> > > [0,0,0]: OpenIB on host dellam was unable to find any HCAs.
>> > > Another transport will be used instead, although this may result in
>> > > lower performance.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > any help?!
>> > > --
>> > > amine mrabet
>> >
>> >
>> >
>> >
>> >
>> > --
>> > amine mrabet
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>>
>
>
>
> --
> amine mrabet
>



-- 
amine mrabet


Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-03 Thread Ed Blosch
Thanks very much, exactly what I wanted to hear. How big is /tmp?

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp
for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway.  On our cluster,
all the compute nodes are completely diskless.  The root file system,
including /tmp, resides in memory (ramdisk).  OpenMPI puts these
session directories therein.  All our jobs run through a batch
system (torque).  At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user).  This epilogue had to be written
by our systems staff.

I believe this is a fairly common configuration for diskless
clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:
> Thanks for the help.  A couple follow-up-questions, maybe this starts to
go outside OpenMPI:
>
> What's wrong with using /dev/shm?  I think you said earlier in this thread
that this was not a safe place.
>
> If the NFS-mount point is moved from /tmp to /work, would a /tmp magically
appear in the filesystem for a stateless node?  How big would it be, given
that there is no local disk, right?  That may be something I have to ask the
vendor, which I've tried, but they don't quite seem to get the question.
>
> Thanks
>
>
>
>
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Ralph Castain
> Sent: Thursday, November 03, 2011 5:22 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp
for OpenMPI usage
>
>
> On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:
>
>> I might be missing something here. Is there a side-effect or performance
loss if you don't use the sm btl?  Why would it exist if there is a wholly
equivalent alternative?  What happens to traffic that is intended for
another process on the same node?
>
> There is a definite performance impact, and we wouldn't recommend doing
what Eugene suggested if you care about performance.
>
> The correct solution here is get your sys admin to make /tmp local. Making
/tmp NFS mounted across multiple nodes is a major "faux pas" in the Linux
world - it should never be done, for the reasons stated by Jeff.
>
>
>>
>> Thanks
>>
>>
>> -Original Message-
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Eugene Loh
>> Sent: Thursday, November 03, 2011 1:23 PM
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp for OpenMPI usage
>>
>> Right.  Actually "--mca btl ^sm".  (Was missing "btl".)
>>
>> On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:
>>> I don't tell OpenMPI what BTLs to use. The default uses sm and puts a
session file on /tmp, which is NFS-mounted and thus not a good choice.
>>>
>>> Are you suggesting something like --mca ^sm?
>>>
>>>
>>> -Original Message-
>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Eugene Loh
>>> Sent: Thursday, November 03, 2011 12:54 PM
>>> To: us...@open-mpi.org
>>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp for OpenMPI usage
>>>
>>> I've not been following closely.  Why must one use shared-memory
>>> communications?  How about using other BTLs in a "loopback" fashion?
>
>


-- 
Best regards,

David Turner
User Services Group      email: dptur...@lbl.gov
NERSC Division           phone: (510) 486-4027
Lawrence Berkeley Lab    fax:   (510) 486-4316