Re: [OMPI users] Problem launching onto Bourne shell
Doh; yes we did. This was a minor glitch in porting the 1.2 series fix to the trunk/v1.3 (i.e., the fix in v1.2.8 is ok -- whew!). Fixed on the trunk in r19758; thanks for noticing. I'll file a CMR for v1.3.

On Oct 16, 2008, at 7:05 PM, Mostyn Lewis wrote:

> Jeff,
>
> You broke my ksh (and I expect something else). Today's SVN 1.4a1r19757, orte/mca/plm/rsh/plm_rsh_module.c, line 471:
>
>     tmp = opal_argv_split("( test ! -r ./.profile || . ./.profile;", ' ');
>                            ^ ARGHH
>
> No "(":
>
>     tmp = opal_argv_split(" test ! -r ./.profile || . ./.profile;", ' ');
>
> and all is well again :)
>
> Regards,
> Mostyn
Re: [OMPI users] Problem launching onto Bourne shell
Jeff,

You broke my ksh (and I expect something else). Today's SVN 1.4a1r19757, orte/mca/plm/rsh/plm_rsh_module.c, line 471:

    tmp = opal_argv_split("( test ! -r ./.profile || . ./.profile;", ' ');
                           ^ ARGHH

No "(":

    tmp = opal_argv_split(" test ! -r ./.profile || . ./.profile;", ' ');

and all is well again :)

Regards,
Mostyn

On Thu, 9 Oct 2008, Jeff Squyres wrote:

> FWIW, the fix has been pushed into the trunk, 1.2.8, and 1.3 SVN branches. So I'll probably take down the hg tree (we use those as temporary branches).
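The one-character bug Mostyn reports is easy to demonstrate outside Open MPI: the fragment is handed to a remote shell, and the stray "(" opens a subshell grouping that is never closed, so ksh/sh rejects the entire command line with a syntax error. A minimal sketch (run in an empty scratch directory; "launched" stands in for the real remote command):

```shell
cd "$(mktemp -d)"   # empty scratch dir, so ./.profile does not exist

fixed=' test ! -r ./.profile || . ./.profile; echo launched'
broken='( test ! -r ./.profile || . ./.profile; echo launched'

# The fixed fragment: sources .profile only if it is readable, then
# runs the command. Prints "launched".
sh -c "$fixed"

# The broken fragment: unbalanced "(" is a syntax error, nothing runs.
sh -c "$broken" 2>/dev/null \
  || echo "broken fragment rejected (unbalanced parenthesis)"
```

The `test ! -r ./.profile || . ./.profile` idiom is what makes the launch work on shells that source nothing for remote non-interactive commands: it sources `.profile` when present and is a harmless no-op otherwise.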
Re: [OMPI users] Problem launching onto Bourne shell
Great, I look forward to 1.2.8!

Hahn

On Oct 9, 2008, Jeff Squyres wrote:

> FWIW, the fix has been pushed into the trunk, 1.2.8, and 1.3 SVN branches. So I'll probably take down the hg tree (we use those as temporary branches).
Re: [OMPI users] Problem launching onto Bourne shell
FWIW, the fix has been pushed into the trunk, 1.2.8, and 1.3 SVN branches. So I'll probably take down the hg tree (we use those as temporary branches).

On Oct 9, 2008, at 2:32 PM, Hahn Kim wrote:

> Hi,
>
> Thanks for providing a fix; sorry for the delay in responding. Once I found out about -x, I've been busy working on the rest of our code, so I haven't had time to try out the fix. I'll take a look at it as soon as I can and will let you know how it works out.
>
> Hahn
Re: [OMPI users] Problem launching onto Bourne shell
Hi,

Thanks for providing a fix; sorry for the delay in responding. Once I found out about -x, I've been busy working on the rest of our code, so I haven't had time to try out the fix. I'll take a look at it as soon as I can and will let you know how it works out.

Hahn

On Oct 7, 2008, at 5:41 PM, Jeff Squyres wrote:

> If you're using Bash, it should be running .bashrc. But it looks like you did identify a bug that we're *not* running .profile. I have a Mercurial branch up with a fix if you want to give it a spin:
>
>     http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/sh-profile-fixes/
Re: [OMPI users] Problem launching onto Bourne shell
On Oct 7, 2008, at 4:19 PM, Hahn Kim wrote:

>> you probably want to set the LD_LIBRARY_PATH (and PATH, likely, and possibly others, such as that LICENSE key, etc.) regardless of whether it's an interactive or non-interactive login.
>
> Right, that's exactly what I want to do. I was hoping that mpirun would run .profile as the FAQ page stated, but the -x fix works for now.

If you're using Bash, it should be running .bashrc. But it looks like you did identify a bug that we're *not* running .profile. I have a Mercurial branch up with a fix if you want to give it a spin:

    http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/sh-profile-fixes/

> I just realized that I'm using .bash_profile on the x86 and need to move its contents into .bashrc and call .bashrc from .bash_profile, since eventually I will also be launching MPI jobs onto other x86 processors.
>
> Thanks to everyone for their help.
>
> Hahn
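The -x workaround mentioned above rests on ordinary Unix environment inheritance: a launched process sees only the variables exported into its environment, which is what `mpirun -x VAR` arranges for the remote ranks. A minimal local sketch (the variable names are invented for illustration; a child `sh` stands in for a launched rank):

```shell
NOT_FORWARDED=local-only        # unexported: children never see it
export FORWARDED=inherited      # exported, as "mpirun -x FORWARDED" would arrange

# The child process inherits only the exported variable:
sh -c 'echo "NOT_FORWARDED=[$NOT_FORWARDED] FORWARDED=[$FORWARDED]"'
# prints: NOT_FORWARDED=[] FORWARDED=[inherited]
```

With `mpirun -x LD_LIBRARY_PATH` the local value is forwarded; with `mpirun -x LD_LIBRARY_PATH=/some/path` an explicit value is set for the remote side instead.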
Re: [OMPI users] Problem launching onto Bourne shell
> you probably want to set the LD_LIBRARY_PATH (and PATH, likely, and possibly others, such as that LICENSE key, etc.) regardless of whether it's an interactive or non-interactive login.

Right, that's exactly what I want to do. I was hoping that mpirun would run .profile as the FAQ page stated, but the -x fix works for now.

I just realized that I'm using .bash_profile on the x86 and need to move its contents into .bashrc and call .bashrc from .bash_profile, since eventually I will also be launching MPI jobs onto other x86 processors.

Thanks to everyone for their help.

Hahn
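The .bash_profile-to-.bashrc split Hahn describes is the common idiom: .bash_profile (read by bash for login shells only) does nothing but source .bashrc, so one file holds all the settings. A minimal sketch of that ~/.bash_profile (illustrative contents, not taken from the thread):

```shell
# Sketch of a minimal ~/.bash_profile. bash reads this file for login
# shells only, so it simply delegates to ~/.bashrc: login and
# non-login shells then share one set of environment settings.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
```

With this in place, the exports live only in .bashrc, which bash also reads when sshd starts it for a remote command (the behavior the FAQ discussion in this thread relies on).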
Re: [OMPI users] Problem launching onto Bourne shell
On Oct 7, 2008, at 12:48 PM, Hahn Kim wrote:

> Regarding 1., we're actually using 1.2.5. We started using Open MPI last winter and just stuck with it. For now, using the -x flag with mpirun works. If this really is a bug in 1.2.7, then I think we'll stick with 1.2.5 for now, then upgrade later when it's fixed.

It looks like this behavior has been the same throughout the entire 1.2 series.

> Regarding 2., are you saying I should run the commands you suggest from the x86 node running bash, so that ssh logs into the Cell node running Bourne?

I'm saying that if "ssh othernode env" gives different answers than "ssh othernode"/"env", then your .bashrc or .profile or whatever is dumping out early depending on whether you have an interactive login or not. This is the real cause of the error -- you probably want to set the LD_LIBRARY_PATH (and PATH, likely, and possibly others, such as that LICENSE key, etc.) regardless of whether it's an interactive or non-interactive login.
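The "dumping out early" failure Jeff describes usually comes from an interactivity guard near the top of .bashrc that stops non-interactive shells before the exports run. One way to avoid it, sketched here with the paths that appear in this thread (the prompt setting is illustrative): put the exports first, unconditionally, and keep the interactive-only settings inside the guard.

```shell
# Sketch of a ~/.bashrc laid out so remote, non-interactive commands
# (e.g. "ssh node env", or an mpirun launch) still get the environment.
# Exports come first, unconditionally.
export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
export PATH=/tools/openmpi-1.2.5/bin:$PATH

# Interactive-only settings stay inside the guard, instead of after an
# early "return" that would also skip the exports above.
case $- in
    *i*)
        PS1='\u@\h:\w\$ '   # illustrative prompt; aliases etc. go here
        ;;
esac
```

A non-interactive shell (no "i" in $-) falls through the case without touching PS1, but the exports have already taken effect.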
Re: [OMPI users] Problem launching onto Bourne shell
Thanks for the feedback.

Regarding 1., we're actually using 1.2.5. We started using Open MPI last winter and just stuck with it. For now, using the -x flag with mpirun works. If this really is a bug in 1.2.7, then I think we'll stick with 1.2.5 for now, then upgrade later when it's fixed.

Regarding 2., are you saying I should run the commands you suggest from the x86 node running bash, so that ssh logs into the Cell node running Bourne? When I run "ssh othernode env" from the x86 node, I get the following vanilla environment:

USER=ha17646
HOME=/home/ha17646
LOGNAME=ha17646
SHELL=/bin/sh
PWD=/home/ha17646

When I run "ssh othernode" from the x86 node, then run "env" on the Cell, I get the following:

USER=ha17646
LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
HOME=/home/ha17646
MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key
LOGNAME=ha17646
TERM=xterm-color
PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/tools/openmpi-1.2.5/bin:/tools/cmake-2.4.7/bin:/tools
SHELL=/bin/sh
PWD=/home/ha17646
TZ=EST5EDT

Hahn
Re: [OMPI users] Problem launching onto Bourne shell
Ralph and I just talked about this a bit:

1. In all released versions of OMPI, we *do* source the .profile file on the target node if it exists (because vanilla Bourne shells do not source anything on remote nodes -- Bash does, though, per the FAQ). However, looking in 1.2.7, it looks like it might not be executing that code -- there *may* be a bug in this area. We're checking into it.

2. You might want to check your configuration to see if your .bashrc is dumping out early because it's a non-interactive shell. Check the output of:

    ssh othernode env

vs.

    ssh othernode
    env

(i.e., a non-interactive running of "env" vs. an interactive login and then running "env")

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] Problem launching onto Bourne shell
I am unaware of anything in the code that would "source .profile" for you. I believe the FAQ page is in error here.

Ralph
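For what it's worth, the eventual fix (quoted at the top of this thread, in orte/mca/plm/rsh/plm_rsh_module.c) has the launcher prepend a guard that sources ./.profile on the remote node when it exists. The idiom can be exercised locally; this sketch runs it in a scratch directory:

```shell
# The guard idiom from the launcher fix: source ./.profile only when
# it is readable, and succeed either way. Run in a scratch directory
# so it is safe to try.
cd "$(mktemp -d)"

# No ./.profile present: the guard is a no-op and still exits 0.
test ! -r ./.profile || . ./.profile

# With a ./.profile in place, its exports take effect in this shell.
echo 'export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32' > .profile
test ! -r ./.profile || . ./.profile
```

The `test ! -r ./.profile || ...` form exits 0 whether or not the file exists, so it can safely be prepended to any remote command line.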
Re: [OMPI users] Problem launching onto Bourne shell
Great, that worked, thanks!

However, it still concerns me that the FAQ page says that mpirun will execute .profile, which doesn't seem to work for me. Are there any configuration issues that could possibly be preventing mpirun from doing this? It would certainly be more convenient if I could maintain my environment in a single .profile file instead of adding what could potentially be a lot of -x arguments to my mpirun command.

Hahn
Re: [OMPI users] Problem launching onto Bourne shell
You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an alternative you can set specific values with mpirun -x LD_LIBRARY_PATH=/some/where:/some/where/else. More information with mpirun --help (or man mpirun).

Aurelien

Le 6 oct. 08 à 16:06, Hahn Kim a écrit :

> Hi, I'm having difficulty launching an Open MPI job onto a machine that is running the Bourne shell. Here's my basic setup. I have two machines: one is an x86-based machine running bash, and the other is a Cell-based machine running the Bourne shell. I'm running mpirun from the x86 machine, which launches a C++ MPI application onto the Cell machine. I get the following error:
>
> error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
>
> The basic problem is that LD_LIBRARY_PATH needs to be set to the directory that contains libstdc++.so.6 for the Cell. I set the following line in .profile:
>
> export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
>
> which is the path to the PPC libraries for Cell. Now if I log directly into the Cell machine and run the program directly from the command line, I don't get the above error. But mpirun still fails, even after setting LD_LIBRARY_PATH in .profile.
>
> As a sanity check, I did the following. I ran the following command from the x86 machine:
>
> mpirun -np 1 --host cab0 env
>
> which, among other things, shows me the following value:
>
> LD_LIBRARY_PATH=/tools/openmpi-1.2.5/lib:
>
> If I log into the Cell machine and run env directly from the command line, I get the following value:
>
> LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
>
> So it appears that .profile gets sourced when I log in but not when mpirun runs. However, according to the Open MPI FAQ (http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path), mpirun is supposed to directly call .profile since the Bourne shell doesn't automatically call it for non-interactive shells. Does anyone have any insight as to why my environment isn't being set properly?
>
> Thanks!
>
> Hahn
>
> --
> Hahn Kim, h...@ll.mit.edu
> MIT Lincoln Laboratory
> 244 Wood St., Lexington, MA 02420
> Tel: 781-981-0940, Fax: 781-981-5255

--
* Dr. Aurélien Bouteiller
* Sr. Research Associate at Innovative Computing Laboratory
* University of Tennessee
* 1122 Volunteer Boulevard, suite 350
* Knoxville, TN 37996
* 865 974 6321

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users