This problem is occurring because your Fortran compiler wasn't built with debug symbols: warning: Could not find object file "/Users/admin/build/i386-apple-darwin9.0.0/libgcc/_udiv_w_sdiv_s.o" - no debug information available for "../../../gcc-4.3-20071026/libgcc/../gcc/libgcc2.c".

The same problem affects anyone using LLVM in Xcode: there are no debug symbols, so you cannot create a debug build. Try creating a release build and see if it compiles at all, and try the gfortran from MacPorts; it works smoothly.
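
For what it's worth, here is roughly the workflow I mean (just a sketch: the MacPorts port name, the gfortran binary name, and "floyd.f90" as your source file are assumptions, so adjust them to your setup):

  # install gfortran from MacPorts (the gcc43 port provides gfortran-mp-4.3)
  sudo port install gcc43

  # rebuild Open MPI against that compiler, keeping debug symbols
  ./configure --prefix=/usr/local FC=gfortran-mp-4.3 F77=gfortran-mp-4.3 CFLAGS=-g FCFLAGS=-g FFLAGS=-g
  make all install

  # compile your program with debug info and no optimization
  /usr/local/bin/mpif90 -g -O0 -o floyd floyd.f90

  # gdb should now resolve line numbers for your breakpoints
  gdb ./floyd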


On 09-05-05, at 17:33, Jeff Squyres wrote:

I agree; that is a bummer.  :-(

Warner -- do you have any advice here, perchance?


On May 4, 2009, at 7:26 PM, Vicente Puig wrote:

But it doesn't work well.

For example, I am trying to debug a program ("floyd" in this case), and when I set a breakpoint I get:

No line 26 in file "../../../gcc-4.2-20060805/libgfortran/fmain.c".

I am getting disappointed and frustrated that I cannot work well with Open MPI on my Mac. There should be a way to make it run in Xcode, uff...

2009/5/4 Jeff Squyres <jsquy...@cisco.com>
I get those as well. I believe that they are (annoying but) harmless -- an artifact of how the freeware gcc/gfortran that I use was built.



On May 4, 2009, at 1:47 PM, Vicente Puig wrote:

Maybe I should have opened a new thread, but do you have any idea why I get the following when I use gdb to debug an Open MPI program:

warning: Could not find object file "/Users/admin/build/i386-apple-darwin9.0.0/libgcc/_umoddi3_s.o" - no debug information available for "../../../gcc-4.3-20071026/libgcc/../gcc/libgcc2.c".

warning: Could not find object file "/Users/admin/build/i386-apple-darwin9.0.0/libgcc/_udiv_w_sdiv_s.o" - no debug information available for "../../../gcc-4.3-20071026/libgcc/../gcc/libgcc2.c".

warning: Could not find object file "/Users/admin/build/i386-apple-darwin9.0.0/libgcc/_udivmoddi4_s.o" - no debug information available for "../../../gcc-4.3-20071026/libgcc/../gcc/libgcc2.c".

warning: Could not find object file "/Users/admin/build/i386-apple-darwin9.0.0/libgcc/unwind-dw2_s.o" - no debug information available for "../../../gcc-4.3-20071026/libgcc/../gcc/unwind-dw2.c".

warning: Could not find object file "/Users/admin/build/i386-apple-darwin9.0.0/libgcc/unwind-dw2-fde-darwin_s.o" - no debug information available for "../../../gcc-4.3-20071026/libgcc/../gcc/unwind-dw2-fde-darwin.c".

warning: Could not find object file "/Users/admin/build/i386-apple-darwin9.0.0/libgcc/unwind-c_s.o" - no debug information available for "../../../gcc-4.3-20071026/libgcc/../gcc/unwind-c.c".
.......



There is no 'admin' user on my machine, so I don't know why it happens. It works fine with a C program.

Any idea?

Thanks.


Vincent





2009/5/4 Vicente Puig <vpui...@gmail.com>
I can run Open MPI perfectly from the command line, but I wanted a graphical interface for debugging because I was having problems.

Thanks anyway.

Vincent

2009/5/4 Warner Yuen <wy...@apple.com>

Admittedly, I don't use Xcode to build Open MPI either.

You can just compile Open MPI from the command line and install everything in /usr/local/. Make sure that gfortran is in your path and you should just be able to do a './configure --prefix=/usr/local'.

After the installation, just make sure that your path is set correctly when you go to use the newly installed Open MPI. If you don't set your path, it will always default to using the version of OpenMPI that ships with Leopard.
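
Concretely, that amounts to something like the following (a sketch; the version directory and shell startup file are assumptions, so adjust them to your setup):

  cd openmpi-1.3.1                  # whichever version you downloaded
  ./configure --prefix=/usr/local
  make all install                  # the install step may need sudo

  # put the new wrappers ahead of the ones Leopard ships in /usr/bin
  export PATH=/usr/local/bin:$PATH  # add this to ~/.profile to make it permanent
  which mpif90                      # should now print /usr/local/bin/mpif90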


Warner Yuen
Scientific Computing
Consulting Engineer
Apple, Inc.
email: wy...@apple.com
Tel: 408.718.2859




On May 4, 2009, at 9:13 AM, users-requ...@open-mpi.org wrote:

Send users mailing list submissions to
     us...@open-mpi.org

To subscribe or unsubscribe via the World Wide Web, visit
     http://www.open-mpi.org/mailman/listinfo.cgi/users
or, via email, send a message with subject or body 'help' to
     users-requ...@open-mpi.org

You can reach the person managing the list at
     users-ow...@open-mpi.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of users digest..."


Today's Topics:

1. Re: How do I compile OpenMPI in Xcode 3.1 (Vicente Puig)


----------------------------------------------------------------------

Message: 1
Date: Mon, 4 May 2009 18:13:45 +0200
From: Vicente Puig <vpui...@gmail.com>
Subject: Re: [OMPI users] How do I compile OpenMPI in Xcode 3.1
To: Open MPI Users <us...@open-mpi.org>
Message-ID:
     <3e9a21680905040913u3f36d3c9rdcd3413bfdcd...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

If I cannot make it work with Xcode, which one could I use? Which one do you use to compile and debug Open MPI?
Thanks

Vincent


2009/5/4 Jeff Squyres <jsquy...@cisco.com>

Open MPI comes pre-installed in Leopard; as Warner noted, since Leopard doesn't ship with a Fortran compiler, the Open MPI that Apple ships has non-functional mpif77 and mpif90 wrapper compilers.

So the Open MPI that you installed manually will use your Fortran compilers, and therefore will have functional mpif77 and mpif90 wrapper compilers. Hence, you probably need to be sure to use the "right" wrapper compilers. It looks like you specified the full path in ExecPath, so I'm not sure why Xcode wouldn't work with that (like I mentioned, I unfortunately don't use Xcode myself, so I don't know why that wouldn't work).
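
A quick way to check which wrapper you are actually picking up, and what it invokes, is something like this (a sketch; --showme is the Open MPI wrapper option that prints the underlying compile command):

  which mpif90                      # which wrapper is first in your PATH?
  /usr/local/bin/mpif90 --showme    # your own build: should print the gfortran command line it wraps
  /usr/bin/mpif90 --showme          # Apple's build: should report that Fortran 90 support is missing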




On May 4, 2009, at 11:53 AM, Vicente wrote:

Yes, I already have the gfortran compiler in /usr/local/bin, the same path as my mpif90 compiler. But I've seen that when I use the mpif90 in /usr/bin or in /Developer/usr/bin, it says:

"Unfortunately, this installation of Open MPI was not compiled with
Fortran 90 support.  As such, the mpif90 compiler is non-functional."


That should be the problem; I will have to change the path to use the gfortran I have installed.
How could I do it? (Sorry, I am a beginner.)

Thanks.


On 04/05/2009, at 17:38, Warner Yuen wrote:

Have you installed a Fortran compiler? Mac OS X's developer tools do not come with a Fortran compiler, so you'll need to install one if you haven't already done so. I routinely use the Intel IFORT compilers with success. However, I hear many good things about the gfortran compilers on Mac OS X; you can't beat the price of gfortran!


Warner Yuen
Scientific Computing
Consulting Engineer
Apple, Inc.
email: wy...@apple.com
Tel: 408.718.2859




On May 4, 2009, at 7:28 AM, users-requ...@open-mpi.org wrote:



Today's Topics:

1. How do I compile OpenMPI in Xcode 3.1 (Vicente)
2. Re: 1.3.1 -rf rankfile behaviour ?? (Ralph Castain)


----------------------------------------------------------------------

Message: 1
Date: Mon, 4 May 2009 16:12:44 +0200
From: Vicente <vpui...@gmail.com>
Subject: [OMPI users] How do I compile OpenMPI in Xcode 3.1
To: us...@open-mpi.org
Message-ID: <1c2c0085-940f-43bb-910f-975871ae2...@gmail.com>
Content-Type: text/plain; charset="windows-1252"; Format="flowed";
 DelSp="yes"

Hi, I've seen the FAQ "How do I use Open MPI wrapper compilers in
Xcode", but it's only for MPICC. I am using MPIF90, so I did the
same,
but changing MPICC for MPIF90, and also the path, but it did not
work.

Building target "fortran" of project "fortran" with configuration "Debug"


Checking Dependencies
Invalid value 'MPIF90' for GCC_VERSION


The file "MPIF90.cpcompspec" looks like this:

/**
        Xcode Compiler Specification for MPIF90
*/

{   Type = Compiler;
    Identifier = com.apple.compilers.mpif90;
    BasedOn = com.apple.compilers.gcc.4_0;
    Name = "MPIF90";
    Version = "Default";
    Description = "MPI GNU C/C++ Compiler 4.0";
    ExecPath = "/usr/local/bin/mpif90";      // This gets converted to the g++ variant automatically
    PrecompStyle = pch;
}

and is located in "/Developer/Library/Xcode/Plug-ins"

and when I do mpif90 -v on terminal it works well:

Using built-in specs.
Target: i386-apple-darwin8.10.1
Configured with: /tmp/gfortran-20090321/ibin/../gcc/configure --prefix=/usr/local/gfortran --enable-languages=c,fortran --with-gmp=/tmp/gfortran-20090321/gfortran_libs --enable-bootstrap
Thread model: posix
gcc version 4.4.0 20090321 (experimental) [trunk revision 144983] (GCC)


Any idea?

Thanks.

Vincent

------------------------------

Message: 2
Date: Mon, 4 May 2009 08:28:26 -0600
From: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
To: Open MPI Users <us...@open-mpi.org>
Message-ID:
 <71d2d8cc0905040728h2002f4d7s4c49219eee29e...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Unfortunately, I didn't write any of that code - I was just fixing the mapper so it would properly map the procs. From what I can tell, the proper things are happening there.

I'll have to dig into the code that specifically deals with parsing the results to bind the processes. Afraid that will take awhile longer - pretty dark in that hole.


On Mon, May 4, 2009 at 8:04 AM, Geoffroy Pignot
<geopig...@gmail.com> wrote:

Hi,

So, there are no more crashes with my "crazy" mpirun command. But the paffinity feature seems to be broken. Indeed I am not able to pin my processes.

Simple test with a program using your plpa library :

r011n006% cat hostf
r011n006 slots=4

r011n006% cat rankf
rank 0=r011n006 slot=0   ----> bind to CPU 0 , exact ?

r011n006% /tmp/HALMPI/openmpi-1.4a/bin/mpirun --hostfile hostf --rankfile rankf --wdir /tmp -n 1 a.out
PLPA Number of processors online: 4
PLPA Number of processor sockets: 2
PLPA Socket 0 (ID 0): 2 cores
PLPA Socket 1 (ID 3): 2 cores

Ctrl+Z
r011n006%bg

r011n006% ps axo stat,user,psr,pid,pcpu,comm | grep gpignot
R+   gpignot    3  9271 97.8 a.out

In fact, whatever slot number I put in my rankfile, a.out always runs on CPU 3. I was expecting it on CPU 0 according to my cpuinfo file (see below).
The result is the same if I try another syntax (rank 0=r011n006 slot=0:0 --> bind to socket 0 - core 0, exact?).

Thanks in advance

Geoffroy

PS: I run on rhel5

r011n006% uname -a
Linux r011n006 2.6.18-92.1.1NOMAP32.el5 #1 SMP Sat Mar 15 01:46:39 CDT 2008 x86_64 x86_64 x86_64 GNU/Linux

My configure is:
./configure --prefix=/tmp/openmpi-1.4a --libdir='${exec_prefix}/lib64' --disable-dlopen --disable-mpi-cxx --enable-heterogeneous

r011n006% cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5150  @ 2.66GHz
stepping        : 6
cpu MHz         : 2660.007
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 5323.68
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5150  @ 2.66GHz
stepping        : 6
cpu MHz         : 2660.007
cache size      : 4096 KB
physical id     : 3
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 5320.03
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5150  @ 2.66GHz
stepping        : 6
cpu MHz         : 2660.007
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 5319.39
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5150  @ 2.66GHz
stepping        : 6
cpu MHz         : 2660.007
cache size      : 4096 KB
physical id     : 3
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 5320.03
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:


------------------------------

Message: 2
Date: Mon, 4 May 2009 04:45:57 -0600
From: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <d01d7b16-4b47-46f3-ad41-d1a90b2e4...@open-mpi.org>

Content-Type: text/plain; charset="us-ascii"; Format="flowed";
 DelSp="yes"

My apologies - I wasn't clear enough. You need a tarball from r21111 or greater... such as:

http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r21142.tar.gz

HTH
Ralph


On May 4, 2009, at 2:14 AM, Geoffroy Pignot wrote:

Hi ,

I got the openmpi-1.4a1r21095.tar.gz tarball, but unfortunately my
command doesn't work

cat rankf:
rank 0=node1 slot=*
rank 1=node2 slot=*

cat hostf:
node1 slots=2
node2 slots=2

mpirun --rankfile rankf --hostfile hostf --host node1 -n 1 hostname : --host node2 -n 1 hostname

Error, invalid rank (1) in the rankfile (rankf)



--------------------------------------------------------------------------
[r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 403
[r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 86
[r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 86
[r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 1016


Ralph, could you tell me if my command syntax is correct or not? If not, could you give me the expected one?

Regards

Geoffroy




2009/4/30 Geoffroy Pignot <geopig...@gmail.com>
Immediately Sir !!! :)

Thanks again Ralph

Geoffroy





------------------------------

Message: 2
Date: Thu, 30 Apr 2009 06:45:39 -0600
From: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
To: Open MPI Users <us...@open-mpi.org>
Message-ID:
 <71d2d8cc0904300545v61a42fe1k50086d2704d0f...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

I believe this is fixed now in our development trunk - you can download any tarball starting from last night and give it a try, if you like. Any feedback would be appreciated.

Ralph


On Apr 14, 2009, at 7:57 AM, Ralph Castain wrote:

Ah now, I didn't say it -worked-, did I? :-)

Clearly a bug exists in the program. I'll try to take a look at it (if Lenny doesn't get to it first), but it won't be until later in the week.

On Apr 14, 2009, at 7:18 AM, Geoffroy Pignot wrote:

I agree with you Ralph, and that's what I expect from Open MPI, but my second example shows that it's not working:

cat hostfile.0
r011n002 slots=4
r011n003 slots=4

cat rankfile.0
rank 0=r011n002 slot=0
rank 1=r011n003 slot=1

mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname
### CRASHED

Error, invalid rank (1) in the rankfile (rankfile.0)




--------------------------------------------------------------------------
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 404
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985




--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.




--------------------------------------------------------------------------




--------------------------------------------------------------------------
orterun noticed that the job aborted, but has no info as to the process that caused that situation.




--------------------------------------------------------------------------
orterun: clean termination accomplished



Message: 4
Date: Tue, 14 Apr 2009 06:55:58 -0600
From: Ralph Castain <r...@lanl.gov>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <f6290ada-a196-43f0-a853-cbcb802d8...@lanl.gov>
Content-Type: text/plain; charset="us-ascii"; Format="flowed";
DelSp="yes"

The rankfile cuts across the entire job - it isn't applied on an app_context basis. So the ranks in your rankfile must correspond to the eventual rank of each process in the cmd line.

Unfortunately, that means you have to count ranks. In your case, you only have four, so that makes life easier. Your rankfile would look something like this:

rank 0=r001n001 slot=0
rank 1=r001n002 slot=1
rank 2=r001n001 slot=1
rank 3=r001n002 slot=2
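
For example (just a sketch, reusing the hosts and executables from your command below; "myrankfile" is a placeholder name), the matching invocation would be along the lines of:

  mpirun -rf myrankfile -n 1 -host r001n001 master.x options1 : -n 1 -host r001n002 master.x options2 : -n 1 -host r001n001 slave.x options3 : -n 1 -host r001n002 slave.x options4

where ranks 0-3 in the rankfile line up, in order, with the four app contexts on the command line.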

HTH
Ralph

On Apr 14, 2009, at 12:19 AM, Geoffroy Pignot wrote:

Hi,

I agree that my examples are not very clear. What I want to do is to launch a multi-exe application (masters-slaves) and benefit from processor affinity. Could you show me how to convert this command, using the -rf option (whatever the affinity is)?

mpirun -n 1 -host r001n001 master.x options1 : -n 1 -host r001n002 master.x options2 : -n 1 -host r001n001 slave.x options3 : -n 1 -host r001n002 slave.x options4

Thanks for your help

Geoffroy





Message: 2
Date: Sun, 12 Apr 2009 18:26:35 +0300
From: Lenny Verkhovsky <lenny.verkhov...@gmail.com>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??
To: Open MPI Users <us...@open-mpi.org>
Message-ID:

<453d39990904120826t2e1d1d33l7bb1fe3de65b5...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

The first "crash" is OK, since your rankfile has ranks 0 and 1
defined,
while n=1, which means only rank 0 is present and can be
allocated.

NP must be >= the largest rank in rankfile.
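
For instance, with ranks 0 and 1 defined in rankfile.0 you need at least two processes in total, e.g. (sketch):

  mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2 hostname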

What exactly are you trying to do ?

I tried to recreate your segv but all I got was

~/work/svn/ompi/trunk/build_x86-64/install/bin/mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname
[witch19:30798] mca: base: component_find: paffinity "mca_paffinity_linux" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored



--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):

opal_carto_base_select failed
--> Returned value -13 instead of OPAL_SUCCESS



--------------------------------------------------------------------------
[witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ../../orte/runtime/orte_init.c at line 78
[witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ../../orte/orted/orted_main.c at line 344



--------------------------------------------------------------------------
A daemon (pid 11629) died unexpectedly with status 243 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.



--------------------------------------------------------------------------



--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.



--------------------------------------------------------------------------
mpirun: clean termination accomplished


Lenny.


On 4/10/09, Geoffroy Pignot <geopig...@gmail.com> wrote:

Hi ,

I am currently testing the process affinity capabilities of Open MPI and I would like to know if the rankfile behaviour I describe below is normal or not.

cat hostfile.0
r011n002 slots=4
r011n003 slots=4

cat rankfile.0
rank 0=r011n002 slot=0
rank 1=r011n003 slot=1






##################################################################################

mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2 hostname   ### OK
r011n002
r011n003






##################################################################################
but
mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname   ### CRASHED




--------------------------------------------------------------------------
Error, invalid rank (1) in the rankfile (rankfile.0)




--------------------------------------------------------------------------
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 404
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985




--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.




--------------------------------------------------------------------------




--------------------------------------------------------------------------
orterun noticed that the job aborted, but has no info as to the process that caused that situation.




--------------------------------------------------------------------------
orterun: clean termination accomplished
It seems that the rankfile option is not propagated to the second command line; there is no global understanding of the ranking inside an mpirun command.






##################################################################################

Assuming that, I tried to provide a rankfile to each command line:

cat rankfile.0
rank 0=r011n002 slot=0

cat rankfile.1
rank 0=r011n003 slot=1

mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname   ### CRASHED
[r011n002:28778] *** Process received signal ***
[r011n002:28778] Signal: Segmentation fault (11)
[r011n002:28778] Signal code: Address not mapped (1)
[r011n002:28778] Failing at address: 0x34
[r011n002:28778] [ 0] [0xffffe600]
[r011n002:28778] [ 1] /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.0(orte_odls_base_default_get_add_procs_data+0x55d) [0x5557decd]
[r011n002:28778] [ 2] /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x117) [0x555842a7]
[r011n002:28778] [ 3] /tmp/HALMPI/openmpi-1.3.1/lib/openmpi/mca_plm_rsh.so [0x556098c0]
[r011n002:28778] [ 4] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x804aa27]
[r011n002:28778] [ 5] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x804a022]
[r011n002:28778] [ 6] /lib/libc.so.6(__libc_start_main+0xdc) [0x9f1dec]
[r011n002:28778] [ 7] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x8049f71]
[r011n002:28778] *** End of error message ***
Segmentation fault (core dumped)



I hope that I've found a bug, because it would be very important for me to have this kind of capability: launch a multi-exe mpirun command line and be able to bind my exes and sockets together.

Thanks in advance for your help

Geoffroy
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

End of users Digest, Vol 1202, Issue 2
**************************************

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

End of users Digest, Vol 1218, Issue 2
**************************************


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

End of users Digest, Vol 1221, Issue 3
**************************************



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

End of users Digest, Vol 1221, Issue 6
**************************************

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

End of users Digest, Vol 1221, Issue 12
***************************************

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
Cisco Systems

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
Cisco Systems

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
