[OMPI users] Problem with sending messages from one of the machines

2010-11-10 Thread Grzegorz Maj
POLLIN}, {fd=8, events=POLLIN}, {fd=9,
events=POLLIN}], 6, 0) = 0 (Timeout)
poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6,
events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9,
events=POLLIN}], 6, 0) = 0 (Timeout)
poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6,
events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9,
events=POLLIN}], 6, 0) = 0 (Timeout)
poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6,
events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9,
events=POLLIN}], 6, 0) = 0 (Timeout)
...
(forever)
...
--

To me it looks like the connect above is responsible for establishing the
connection, but I'm afraid I don't understand what those poll calls are
supposed to do.

Attaching gdb to the sender gives me:

--
(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0x0064993b in poll () from /lib/libc.so.6
#2  0xf7df07b5 in poll_dispatch () from /home/gmaj/openmpi/lib/libopen-pal.so.0
#3  0xf7def8c3 in opal_event_base_loop () from
/home/gmaj/openmpi/lib/libopen-pal.so.0
#4  0xf7defbe7 in opal_event_loop () from
/home/gmaj/openmpi/lib/libopen-pal.so.0
#5  0xf7de323b in opal_progress () from /home/gmaj/openmpi/lib/libopen-pal.so.0
#6  0xf7c51455 in mca_pml_ob1_send () from
/home/gmaj/openmpi/lib/openmpi/mca_pml_ob1.so
#7  0xf7ed9c60 in PMPI_Send () from /home/gmaj/openmpi/lib/libmpi.so.0
#8  0x0804e900 in main ()
--

If anybody knows what may cause this problem, or what I could do to find
the reason, any help would be appreciated.

My Open MPI version is 1.4.1.


Regards,
Grzegorz Maj


Re: [OMPI users] Problem with sending messages from one of the machines

2010-11-17 Thread Grzegorz Maj
2010/11/11 Jeff Squyres :
> On Nov 11, 2010, at 3:23 PM, Krzysztof Zarzycki wrote:
>
>> No, unfortunately specification of interfaces is a little more 
>> complicated...  eth0/1/2 is not common for both machines.
>
> Can you define "common"?  Do you mean that eth0 on one machine is on a 
> different network then eth0 on the other machine?
>
> Is there any way that you can make them the same?  It would certainly make 
> things easier.

Yes, they are on different networks and unfortunately we are not
allowed to play with this.

>
>> I've tried to play with (oob/btl)_tcp_ if_include, but actually... I don't 
>> know exactly how.
>
> See my other mail:
>
>    http://www.open-mpi.org/community/lists/users/2010/11/14737.php
>
>> Anyway, do you have any ideas how to further debug the communication problem?
>
> The connect() is not getting through somehow.  Sadly, we don't have enough 
> debug messages to show exactly what is going wrong when these kinds of things 
> happen; I have a half-finished branch that has much better debug/error 
> messages, but I've never had the time to finish it (indeed, I think there's a 
> bug in that development branch right now, otherwise I'd recommend giving it a 
> whirl).  :-\

Analyzing the strace output of both processes shows that on both sides the
call to 'poll' after connect/accept succeeds. As I understand it, they even
exchange some information, which is always 8 bytes, like
D\227\0\1\0\0\0\0. One of them sends this information and the other
receives it. But after receiving, it does:


recv(8, "\5g\0\1\0\0\0\0", 8, 0)= 8
fcntl64(8, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(8, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
getpeername(8, {sa_family=AF_INET, sin_port=htons(57885),
sin_addr=inet_addr("10.0.0.2")}, [16]) = 0
close(8)


In a working scenario (on other machines), after receiving, these bytes
are sent back and then the proper communication proceeds (my 'hello'
message is sent).

The address 10.0.0.2 above is eth2 on the host machine, which indeed
should be used in this communication.

While playing with the network interfaces it turned out that when we bring
down one of the aliases (eth2:0), it starts working. How can we force
mpirun not to use this alias when it is up? We tried using
(oob/btl)_tcp_if_exclude and specifying eth2:0, but it doesn't seem to
help.
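
For reference, this is roughly how we invoked it (the hostnames and the
binary are placeholders, and I may have the syntax slightly wrong):

mpirun --mca oob_tcp_if_exclude eth2:0 --mca btl_tcp_if_exclude lo,eth2:0 \
    -np 2 --host host1,host2 ./hello

(we kept lo in the btl exclude list, since as far as I understand setting
btl_tcp_if_exclude replaces the default value instead of adding to it).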

Regards,
Grzegorz


>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>



[OMPI users] Segmentation fault in mca_pml_ob1.so

2010-12-06 Thread Grzegorz Maj
Hi,
I'm using MKL ScaLAPACK in my project. Recently I was trying to run my
application on a new set of nodes. Unfortunately, when I try to execute
more than about 20 processes, I get a segmentation fault.

[compn7:03552] *** Process received signal ***
[compn7:03552] Signal: Segmentation fault (11)
[compn7:03552] Signal code: Address not mapped (1)
[compn7:03552] Failing at address: 0x20b2e68
[compn7:03552] [ 0] /lib64/libpthread.so.0(+0xf3c0) [0x7f46e0fc33c0]
[compn7:03552] [ 1]
/home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0xd577)
[0x7f46dd093577]
[compn7:03552] [ 2]
/home/gmaj/lib/openmpi/lib/openmpi/mca_btl_tcp.so(+0x5b4c)
[0x7f46dc5edb4c]
[compn7:03552] [ 3]
/home/gmaj/lib/openmpi/lib/libopen-pal.so.0(+0x1dbe8) [0x7f46e0679be8]
[compn7:03552] [ 4]
/home/gmaj/lib/openmpi/lib/libopen-pal.so.0(opal_progress+0xa1)
[0x7f46e066dbf1]
[compn7:03552] [ 5]
/home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x5945)
[0x7f46dd08b945]
[compn7:03552] [ 6]
/home/gmaj/lib/openmpi/lib/libmpi.so.0(MPI_Send+0x6a) [0x7f46e0b4f10a]
[compn7:03552] [ 7] /home/gmaj/matrix/matrix(BI_Ssend+0x21) [0x49cc11]
[compn7:03552] [ 8] /home/gmaj/matrix/matrix(BI_IdringBR+0x79) [0x49c579]
[compn7:03552] [ 9] /home/gmaj/matrix/matrix(ilp64_Cdgebr2d+0x221) [0x495bb1]
[compn7:03552] [10] /home/gmaj/matrix/matrix(Cdgebr2d+0xd0) [0x47ffb0]
[compn7:03552] [11]
/home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so(PB_CInV2+0x1304)
[0x7f46e27f5124]
[compn7:03552] *** End of error message ***

This error appears during some ScaLAPACK computation. My processes do
some MPI communication before the error appears.

I found out that by lowering the btl_tcp_eager_limit and
btl_tcp_max_send_size parameters I can run more processes - the smaller
those values are, the more processes I can run. Unfortunately, this way
I've only managed to run up to 30 processes, which is still far too few.
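
For the record, this is the kind of invocation I mean (the values here are
just examples, not the ones I actually used):

mpirun --mca btl_tcp_eager_limit 32768 --mca btl_tcp_max_send_size 32768 \
    -np 30 --hostfile hosts ./matrix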

What valgrind says may be a clue:

==3894== Syscall param writev(vector[...]) points to uninitialised byte(s)
==3894==at 0x82D009B: writev (in /lib64/libc-2.12.90.so)
==3894==by 0xBA2136D: mca_btl_tcp_frag_send (in
/home/gmaj/lib/openmpi/lib/openmpi/mca_btl_tcp.so)
==3894==by 0xBA203D0: mca_btl_tcp_endpoint_send (in
/home/gmaj/lib/openmpi/lib/openmpi/mca_btl_tcp.so)
==3894==by 0xB003583: mca_pml_ob1_send_request_start_rdma (in
/home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so)
==3894==by 0xAFFA7C9: mca_pml_ob1_send (in
/home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so)
==3894==by 0x6D4B109: PMPI_Send (in /home/gmaj/lib/openmpi/lib/libmpi.so.0)
==3894==by 0x49CC10: BI_Ssend (in /home/gmaj/matrix/matrix)
==3894==by 0x49C578: BI_IdringBR (in /home/gmaj/matrix/matrix)
==3894==by 0x495BB0: ilp64_Cdgebr2d (in /home/gmaj/matrix/matrix)
==3894==by 0x47FFAF: Cdgebr2d (in /home/gmaj/matrix/matrix)
==3894==by 0x51B38E0: PB_CInV2 (in
/home/gmaj/lib/intel_mkl/10.2.6/lib/em64t/libmkl_scalapack_ilp64.so)
==3894==by 0x51DB89B: PB_CpgemmAB (in
/home/gmaj/lib/intel_mkl/10.2.6/lib/em64t/libmkl_scalapack_ilp64.so)
==3894==  Address 0xadecdce is 461,886 bytes inside a block of size
527,544 alloc'd
==3894==at 0x4C2615D: malloc (vg_replace_malloc.c:195)
==3894==by 0x6D0BBA3: ompi_free_list_grow (in
/home/gmaj/lib/openmpi/lib/libmpi.so.0)
==3894==by 0xBA1E1A4: mca_btl_tcp_component_init (in
/home/gmaj/lib/openmpi/lib/openmpi/mca_btl_tcp.so)
==3894==by 0x6D5C909: mca_btl_base_select (in
/home/gmaj/lib/openmpi/lib/libmpi.so.0)
==3894==by 0xB40E950: mca_bml_r2_component_init (in
/home/gmaj/lib/openmpi/lib/openmpi/mca_bml_r2.so)
==3894==by 0x6D5C07E: mca_bml_base_init (in
/home/gmaj/lib/openmpi/lib/libmpi.so.0)
==3894==by 0xAFF8A0E: mca_pml_ob1_component_init (in
/home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so)
==3894==by 0x6D663B2: mca_pml_base_select (in
/home/gmaj/lib/openmpi/lib/libmpi.so.0)
==3894==by 0x6D25D20: ompi_mpi_init (in
/home/gmaj/lib/openmpi/lib/libmpi.so.0)
==3894==by 0x6D45987: PMPI_Init_thread (in
/home/gmaj/lib/openmpi/lib/libmpi.so.0)
==3894==by 0x42490A: MPI::Init_thread(int&, char**&, int)
(functions_inln.h:150)
==3894==by 0x41F483: main (matrix.cpp:83)

I've tried configuring Open MPI with the --without-memory-manager option,
but it didn't help.

I can successfully run exactly the same application on other machines,
even with over 800 nodes.

Does anyone have any idea how to further debug this issue? Any help
would be appreciated.

Thanks,
Grzegorz Maj


Re: [OMPI users] Segmentation fault in mca_pml_ob1.so

2010-12-07 Thread Grzegorz Maj
An update on this issue: I've attached gdb to the crashing application
and I got:

-
Program received signal SIGSEGV, Segmentation fault.
mca_pml_ob1_send_request_put (sendreq=0x130c480, btl=0xc49850,
hdr=0xd10e60) at pml_ob1_sendreq.c:1231
1231pml_ob1_sendreq.c: No such file or directory.
in pml_ob1_sendreq.c
(gdb) bt
#0  mca_pml_ob1_send_request_put (sendreq=0x130c480, btl=0xc49850,
hdr=0xd10e60) at pml_ob1_sendreq.c:1231
#1  0x7fc55bf31693 in mca_btl_tcp_endpoint_recv_handler (sd=<value
optimized out>, flags=<value optimized out>, user=<value optimized out>)
at btl_tcp_endpoint.c:718
#2  0x7fc55fff7de4 in event_process_active (base=0xc1daf0,
flags=2) at event.c:651
#3  opal_event_base_loop (base=0xc1daf0, flags=2) at event.c:823
#4  0x7fc55ffe9ff1 in opal_progress () at runtime/opal_progress.c:189
#5  0x7fc55c9d7115 in opal_condition_wait (addr=<value optimized out>,
count=<value optimized out>, datatype=<value optimized out>,
src=<value optimized out>, tag=<value optimized out>,
comm=<value optimized out>, status=0xcc6100) at
../../../../opal/threads/condition.h:99
#6  ompi_request_wait_completion (addr=<value optimized out>,
count=<value optimized out>, datatype=<value optimized out>,
src=<value optimized out>, tag=<value optimized out>,
comm=<value optimized out>, status=0xcc6100) at
../../../../ompi/request/request.h:375
#7  mca_pml_ob1_recv (addr=<value optimized out>, count=<value optimized
out>, datatype=<value optimized out>, src=<value optimized out>,
tag=<value optimized out>, comm=<value optimized out>,
status=0xcc6100) at pml_ob1_irecv.c:104
#8  0x7fc560511260 in PMPI_Recv (buf=0x0, count=12884048,
type=0xd10410, source=-1, tag=0, comm=0xd0daa0, status=0xcc6100) at
precv.c:75
#9  0x0049cc43 in BI_Srecv ()
#10 0x0049c555 in BI_IdringBR ()
#11 0x00495ba1 in ilp64_Cdgebr2d ()
#12 0x0047ffa0 in Cdgebr2d ()
#13 0x7fc5621da8e1 in PB_CInV2 () from
/home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so
#14 0x7fc56220289c in PB_CpgemmAB () from
/home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so
#15 0x7fc5622b28fd in pdgemm_ () from
/home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so
-

So it looks like the line responsible for the segmentation fault is:
mca_bml_base_endpoint_t *bml_endpoint = sendreq->req_endpoint;

I repeated this several times: it always crashes on the same line.

I have no idea what to do with this. Again, any help would be appreciated.

Thanks,
Grzegorz Maj



2010/12/6 Grzegorz Maj :
> Hi,
> I'm using mkl scalapack in my project. Recently, I was trying to run
> my application on new set of nodes. Unfortunately, when I try to
> execute more than about 20 processes, I get segmentation fault.
>
> [compn7:03552] *** Process received signal ***
> [compn7:03552] Signal: Segmentation fault (11)
> [compn7:03552] Signal code: Address not mapped (1)
> [compn7:03552] Failing at address: 0x20b2e68
> [compn7:03552] [ 0] /lib64/libpthread.so.0(+0xf3c0) [0x7f46e0fc33c0]
> [compn7:03552] [ 1]
> /home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0xd577)
> [0x7f46dd093577]
> [compn7:03552] [ 2]
> /home/gmaj/lib/openmpi/lib/openmpi/mca_btl_tcp.so(+0x5b4c)
> [0x7f46dc5edb4c]
> [compn7:03552] [ 3]
> /home/gmaj/lib/openmpi/lib/libopen-pal.so.0(+0x1dbe8) [0x7f46e0679be8]
> [compn7:03552] [ 4]
> (home/gmaj/lib/openmpi/lib/libopen-pal.so.0(opal_progress+0xa1)
> [0x7f46e066dbf1]
> [compn7:03552] [ 5]
> /home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x5945)
> [0x7f46dd08b945]
> [compn7:03552] [ 6]
> /home/gmaj/lib/openmpi/lib/libmpi.so.0(MPI_Send+0x6a) [0x7f46e0b4f10a]
> [compn7:03552] [ 7] /home/gmaj/matrix/matrix(BI_Ssend+0x21) [0x49cc11]
> [compn7:03552] [ 8] /home/gmaj/matrix/matrix(BI_IdringBR+0x79) [0x49c579]
> [compn7:03552] [ 9] /home/gmaj/matrix/matrix(ilp64_Cdgebr2d+0x221) [0x495bb1]
> [compn7:03552] [10] /home/gmaj/matrix/matrix(Cdgebr2d+0xd0) [0x47ffb0]
> [compn7:03552] [11]
> /home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so(PB_CInV2+0x1304)
> [0x7f46e27f5124]
> [compn7:03552] *** End of error message ***
>
> This error appears during some scalapack computation. My processes do
> some mpi communication before this error appears.
>
> I found out, that by modifying btl_tcp_eager_limit and
> btl_tcp_max_send_size parameters, I can run more processes - the
> smaller those values are, the more processes I can run. Unfortunately,
> by this method I've succeeded to run up to 30 processes, which is
> still far to small.
>
> Some clue may be what valgrind says:
>
> ==3894== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==3894==    at 0x82D009B: writev (in /lib64/libc-2.12.90.so)
> ==3894==    by 0xBA2136D: mca_btl_tcp_frag_send (in
> /home/gmaj/lib/openmpi/lib/openmpi/mca_btl_tcp.so)
> ==3894==    by 0xBA203D0: mca_btl_tcp_endpoint_send (in
> /home/gmaj/lib/openmpi/lib/openmpi/mca_btl_tcp.so)
> ==3894==    by 0xB003583: mca_pml_ob1_send_request_start_rdma (in
> /home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so)
> ==3894==    by 0xAFFA7C9: mca_pml_ob1_send (in
> /home/gmaj/lib/openmpi/lib/openmpi/mca_pml_ob1.so)
> ==3894==    by 0x6D4B109: PMPI_Send (in 
>

Re: [OMPI users] Segmentation fault in mca_pml_ob1.so

2010-12-07 Thread Grzegorz Maj
I recompiled Open MPI with -g, but it didn't solve the problem. Two things
have changed: buf in PMPI_Recv no longer has the value 0, and the
backtrace in gdb shows more functions (e.g. mca_pml_ob1_recv_frag_callback_put
as frame #1).

As you recommended, I will try to walk up the stack, but it's not so easy
for me to follow this code.

This is the backtrace I got with -g:
-
Program received signal SIGSEGV, Segmentation fault.
0x7f1f1a11e4eb in mca_pml_ob1_send_request_put (sendreq=0x1437b00,
btl=0xdae850, hdr=0xeb4870) at pml_ob1_sendreq.c:1231
1231 pml_ob1_sendreq.c: No such file or directory.
 in pml_ob1_sendreq.c
(gdb) bt
#0  0x7f1f1a11e4eb in mca_pml_ob1_send_request_put (sendreq=0x1437b00,
btl=0xdae850, hdr=0xeb4870) at pml_ob1_sendreq.c:1231
#1  0x7f1f1a1124de in mca_pml_ob1_recv_frag_callback_put (btl=0xdae850,
tag=72 'H', des=0x7f1f1ff6bb00, cbdata=0x0) at pml_ob1_recvfrag.c:361
#2  0x7f1f19660e0f in mca_btl_tcp_endpoint_recv_handler (sd=24, flags=2,
user=0xe2ab40) at btl_tcp_endpoint.c:718
#3  0x7f1f1d74aa5b in event_process_active (base=0xd82af0) at
event.c:651
#4  0x7f1f1d74b087 in opal_event_base_loop (base=0xd82af0, flags=2) at
event.c:823
#5  0x7f1f1d74ac76 in opal_event_loop (flags=2) at event.c:730
#6  0x7f1f1d73a360 in opal_progress () at runtime/opal_progress.c:189
#7  0x7f1f1a10c0af in opal_condition_wait (c=0x7f1f1df3a5c0,
m=0x7f1f1df3a620) at ../../../../opal/threads/condition.h:99
#8  0x7f1f1a10bef1 in ompi_request_wait_completion (req=0xe1eb00) at
../../../../ompi/request/request.h:375
#9  0x7f1f1a10bdb5 in mca_pml_ob1_recv (addr=0x7f1f1a083080, count=1,
datatype=0xeb3da0, src=-1, tag=0, comm=0xeb0cd0, status=0xe43f00) at
pml_ob1_irecv.c:104
#10 0x7f1f1dc9e324 in PMPI_Recv (buf=0x7f1f1a083080, count=1,
type=0xeb3da0, source=-1, tag=0, comm=0xeb0cd0, status=0xe43f00) at
precv.c:75
#11 0x0049cc43 in BI_Srecv ()
#12 0x0049c555 in BI_IdringBR ()
#13 0x00495ba1 in ilp64_Cdgebr2d ()
#14 0x0047ffa0 in Cdgebr2d ()
#15 0x7f1f1f99c8e1 in PB_CInV2 () from
/home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so
#16 0x7f1f1f9c489c in PB_CpgemmAB () from
/home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so
#17 0x7f1f1fa748fd in pdgemm_ () from
/home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so
-----

Thanks,
Grzegorz Maj



2010/12/7 Terry Dontje 

>  I am not sure this has anything to do with your problem but if you look at
> the stack entry for PMPI_Recv I noticed the buf has a value of 0.  Shouldn't
> that be an address?
>
> Does your code fail if the MPI library is built with -g?  If it does fail
> the same way, the next step I would do would be to walk up the stack and try
> and figure out where the sendreq address is coming from because supposedly
> it is that address that is not mapped according to the original stack.
>
> --td
>
>
> On 12/07/2010 08:29 AM, Grzegorz Maj wrote:
>
> Some update on this issue. I've attached gdb to the crashing
> application and I got:
>
> -
> Program received signal SIGSEGV, Segmentation fault.
> mca_pml_ob1_send_request_put (sendreq=0x130c480, btl=0xc49850,
> hdr=0xd10e60) at pml_ob1_sendreq.c:1231
> 1231  pml_ob1_sendreq.c: No such file or directory.
>   in pml_ob1_sendreq.c
> (gdb) bt
> #0  mca_pml_ob1_send_request_put (sendreq=0x130c480, btl=0xc49850,
> hdr=0xd10e60) at pml_ob1_sendreq.c:1231
> #1  0x7fc55bf31693 in mca_btl_tcp_endpoint_recv_handler (sd= optimized out>, flags=, user= out>) at btl_tcp_endpoint.c:718
> #2  0x7fc55fff7de4 in event_process_active (base=0xc1daf0,
> flags=2) at event.c:651
> #3  opal_event_base_loop (base=0xc1daf0, flags=2) at event.c:823
> #4  0x7fc55ffe9ff1 in opal_progress () at runtime/opal_progress.c:189
> #5  0x7fc55c9d7115 in opal_condition_wait (addr= out>, count=, datatype=,
> src=, tag=,
> comm=, status=0xcc6100) at
> ../../../../opal/threads/condition.h:99
> #6  ompi_request_wait_completion (addr=,
> count=, datatype=,
> src=, tag=,
> comm=, status=0xcc6100) at
> ../../../../ompi/request/request.h:375
> #7  mca_pml_ob1_recv (addr=, count= optimized out>, datatype=, src= out>, tag=, comm=,
> status=0xcc6100) at pml_ob1_irecv.c:104
> #8  0x7fc560511260 in PMPI_Recv (buf=0x0, count=12884048,
> type=0xd10410, source=-1, tag=0, comm=0xd0daa0, status=0xcc6100) at
> precv.c:75
> #9  0x0049cc43 in BI_Srecv ()
> #10 0x0049c555 in BI_IdringBR ()
> #11 0x00495ba1 in ilp64_Cdgebr2d ()
> #12 0x0047ffa0 in Cdgebr2d ()
> #13 0x7fc5621da8e1 in PB_CInV2 () from
> /home/gmaj/lib/intel_mkl/current/lib/em64t/libmkl_scalapack_ilp64.so
> #14 0x7fc56220289c in PB_CpgemmAB () from
> /home/gmaj/lib/intel_mkl/current/lib/em64t/libmk

[OMPI users] MPI daemon died unexpectedly

2012-03-27 Thread Grzegorz Maj
Hi,
I have an MPI application using ScaLAPACK routines. I'm running it on
Open MPI 1.4.3, using mpirun to launch fewer than 100 processes. I've been
using it quite extensively for almost two years and it almost always works
fine. However, once every 3-4 months I get the following error during the
execution:

--
A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--
--
mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.
--

It says that the daemon died while attempting to launch, but my
application (an MPI grid) had been running for about 14 minutes before it
failed. I can tell that from the log messages my application produces
during execution. There is no more information from mpirun. One more thing
I know is that mpirun's exit status was 1, but I guess that is not very
helpful. There are no core files.

I would appreciate any suggestions on how to debug this issue.

Regards,
Grzegorz Maj


Re: [OMPI users] MPI daemon died unexpectedly

2012-03-27 Thread Grzegorz Maj
John, thank you for your reply.

I checked the system logs and there are no signs of oom killer.

What do you mean by cleaning 'orphan' processes? Should I check whether
any processes are left after each job execution? I have always assumed
that when mpirun terminates, everything is cleaned up. Currently there are
no processes left on the nodes. The failure happened on Friday and since
then tens of similar jobs have completed successfully.

Regards,
Grzegorz Maj

2012/3/27 John Hearns :
> Have you checked the system logs on the machines where this is running?
> Is it perhaps that the processes use lots of memory and the Out Of
> Memory (OOM) killer is killing them?
> Also check all nodes for left-over 'orphan' processes which are still
> running after a job finishes - these should be killed or the node
> rebooted.
>
> On 27/03/2012, Grzegorz Maj  wrote:
>> Hi,
>> I have an MPI application using ScaLAPACK routines. I'm running it on
>> OpenMPI 1.4.3. I'm using mpirun to launch less than 100 processes. I'm
>> using it quite extensively for almost two years and it almost always
>> works fine. However, once every 3-4 months I get the following error
>> during the execution:
>>
>> --
>> A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
>> launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>>
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --
>> --
>> mpirun noticed that the job aborted, but has no info as to the process
>> that caused that situation.
>> --
>> --
>> mpirun was unable to cleanly terminate the daemons on the nodes shown
>> below. Additional manual cleanup may be required - please refer to
>> the "orte-clean" tool for assistance.
>> --
>>
>> It says that the daemon died while attempting to launch, but my
>> application (MPI grid) was running for about 14 minutes before it
>> failed. I can say that based on the log messages I'm producing during
>> the execution of my application. There is no more information from
>> mpirun. One more thing I know is that mpirun exit status was 1, but I
>> guess it is not very helpful. There are no core files.
>>
>> I would appreciate any suggestions on how to debug this issue.
>>
>> Regards,
>> Grzegorz Maj
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



[OMPI users] Using MPI derived datatypes

2012-08-03 Thread Grzegorz Maj
Hi,
I would like my MPI processes to exchange some structured data. The data
is represented by plain structures containing basic datatypes. I would
like to use MPI derived datatypes because of their portability and good
performance.

I would like to be able to send/receive any of my structures in the same
part of the code. In low-level network programming this is usually done by
giving each struct this pattern:
struct S1 {
  int structType;
  ...
}
And then you first read structType and know what bytes to expect next.

Is there a good way to do it using MPI derived datatypes?
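
For a single fixed-size struct I think I know how to build the datatype
itself; this is a minimal sketch of what I mean (the field names are just
an illustration):

#include <mpi.h>
#include <stddef.h>

struct S1 {
    int    structType;
    int    id;
    double value;
};

static MPI_Datatype make_s1_type(void)
{
    int          blocklens[3] = { 1, 1, 1 };
    MPI_Aint     displs[3]    = { offsetof(struct S1, structType),
                                  offsetof(struct S1, id),
                                  offsetof(struct S1, value) };
    MPI_Datatype types[3]     = { MPI_INT, MPI_INT, MPI_DOUBLE };
    MPI_Datatype tmp, s1_type;

    MPI_Type_create_struct(3, blocklens, displs, types, &tmp);
    /* resize to sizeof(struct S1) so that arrays of S1 also work */
    MPI_Type_create_resized(tmp, 0, sizeof(struct S1), &s1_type);
    MPI_Type_free(&tmp);
    MPI_Type_commit(&s1_type);
    return s1_type;
}

The part I'm unsure about is the dispatch when I don't know in advance
which structure will arrive.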

I was thinking of having a separate MPI_Request for each of my structures
and calling multiple MPI_Irecv + MPI_Waitany, roughly as sketched below.
But then, how would I do this for MPI_Bcast?
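
The Irecv/Waitany part would look roughly like this (a simplified sketch;
the second structure S2, its committed datatype and the tag values are
just assumptions for illustration):

#include <mpi.h>

enum { TAG_S1 = 101, TAG_S2 = 102 };

struct S2 { int structType; double payload; };   /* struct S1 as above */

/* s1_type/s2_type are the committed datatypes for S1/S2 */
static int recv_any(MPI_Comm comm, MPI_Datatype s1_type, MPI_Datatype s2_type,
                    struct S1 *s1, struct S2 *s2)
{
    MPI_Request reqs[2];
    MPI_Status  st;
    int which;

    MPI_Irecv(s1, 1, s1_type, MPI_ANY_SOURCE, TAG_S1, comm, &reqs[0]);
    MPI_Irecv(s2, 1, s2_type, MPI_ANY_SOURCE, TAG_S2, comm, &reqs[1]);
    MPI_Waitany(2, reqs, &which, &st);
    /* which == 0: an S1 arrived, which == 1: an S2 arrived;
       the other receive stays posted for the next message of that kind */
    return which;
}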

My second question is about structures of arbitrary size, i.e. the ones
having 'char buf[0]' as the last field, where you allocate memory of size
'sizeof(S) + bufLen' (see the sketch below). Is there a way to convert
such a struct into an MPI derived datatype?
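
To make that concrete, this is the kind of struct and allocation I mean
(the names are just an illustration):

#include <stdlib.h>

struct SBuf {
    int  structType;
    int  bufLen;
    char buf[];              /* 'char buf[0]' with older compilers */
};

struct SBuf *make_sbuf(int bufLen)
{
    struct SBuf *s = malloc(sizeof(struct SBuf) + bufLen);
    s->structType = 2;
    s->bufLen     = bufLen;
    return s;
}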

Thanks for any help,
Regards,
Grzegorz Maj


[OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-04-17 Thread Grzegorz Maj
Hi,
I'd like to dynamically create a group of processes communicating via
MPI. Those processes need to be run without mpirun and create an
intracommunicator after startup. Any ideas how to do this efficiently?
I came up with a solution in which the processes connect one by one using
MPI_Comm_connect, but unfortunately all the processes that are already in
the group need to call MPI_Comm_accept. This means that when the n-th
process wants to connect, I need to collect all the n-1 processes on the
MPI_Comm_accept call. After I run about 40 processes, every subsequent
call takes more and more time, which I'd like to avoid.
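
In code, the pattern I use looks roughly like this (a simplified sketch;
error handling and the way the port name reaches the joining process are
omitted):

#include <mpi.h>

/* called collectively by every process already in the group */
static void accept_one(MPI_Comm group, MPI_Comm *new_group)
{
    char port[MPI_MAX_PORT_NAME] = "";
    MPI_Comm inter;
    int rank;

    MPI_Comm_rank(group, &rank);
    if (rank == 0) {
        MPI_Open_port(MPI_INFO_NULL, port);
        /* ... make 'port' available to the joining process ... */
    }
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, group, &inter);
    MPI_Intercomm_merge(inter, 0, new_group);  /* enlarged intracommunicator */
    MPI_Comm_free(&inter);
    if (rank == 0)
        MPI_Close_port(port);
}

/* called by the n-th process that wants to join */
static void connect_to_group(const char *port, MPI_Comm *new_group)
{
    MPI_Comm inter;

    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    MPI_Intercomm_merge(inter, 1, new_group);
    MPI_Comm_free(&inter);
}
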
Another problem with this solution is that when I try to connect the 66th
process, the root of the existing group segfaults on MPI_Comm_accept.
Maybe it's my bug, but it's weird, as everything works fine for at most
65 processes. Is there any limitation I don't know about?
My last question is about MPI_COMM_WORLD. When I run my processes without
mpirun, their MPI_COMM_WORLD is the same as MPI_COMM_SELF. Is there any
way to change MPI_COMM_WORLD and set it to the intracommunicator that I've
created?

Thanks,
Grzegorz Maj


Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-04-17 Thread Grzegorz Maj
Yes, I know. The problem is that I have to start my processes in a special
way provided by the environment I'm working in, and unfortunately I can't
use mpirun.

2010/4/18 Ralph Castain :
> Guess I don't understand why you can't use mpirun - all it does is start 
> things, provide a means to forward io, etc. It mainly sits there quietly 
> without using any cpu unless required to support the job.
>
> Sounds like it would solve your problem. Otherwise, I know of no way to get 
> all these processes into comm_world.
>
>
> On Apr 17, 2010, at 2:27 PM, Grzegorz Maj wrote:
>
>> Hi,
>> I'd like to dynamically create a group of processes communicating via
>> MPI. Those processes need to be run without mpirun and create
>> intracommunicator after the startup. Any ideas how to do this
>> efficiently?
>> I came up with a solution in which the processes are connecting one by
>> one using MPI_Comm_connect, but unfortunately all the processes that
>> are already in the group need to call MPI_Comm_accept. This means that
>> when the n-th process wants to connect I need to collect all the n-1
>> processes on the MPI_Comm_accept call. After I run about 40 processes
>> every subsequent call takes more and more time, which I'd like to
>> avoid.
>> Another problem in this solution is that when I try to connect 66-th
>> process the root of the existing group segfaults on MPI_Comm_accept.
>> Maybe it's my bug, but it's weird as everything works fine for at most
>> 65 processes. Is there any limitation I don't know about?
>> My last question is about MPI_COMM_WORLD. When I run my processes
>> without mpirun their MPI_COMM_WORLD is the same as MPI_COMM_SELF. Is
>> there any way to change MPI_COMM_WORLD and set it to the
>> intracommunicator that I've created?
>>
>> Thanks,
>> Grzegorz Maj
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>


Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-04-23 Thread Grzegorz Maj
Thank you Ralph for your explanation.
And, apart from that file descriptor issue, is there any other way to
solve my problem, i.e. to separately run a number of processes without
mpirun and then collect them into an MPI intracomm group? If, for example,
I needed to run some 'server process' (even using mpirun) for this task,
that would be OK. Any ideas?

Thanks,
Grzegorz Maj


2010/4/18 Ralph Castain :
> Okay, but here is the problem. If you don't use mpirun, and are not operating 
> in an environment we support for "direct" launch (i.e., starting processes 
> outside of mpirun), then every one of those processes thinks it is a 
> singleton - yes?
>
> What you may not realize is that each singleton immediately fork/exec's an 
> orted daemon that is configured to behave just like mpirun. This is required 
> in order to support MPI-2 operations such as MPI_Comm_spawn, 
> MPI_Comm_connect/accept, etc.
>
> So if you launch 64 processes that think they are singletons, then you have 
> 64 copies of orted running as well. This eats up a lot of file descriptors, 
> which is probably why you are hitting this 65 process limit - your system is 
> probably running out of file descriptors. You might check you system limits 
> and see if you can get them revised upward.
>
>
> On Apr 17, 2010, at 4:24 PM, Grzegorz Maj wrote:
>
>> Yes, I know. The problem is that I need to use some special way for
>> running my processes provided by the environment in which I'm working
>> and unfortunately I can't use mpirun.
>>
>> 2010/4/18 Ralph Castain :
>>> Guess I don't understand why you can't use mpirun - all it does is start 
>>> things, provide a means to forward io, etc. It mainly sits there quietly 
>>> without using any cpu unless required to support the job.
>>>
>>> Sounds like it would solve your problem. Otherwise, I know of no way to get 
>>> all these processes into comm_world.
>>>
>>>
>>> On Apr 17, 2010, at 2:27 PM, Grzegorz Maj wrote:
>>>
>>>> Hi,
>>>> I'd like to dynamically create a group of processes communicating via
>>>> MPI. Those processes need to be run without mpirun and create
>>>> intracommunicator after the startup. Any ideas how to do this
>>>> efficiently?
>>>> I came up with a solution in which the processes are connecting one by
>>>> one using MPI_Comm_connect, but unfortunately all the processes that
>>>> are already in the group need to call MPI_Comm_accept. This means that
>>>> when the n-th process wants to connect I need to collect all the n-1
>>>> processes on the MPI_Comm_accept call. After I run about 40 processes
>>>> every subsequent call takes more and more time, which I'd like to
>>>> avoid.
>>>> Another problem in this solution is that when I try to connect 66-th
>>>> process the root of the existing group segfaults on MPI_Comm_accept.
>>>> Maybe it's my bug, but it's weird as everything works fine for at most
>>>> 65 processes. Is there any limitation I don't know about?
>>>> My last question is about MPI_COMM_WORLD. When I run my processes
>>>> without mpirun their MPI_COMM_WORLD is the same as MPI_COMM_SELF. Is
>>>> there any way to change MPI_COMM_WORLD and set it to the
>>>> intracommunicator that I've created?
>>>>
>>>> Thanks,
>>>> Grzegorz Maj
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>



Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-04-23 Thread Grzegorz Maj
To be more precise: by 'server process' I mean some process that I could
run once on my system and that would help in creating those groups.
My typical scenario is:
1. run N separate processes, each without mpirun
2. connect them into an MPI group
3. do some job
4. exit all N processes
5. goto 1

2010/4/23 Grzegorz Maj :
> Thank you Ralph for your explanation.
> And, apart from that descriptors' issue, is there any other way to
> solve my problem, i.e. to run separately a number of processes,
> without mpirun and then to collect them into an MPI intracomm group?
> If I for example would need to run some 'server process' (even using
> mpirun) for this task, that's OK. Any ideas?
>
> Thanks,
> Grzegorz Maj
>
>
> 2010/4/18 Ralph Castain :
>> Okay, but here is the problem. If you don't use mpirun, and are not 
>> operating in an environment we support for "direct" launch (i.e., starting 
>> processes outside of mpirun), then every one of those processes thinks it is 
>> a singleton - yes?
>>
>> What you may not realize is that each singleton immediately fork/exec's an 
>> orted daemon that is configured to behave just like mpirun. This is required 
>> in order to support MPI-2 operations such as MPI_Comm_spawn, 
>> MPI_Comm_connect/accept, etc.
>>
>> So if you launch 64 processes that think they are singletons, then you have 
>> 64 copies of orted running as well. This eats up a lot of file descriptors, 
>> which is probably why you are hitting this 65 process limit - your system is 
>> probably running out of file descriptors. You might check you system limits 
>> and see if you can get them revised upward.
>>
>>
>> On Apr 17, 2010, at 4:24 PM, Grzegorz Maj wrote:
>>
>>> Yes, I know. The problem is that I need to use some special way for
>>> running my processes provided by the environment in which I'm working
>>> and unfortunately I can't use mpirun.
>>>
>>> 2010/4/18 Ralph Castain :
>>>> Guess I don't understand why you can't use mpirun - all it does is start 
>>>> things, provide a means to forward io, etc. It mainly sits there quietly 
>>>> without using any cpu unless required to support the job.
>>>>
>>>> Sounds like it would solve your problem. Otherwise, I know of no way to 
>>>> get all these processes into comm_world.
>>>>
>>>>
>>>> On Apr 17, 2010, at 2:27 PM, Grzegorz Maj wrote:
>>>>
>>>>> Hi,
>>>>> I'd like to dynamically create a group of processes communicating via
>>>>> MPI. Those processes need to be run without mpirun and create
>>>>> intracommunicator after the startup. Any ideas how to do this
>>>>> efficiently?
>>>>> I came up with a solution in which the processes are connecting one by
>>>>> one using MPI_Comm_connect, but unfortunately all the processes that
>>>>> are already in the group need to call MPI_Comm_accept. This means that
>>>>> when the n-th process wants to connect I need to collect all the n-1
>>>>> processes on the MPI_Comm_accept call. After I run about 40 processes
>>>>> every subsequent call takes more and more time, which I'd like to
>>>>> avoid.
>>>>> Another problem in this solution is that when I try to connect 66-th
>>>>> process the root of the existing group segfaults on MPI_Comm_accept.
>>>>> Maybe it's my bug, but it's weird as everything works fine for at most
>>>>> 65 processes. Is there any limitation I don't know about?
>>>>> My last question is about MPI_COMM_WORLD. When I run my processes
>>>>> without mpirun their MPI_COMM_WORLD is the same as MPI_COMM_SELF. Is
>>>>> there any way to change MPI_COMM_WORLD and set it to the
>>>>> intracommunicator that I've created?
>>>>>
>>>>> Thanks,
>>>>> Grzegorz Maj
>>>>> ___
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>



[OMPI users] Specifying slots in rankfile

2010-06-09 Thread Grzegorz Maj
Hi,
I'd like mpirun to run tasks with specific ranks on specific hosts, but I
don't want to specify any particular sockets/slots/cores.
The following example uses just one host, but generally I'll use more.
In my hostfile I just have:

root@host01 slots=4

I was playing with my rankfile to achieve this, but I always run into
problems.

1) With rankfile like:
rank 0=host01 slot=*
rank 1=host01 slot=*
rank 2=host01 slot=*
rank 3=host01 slot=*

I get:

--
We were unable to successfully process/set the requested processor
affinity settings:

Specified slot list: *
Error: Error

This could mean that a non-existent processor was specified, or
that the specification had improper syntax.
--
--
mpirun was unable to start the specified application as it encountered an error:

Error name: Error
Node: host01

when attempting to start process rank 0.
--
[host01:13715] Rank 0: PAFFINITY cannot get physical processor id for
logical processor 4


I think it tries to find processor #4, but there are only 0-3.

2) With rankfile like:
rank 0=host01 slot=*:*
rank 1=host01 slot=*:*
rank 2=host01 slot=*:*
rank 3=host01 slot=*:*

Everything looks fine, i.e. my programs are spread across 4 processors.
But when I run an MPI program like this:

MPI::Init(argc, argv);
fprintf(stderr, "after init %d\n", MPI::Is_initialized());
nprocs_mpi = MPI::COMM_WORLD.Get_size();
fprintf(stderr, "won't get here\n");

I get:

after init 1
[host01:14348] *** Process received signal ***
[host01:14348] Signal: Segmentation fault (11)
[host01:14348] Signal code: Address not mapped (1)
[host01:14348] Failing at address: 0x8
[host01:14348] [ 0] [0xe410]
[host01:14348] [ 1] p(_ZNK3MPI4Comm8Get_sizeEv+0x19) [0x8051299]
[host01:14348] [ 2] p(main+0x86) [0x804ee4e]
[host01:14348] [ 3] /lib/libc.so.6(__libc_start_main+0xe5) [0x4180b5c5]
[host01:14348] [ 4] p(__gxx_personality_v0+0x125) [0x804ecc1]
[host01:14348] *** End of error message ***

I'm using Open MPI v1.4.2 (downloaded yesterday).
In my rankfile I really want to write something like slot=*. I know
slot=0-3 would be a solution, but when generating the rankfile I may not
know how many processors are available on a particular host.

Any help would be appreciated.

Regards,
Grzegorz Maj


Re: [OMPI users] Specifying slots in rankfile

2010-06-09 Thread Grzegorz Maj
In my previous mail I said that slot=0-3 would be a solution.
Unfortunately it gives me exactly the same segfault as in the case with
*:*.

2010/6/9 Grzegorz Maj :
> Hi,
> I'd like mpirun to run tasks with specific ranks on specific hosts,
> but I don't want to provide any particular sockets/slots/cores.
> The following example uses just one host, but generally I'll use more.
> In my hostfile I just have:
>
> root@host01 slots=4
>
> I was playing with my rankfile to achieve what I've mentioned, but I
> always get some problems.
>
> 1) With rankfile like:
> rank 0=host01 slot=*
> rank 1=host01 slot=*
> rank 2=host01 slot=*
> rank 3=host01 slot=*
>
> I get:
>
> --
> We were unable to successfully process/set the requested processor
> affinity settings:
>
> Specified slot list: *
> Error: Error
>
> This could mean that a non-existent processor was specified, or
> that the specification had improper syntax.
> --
> --
> mpirun was unable to start the specified application as it encountered an 
> error:
>
> Error name: Error
> Node: host01
>
> when attempting to start process rank 0.
> --
> [host01:13715] Rank 0: PAFFINITY cannot get physical processor id for
> logical processor 4
>
>
> I think it tries to find processor #4, bug there are only 0-3
>
> 2) With rankfile like:
> rank 0=host01 slot=*:*
> rank 1=host01 slot=*:*
> rank 2=host01 slot=*:*
> rank 3=host01 slot=*:*
>
> Everything looks well, i.e. my programs are spread across 4 processors.
> But when running MPI program as follows:
>
> MPI::Init(argc, argv);
> fprintf(stderr, "after init %d\n", MPI::Is_initialized());
> nprocs_mpi = MPI::COMM_WORLD.Get_size();
> fprintf(stderr, "won't get here\n");
>
> I get:
>
> after init 1
> [host01:14348] *** Process received signal ***
> [host01:14348] Signal: Segmentation fault (11)
> [host01:14348] Signal code: Address not mapped (1)
> [host01:14348] Failing at address: 0x8
> [host01:14348] [ 0] [0xe410]
> [host01:14348] [ 1] p(_ZNK3MPI4Comm8Get_sizeEv+0x19) [0x8051299]
> [host01:14348] [ 2] p(main+0x86) [0x804ee4e]
> [host01:14348] [ 3] /lib/libc.so.6(__libc_start_main+0xe5) [0x4180b5c5]
> [host01:14348] [ 4] p(__gxx_personality_v0+0x125) [0x804ecc1]
> [host01:14348] *** End of error message ***
>
> I'm using OPEN MPI v. 1.4.2 (downloaded yesterday).
> In my rankfile I really want to write something like slot=*. I know
> slot=0-3 would be a solution, but when generating rankfile I may not
> be sure how many processors are there available on a particular host.
>
> Any help would be appreciated.
>
> Regards,
> Grzegorz Maj
>


Re: [OMPI users] Specifying slots in rankfile

2010-06-09 Thread Grzegorz Maj
Thanks a lot, it works fine for me.
But going back to my problem - is this a bug in Open MPI, or should I be
using the "slot=*" option in some other way?
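
For the record, this is roughly the setup that works for us now (the
hostnames are placeholders): a hostfile listing one host per rank, in
rank order, e.g.

host01
host01
host02
host03

and then:

mpirun -mca rmaps seq -np 4 -hostfile myhosts ./my_app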

2010/6/9 Ralph Castain :
> I would recommend using the sequential mapper instead:
>
> mpirun -mca rmaps seq
>
> You can then just list your hosts in your hostfile, and we will put the ranks 
> sequentially on those hosts. So you get something like this
>
> host01  <= rank0
> host01  <= rank1
> host02  <= rank2
> host03  <= rank3
> host01  <= rank4
>
> Ralph
>
> On Jun 9, 2010, at 4:39 AM, Grzegorz Maj wrote:
>
>> In my previous mail I said that slot=0-3 would be a solution.
>> Unfortunately it gives me exactly the same segfault as in case with
>> *:*
>>
>> 2010/6/9 Grzegorz Maj :
>>> Hi,
>>> I'd like mpirun to run tasks with specific ranks on specific hosts,
>>> but I don't want to provide any particular sockets/slots/cores.
>>> The following example uses just one host, but generally I'll use more.
>>> In my hostfile I just have:
>>>
>>> root@host01 slots=4
>>>
>>> I was playing with my rankfile to achieve what I've mentioned, but I
>>> always get some problems.
>>>
>>> 1) With rankfile like:
>>> rank 0=host01 slot=*
>>> rank 1=host01 slot=*
>>> rank 2=host01 slot=*
>>> rank 3=host01 slot=*
>>>
>>> I get:
>>>
>>> --
>>> We were unable to successfully process/set the requested processor
>>> affinity settings:
>>>
>>> Specified slot list: *
>>> Error: Error
>>>
>>> This could mean that a non-existent processor was specified, or
>>> that the specification had improper syntax.
>>> --
>>> --
>>> mpirun was unable to start the specified application as it encountered an 
>>> error:
>>>
>>> Error name: Error
>>> Node: host01
>>>
>>> when attempting to start process rank 0.
>>> --
>>> [host01:13715] Rank 0: PAFFINITY cannot get physical processor id for
>>> logical processor 4
>>>
>>>
>>> I think it tries to find processor #4, bug there are only 0-3
>>>
>>> 2) With rankfile like:
>>> rank 0=host01 slot=*:*
>>> rank 1=host01 slot=*:*
>>> rank 2=host01 slot=*:*
>>> rank 3=host01 slot=*:*
>>>
>>> Everything looks well, i.e. my programs are spread across 4 processors.
>>> But when running MPI program as follows:
>>>
>>> MPI::Init(argc, argv);
>>> fprintf(stderr, "after init %d\n", MPI::Is_initialized());
>>> nprocs_mpi = MPI::COMM_WORLD.Get_size();
>>> fprintf(stderr, "won't get here\n");
>>>
>>> I get:
>>>
>>> after init 1
>>> [host01:14348] *** Process received signal ***
>>> [host01:14348] Signal: Segmentation fault (11)
>>> [host01:14348] Signal code: Address not mapped (1)
>>> [host01:14348] Failing at address: 0x8
>>> [host01:14348] [ 0] [0xe410]
>>> [host01:14348] [ 1] p(_ZNK3MPI4Comm8Get_sizeEv+0x19) [0x8051299]
>>> [host01:14348] [ 2] p(main+0x86) [0x804ee4e]
>>> [host01:14348] [ 3] /lib/libc.so.6(__libc_start_main+0xe5) [0x4180b5c5]
>>> [host01:14348] [ 4] p(__gxx_personality_v0+0x125) [0x804ecc1]
>>> [host01:14348] *** End of error message ***
>>>
>>> I'm using OPEN MPI v. 1.4.2 (downloaded yesterday).
>>> In my rankfile I really want to write something like slot=*. I know
>>> slot=0-3 would be a solution, but when generating rankfile I may not
>>> be sure how many processors are there available on a particular host.
>>>
>>> Any help would be appreciated.
>>>
>>> Regards,
>>> Grzegorz Maj
>>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>



Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-06 Thread Grzegorz Maj
Hi Ralph,
sorry for the late response, but I couldn't find free time to play with
this. I've finally applied the patch you prepared. I've launched my
processes the way you described and I think it's working as you expected.
None of my processes runs an orted daemon and they can perform MPI
operations. Unfortunately I'm still hitting the 65-process issue :(
Maybe I'm doing something wrong.
I attach my source code. If anybody could have a look at it, I would be
grateful.

When I run that code with clients_count <= 65 everything works fine:
all the processes create a common grid, exchange some information and
disconnect.
When I set clients_count > 65 the 66th process crashes on
MPI_Comm_connect (segmentation fault).

Another thing I would like to know is whether it's normal that when any of
my processes calls MPI_Comm_connect or MPI_Comm_accept and the other side
is not ready, it eats up a full CPU.

Any help would be appreciated,
Grzegorz Maj


2010/4/24 Ralph Castain :
> Actually, OMPI is distributed with a daemon that does pretty much what you
> want. Checkout "man ompi-server". I originally wrote that code to support
> cross-application MPI publish/subscribe operations, but we can utilize it
> here too. Have to blame me for not making it more publicly known.
> The attached patch upgrades ompi-server and modifies the singleton startup
> to provide your desired support. This solution works in the following
> manner:
> 1. launch "ompi-server -report-uri ". This starts a persistent
> daemon called "ompi-server" that acts as a rendezvous point for
> independently started applications.  The problem with starting different
> applications and wanting them to MPI connect/accept lies in the need to have
> the applications find each other. If they can't discover contact info for
> the other app, then they can't wire up their interconnects. The
> "ompi-server" tool provides that rendezvous point. I don't like that
> comm_accept segfaulted - should have just error'd out.
> 2. set OMPI_MCA_orte_server=file:" in the environment where you
> will start your processes. This will allow your singleton processes to find
> the ompi-server. I automatically also set the envar to connect the MPI
> publish/subscribe system for you.
> 3. run your processes. As they think they are singletons, they will detect
> the presence of the above envar and automatically connect themselves to the
> "ompi-server" daemon. This provides each process with the ability to perform
> any MPI-2 operation.
> I tested this on my machines and it worked, so hopefully it will meet your
> needs. You only need to run one "ompi-server" period, so long as you locate
> it where all of the processes can find the contact file and can open a TCP
> socket to the daemon. There is a way to knit multiple ompi-servers into a
> broader network (e.g., to connect processes that cannot directly access a
> server due to network segmentation), but it's a tad tricky - let me know if
> you require it and I'll try to help.
> If you have trouble wiring them all into a single communicator, you might
> ask separately about that and see if one of our MPI experts can provide
> advice (I'm just the RTE grunt).
> HTH - let me know how this works for you and I'll incorporate it into future
> OMPI releases.
> Ralph
>
>
> On Apr 24, 2010, at 1:49 AM, Krzysztof Zarzycki wrote:
>
> Hi Ralph,
> I'm Krzysztof and I'm working with Grzegorz Maj on this our small
> project/experiment.
> We definitely would like to give your patch a try. But could you please
> explain your solution a little more?
> You still would like to start one mpirun per mpi grid, and then have
> processes started by us to join the MPI comm?
> It is a good solution of course.
> But it would be especially preferable to have one daemon running
> persistently on our "entry" machine that can handle several mpi grid starts.
> Can your patch help us this way too?
> Thanks for your help!
> Krzysztof
>
> On 24 April 2010 03:51, Ralph Castain  wrote:
>>
>> In thinking about this, my proposed solution won't entirely fix the
>> problem - you'll still wind up with all those daemons. I believe I can
>> resolve that one as well, but it would require a patch.
>>
>> Would you like me to send you something you could try? Might take a couple
>> of iterations to get it right...
>>
>> On Apr 23, 2010, at 12:12 PM, Ralph Castain wrote:
>>
>> > HmmmI -think- this will work, but I cannot guarantee it:
>> >
>> > 1. launch one process (can just be a spinner) using mpirun that includes
>> >

[OMPI users] MPI_Init failing in singleton

2010-07-07 Thread Grzegorz Maj
Hi,
I was trying to run some MPI processes as singletons. On some of the
machines they crash in MPI_Init. I use exactly the same binaries of my
application and the same installation of Open MPI 1.4.2 on two machines,
and it works on one of them and fails on the other. This is the command
and its output (test is a simple application calling only MPI_Init and
MPI_Finalize):

LD_LIBRARY_PATH=/home/gmaj/openmpi/lib ./test
[host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 161
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_plm_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--
[host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
../../orte/runtime/orte_init.c at line 132
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--
[host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
../../orte/orted/orted_main.c at line 323
[host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
daemon on the local node in file
../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
381
[host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
daemon on the local node in file
../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
143
[host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
daemon on the local node in file ../../orte/runtime/orte_init.c at
line 132
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Unable to start a daemon on the local node (-128)
instead of ORTE_SUCCESS
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Unable to start a daemon on the local node" (-128)
instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[host01:21865] Abort before MPI_INIT completed successfully; not able
to guarantee that all other processes were killed!


Any ideas on this?

Thanks,
Grzegorz Maj


Re: [OMPI users] MPI_Init failing in singleton

2010-07-07 Thread Grzegorz Maj
The problem was that orted could find neither ssh nor rsh on that machine.
I've added my installation to PATH and it now works.
One question, though: I will definitely not use MPI_Comm_spawn or any
related functionality. Do I need ssh at all? If not, is there any way to
tell orted that it shouldn't look for ssh, because it won't need it?

Regards,
Grzegorz Maj

2010/7/7 Ralph Castain :
> Check your path and ld_library_path- looks like you are picking up some stale 
> binary for orted and/or stale libraries (perhaps getting the default OMPI 
> instead of 1.4.2) on the machine where it fails.
>
> On Jul 7, 2010, at 7:44 AM, Grzegorz Maj wrote:
>
>> Hi,
>> I was trying to run some MPI processes as a singletons. On some of the
>> machines they crash on MPI_Init. I use exactly the same binaries of my
>> application and the same installation of openmpi 1.4.2 on two machines
>> and it works on one of them and fails on the other one. This is the
>> command and its output (test is a simple application calling only
>> MPI_Init and MPI_Finalize):
>>
>> LD_LIBRARY_PATH=/home/gmaj/openmpi/lib ./test
>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>> ../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 161
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>  orte_plm_base_select failed
>>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
>> --
>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>> ../../orte/runtime/orte_init.c at line 132
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>  orte_ess_set_name failed
>>  --> Returned value Not found (-13) instead of ORTE_SUCCESS
>> --
>> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
>> ../../orte/orted/orted_main.c at line 323
>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>> daemon on the local node in file
>> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
>> 381
>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>> daemon on the local node in file
>> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
>> 143
>> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
>> daemon on the local node in file ../../orte/runtime/orte_init.c at
>> line 132
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>  orte_ess_set_name failed
>>  --> Returned value Unable to start a daemon on the local node (-128)
>> instead of ORTE_SUCCESS
>> --
>> --
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or environment
>> problems.  This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>>
>>  ompi_mpi_init: orte_init failed
>>  --> Returned "Unable to start a daemon on the local node&

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-07 Thread Grzegorz Maj
2010/7/7 Ralph Castain :
>
> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>
>> Hi Ralph,
>> sorry for the late response, but I couldn't find free time to play
>> with this. Finally I've applied the patch you prepared. I've launched
>> my processes in the way you've described and I think it's working as
>> you expected. None of my processes runs the orted daemon and they can
>> perform MPI operations. Unfortunately I'm still hitting the 65
>> processes issue :(
>> Maybe I'm doing something wrong.
>> I attach my source code. If anybody could have a look on this, I would
>> be grateful.
>>
>> When I run that code with clients_count <= 65 everything works fine:
>> all the processes create a common grid, exchange some information and
>> disconnect.
>> When I set clients_count > 65 the 66th process crashes on
>> MPI_Comm_connect (segmentation fault).
>
> I didn't have time to check the code, but my guess is that you are still 
> hitting some kind of file descriptor or other limit. Check to see what your 
> limits are - usually "ulimit" will tell you.

My limitations are:
time(seconds)        unlimited
file(blocks)         unlimited
data(kb)             unlimited
stack(kb)            10240
coredump(blocks)     0
memory(kb)           unlimited
locked memory(kb)    64
process              200704
nofiles              1024
vmemory(kb)          unlimited
locks                unlimited

Which one do you think could be responsible for that?

I tried running all 66 processes on one machine, and also spreading them
across several machines; it always crashes the same way on the 66th
process.
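
In case it helps, here is a minimal sketch (plain POSIX getrlimit, not
part of my actual code) showing how each process could double-check the
limits it really inherits, since they are not always the same as what an
interactive shell reports:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit nofile, stack;

    /* RLIMIT_NOFILE is the per-process descriptor limit ("nofiles"),
       RLIMIT_STACK the stack size; RLIM_INFINITY prints as -1 here. */
    getrlimit(RLIMIT_NOFILE, &nofile);
    getrlimit(RLIMIT_STACK, &stack);

    printf("nofiles: soft=%ld hard=%ld\n",
           (long)nofile.rlim_cur, (long)nofile.rlim_max);
    printf("stack:   soft=%ld hard=%ld\n",
           (long)stack.rlim_cur, (long)stack.rlim_max);
    return 0;
}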

>
>>
>> Another thing I would like to know is if it's normal that any of my
>> processes when calling MPI_Comm_connect or MPI_Comm_accept when the
>> other side is not ready, is eating up a full CPU available.
>
> Yes - the waiting process is polling in a tight loop waiting for the 
> connection to be made.
>
>>
>> Any help would be appreciated,
>> Grzegorz Maj
>>
>>
>> 2010/4/24 Ralph Castain :
>>> Actually, OMPI is distributed with a daemon that does pretty much what you
>>> want. Checkout "man ompi-server". I originally wrote that code to support
>>> cross-application MPI publish/subscribe operations, but we can utilize it
>>> here too. Have to blame me for not making it more publicly known.
>>> The attached patch upgrades ompi-server and modifies the singleton startup
>>> to provide your desired support. This solution works in the following
>>> manner:
>>> 1. launch "ompi-server -report-uri ". This starts a persistent
>>> daemon called "ompi-server" that acts as a rendezvous point for
>>> independently started applications.  The problem with starting different
>>> applications and wanting them to MPI connect/accept lies in the need to have
>>> the applications find each other. If they can't discover contact info for
>>> the other app, then they can't wire up their interconnects. The
>>> "ompi-server" tool provides that rendezvous point. I don't like that
>>> comm_accept segfaulted - should have just error'd out.
>>> 2. set OMPI_MCA_orte_server=file:"<filename>" in the environment where you
>>> will start your processes. This will allow your singleton processes to find
>>> the ompi-server. I automatically also set the envar to connect the MPI
>>> publish/subscribe system for you.
>>> 3. run your processes. As they think they are singletons, they will detect
>>> the presence of the above envar and automatically connect themselves to the
>>> "ompi-server" daemon. This provides each process with the ability to perform
>>> any MPI-2 operation.
>>> I tested this on my machines and it worked, so hopefully it will meet your
>>> needs. You only need to run one "ompi-server" period, so long as you locate
>>> it where all of the processes can find the contact file and can open a TCP
>>> socket to the daemon. There is a way to knit multiple ompi-servers into a
>>> broader network (e.g., to connect processes that cannot directly access a
>>> server due to network segmentation), but it's a tad tricky - let me know if
>>> you require it and I'll try to help.
>>> If you have trouble wiring them all into a single communicator, you might
>>> ask separately about that and see if one of our MPI experts can provide
>>> advice (I'm just the RTE grunt).
>>> HTH - let me know how this works for you and I'
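
For reference, below is a minimal sketch of the publish/lookup/connect
pattern that the quoted steps enable. The service name "grid_service" and
the program layout are only illustrative - this is not the code I attached
earlier in the thread:

#include <stdio.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm intercomm;

    /* Each process is started as a singleton; it finds ompi-server
       through the OMPI_MCA_orte_server environment variable. */
    MPI_Init(&argc, &argv);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        MPI_Open_port(MPI_INFO_NULL, port);
        /* register the port under a well-known name with ompi-server */
        MPI_Publish_name("grid_service", MPI_INFO_NULL, port);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
        MPI_Unpublish_name("grid_service", MPI_INFO_NULL, port);
        MPI_Close_port(port);
    } else {
        /* look the port up through the same ompi-server and connect */
        MPI_Lookup_name("grid_service", MPI_INFO_NULL, port);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);
    }

    /* ... MPI_Intercomm_merge, communication, etc. ... */

    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}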

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-12 Thread Grzegorz Maj
1024 is not the problem: changing it to 2048 hasn't changed anything.
Following your advice I've run my process under gdb. Unfortunately I
didn't get anything more than:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xf7e4c6c0 (LWP 20246)]
0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0

(gdb) bt
#0  0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
#1  0xf7e3ba95 in connect_accept () from
/home/gmaj/openmpi/lib/openmpi/mca_dpm_orte.so
#2  0xf7f62013 in PMPI_Comm_connect () from /home/gmaj/openmpi/lib/libmpi.so.0
#3  0x080489ed in main (argc=825832753, argv=0x34393638) at client.c:43

What's more: when I added a breakpoint on ompi_comm_set in the 66th
process and stepped through a couple of instructions, one of the other
processes crashed (as usual, on ompi_comm_set) before the 66th did.

Finally I decided to recompile openmpi using the -g flag for gcc. In this
case the 66-process issue is gone! I ran my applications exactly the same
way as before (without even recompiling them) and successfully ran over
130 processes.
When I switch back to the openmpi build without -g, it segfaults again.

Any ideas? I'm really confused.



2010/7/7 Ralph Castain :
> I would guess the #files limit of 1024. However, if it behaves the same way 
> when spread across multiple machines, I would suspect it is somewhere in your 
> program itself. Given that the segfault is in your process, can you use gdb 
> to look at the core file and see where and why it fails?
>
> On Jul 7, 2010, at 10:17 AM, Grzegorz Maj wrote:
>
>> 2010/7/7 Ralph Castain :
>>>
>>> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>>>
>>>> Hi Ralph,
>>>> sorry for the late response, but I couldn't find free time to play
>>>> with this. Finally I've applied the patch you prepared. I've launched
>>>> my processes in the way you've described and I think it's working as
>>>> you expected. None of my processes runs the orted daemon and they can
>>>> perform MPI operations. Unfortunately I'm still hitting the 65
>>>> processes issue :(
>>>> Maybe I'm doing something wrong.
>>>> I attach my source code. If anybody could have a look on this, I would
>>>> be grateful.
>>>>
>>>> When I run that code with clients_count <= 65 everything works fine:
>>>> all the processes create a common grid, exchange some information and
>>>> disconnect.
>>>> When I set clients_count > 65 the 66th process crashes on
>>>> MPI_Comm_connect (segmentation fault).
>>>
>>> I didn't have time to check the code, but my guess is that you are still 
>>> hitting some kind of file descriptor or other limit. Check to see what your 
>>> limits are - usually "ulimit" will tell you.
>>
>> My limitations are:
>> time(seconds)        unlimited
>> file(blocks)         unlimited
>> data(kb)             unlimited
>> stack(kb)            10240
>> coredump(blocks)     0
>> memory(kb)           unlimited
>> locked memory(kb)    64
>> process              200704
>> nofiles              1024
>> vmemory(kb)          unlimited
>> locks                unlimited
>>
>> Which one do you think could be responsible for that?
>>
>> I was trying to run all the 66 processes on one machine or spread them
>> across several machines and it always crashes the same way on the 66th
>> process.
>>
>>>
>>>>
>>>> Another thing I would like to know is if it's normal that any of my
>>>> processes when calling MPI_Comm_connect or MPI_Comm_accept when the
>>>> other side is not ready, is eating up a full CPU available.
>>>
>>> Yes - the waiting process is polling in a tight loop waiting for the 
>>> connection to be made.
>>>
>>>>
>>>> Any help would be appreciated,
>>>> Grzegorz Maj
>>>>
>>>>
>>>> 2010/4/24 Ralph Castain :
>>>>> Actually, OMPI is distributed with a daemon that does pretty much what you
>>>>> want. Checkout "man ompi-server". I originally wrote that code to support
>>>>> cross-application MPI publish/subscribe operations, but we can utilize it
>>>>> here too. Have to blame me for not making it more publicly known.
>>>>> The attached patch upgrades ompi-server and modifies the singleton startup
>>>>> to provide your desired support. This solution works in the following
>>>>&g

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-12 Thread Grzegorz Maj
2010/7/12 Ralph Castain :
> Dug around a bit and found the problem!!
>
> I have no idea who or why this was done, but somebody set a limit of 64 
> separate jobids in the dynamic init called by ompi_comm_set, which builds the 
> intercommunicator. Unfortunately, they hard-wired the array size, but never 
> check that size before adding to it.
>
> So after 64 calls to connect_accept, you are overwriting other areas of the 
> code. As you found, hitting 66 causes it to segfault.
>
> I'll fix this on the developer's trunk (I'll also add that original patch to 
> it). Rather than my searching this thread in detail, can you remind me what 
> version you are using so I can patch it too?

I'm using 1.4.2.
Thanks a lot, I'm looking forward to the patch.
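
Just to make sure I understand the failure mode you describe - as an
illustration of that class of bug only (hypothetical names, not the
actual Open MPI code):

/* A hard-wired table that is never bounds-checked silently overwrites
   neighbouring memory once more entries are added than it was sized for. */
#define MAX_JOBIDS 64

static int jobids[MAX_JOBIDS];
static int num_jobids = 0;

void add_jobid_unchecked(int jobid)
{
    jobids[num_jobids++] = jobid;      /* entries 65, 66, ... corrupt other data */
}

void add_jobid_checked(int jobid)
{
    if (num_jobids < MAX_JOBIDS)       /* the missing size check */
        jobids[num_jobids++] = jobid;
}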

>
> Thanks for your patience with this!
> Ralph
>
>
> On Jul 12, 2010, at 7:20 AM, Grzegorz Maj wrote:
>
>> 1024 is not the problem: changing it to 2048 hasn't change anything.
>> Following your advice I've run my process using gdb. Unfortunately I
>> didn't get anything more than:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0xf7e4c6c0 (LWP 20246)]
>> 0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>
>> (gdb) bt
>> #0  0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>> #1  0xf7e3ba95 in connect_accept () from
>> /home/gmaj/openmpi/lib/openmpi/mca_dpm_orte.so
>> #2  0xf7f62013 in PMPI_Comm_connect () from 
>> /home/gmaj/openmpi/lib/libmpi.so.0
>> #3  0x080489ed in main (argc=825832753, argv=0x34393638) at client.c:43
>>
>> What's more: when I've added a breakpoint on ompi_comm_set in 66th
>> process and stepped a couple of instructions, one of the other
>> processes crashed (as usualy on ompi_comm_set) earlier than 66th did.
>>
>> Finally I decided to recompile openmpi using -g flag for gcc. In this
>> case the 66 processes issue has gone! I was running my applications
>> exactly the same way as previously (even without recompilation) and
>> I've run successfully over 130 processes.
>> When switching back to the openmpi compilation without -g it again segfaults.
>>
>> Any ideas? I'm really confused.
>>
>>
>>
>> 2010/7/7 Ralph Castain :
>>> I would guess the #files limit of 1024. However, if it behaves the same way 
>>> when spread across multiple machines, I would suspect it is somewhere in 
>>> your program itself. Given that the segfault is in your process, can you 
>>> use gdb to look at the core file and see where and why it fails?
>>>
>>> On Jul 7, 2010, at 10:17 AM, Grzegorz Maj wrote:
>>>
>>>> 2010/7/7 Ralph Castain :
>>>>>
>>>>> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>>>>>
>>>>>> Hi Ralph,
>>>>>> sorry for the late response, but I couldn't find free time to play
>>>>>> with this. Finally I've applied the patch you prepared. I've launched
>>>>>> my processes in the way you've described and I think it's working as
>>>>>> you expected. None of my processes runs the orted daemon and they can
>>>>>> perform MPI operations. Unfortunately I'm still hitting the 65
>>>>>> processes issue :(
>>>>>> Maybe I'm doing something wrong.
>>>>>> I attach my source code. If anybody could have a look on this, I would
>>>>>> be grateful.
>>>>>>
>>>>>> When I run that code with clients_count <= 65 everything works fine:
>>>>>> all the processes create a common grid, exchange some information and
>>>>>> disconnect.
>>>>>> When I set clients_count > 65 the 66th process crashes on
>>>>>> MPI_Comm_connect (segmentation fault).
>>>>>
>>>>> I didn't have time to check the code, but my guess is that you are still 
>>>>> hitting some kind of file descriptor or other limit. Check to see what 
>>>>> your limits are - usually "ulimit" will tell you.
>>>>
>>>> My limitations are:
>>>> time(seconds)        unlimited
>>>> file(blocks)         unlimited
>>>> data(kb)             unlimited
>>>> stack(kb)            10240
>>>> coredump(blocks)     0
>>>> memory(kb)           unlimited
>>>> locked memory(kb)    64
>>>> process              200704
>>>> 

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-13 Thread Grzegorz Maj
Bad news..
I've tried the latest patch, both with and without the prior one, but it
hasn't changed anything. I've also tried using the old code with the
OMPI_DPM_BASE_MAXJOBIDS constant changed to 80, but that didn't help
either.
While looking through the sources of openmpi-1.4.2 I couldn't find any
call to the function ompi_dpm_base_mark_dyncomm.


2010/7/12 Ralph Castain :
> Just so you don't have to wait for 1.4.3 release, here is the patch (doesn't 
> include the prior patch).
>
>
>
>
> On Jul 12, 2010, at 12:13 PM, Grzegorz Maj wrote:
>
>> 2010/7/12 Ralph Castain :
>>> Dug around a bit and found the problem!!
>>>
>>> I have no idea who or why this was done, but somebody set a limit of 64 
>>> separate jobids in the dynamic init called by ompi_comm_set, which builds 
>>> the intercommunicator. Unfortunately, they hard-wired the array size, but 
>>> never check that size before adding to it.
>>>
>>> So after 64 calls to connect_accept, you are overwriting other areas of the 
>>> code. As you found, hitting 66 causes it to segfault.
>>>
>>> I'll fix this on the developer's trunk (I'll also add that original patch 
>>> to it). Rather than my searching this thread in detail, can you remind me 
>>> what version you are using so I can patch it too?
>>
>> I'm using 1.4.2
>> Thanks a lot and I'm looking forward for the patch.
>>
>>>
>>> Thanks for your patience with this!
>>> Ralph
>>>
>>>
>>> On Jul 12, 2010, at 7:20 AM, Grzegorz Maj wrote:
>>>
>>>> 1024 is not the problem: changing it to 2048 hasn't change anything.
>>>> Following your advice I've run my process using gdb. Unfortunately I
>>>> didn't get anything more than:
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> [Switching to Thread 0xf7e4c6c0 (LWP 20246)]
>>>> 0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>>>
>>>> (gdb) bt
>>>> #0  0xf7f39905 in ompi_comm_set () from /home/gmaj/openmpi/lib/libmpi.so.0
>>>> #1  0xf7e3ba95 in connect_accept () from
>>>> /home/gmaj/openmpi/lib/openmpi/mca_dpm_orte.so
>>>> #2  0xf7f62013 in PMPI_Comm_connect () from 
>>>> /home/gmaj/openmpi/lib/libmpi.so.0
>>>> #3  0x080489ed in main (argc=825832753, argv=0x34393638) at client.c:43
>>>>
>>>> What's more: when I've added a breakpoint on ompi_comm_set in 66th
>>>> process and stepped a couple of instructions, one of the other
>>>> processes crashed (as usualy on ompi_comm_set) earlier than 66th did.
>>>>
>>>> Finally I decided to recompile openmpi using -g flag for gcc. In this
>>>> case the 66 processes issue has gone! I was running my applications
>>>> exactly the same way as previously (even without recompilation) and
>>>> I've run successfully over 130 processes.
>>>> When switching back to the openmpi compilation without -g it again 
>>>> segfaults.
>>>>
>>>> Any ideas? I'm really confused.
>>>>
>>>>
>>>>
>>>> 2010/7/7 Ralph Castain :
>>>>> I would guess the #files limit of 1024. However, if it behaves the same 
>>>>> way when spread across multiple machines, I would suspect it is somewhere 
>>>>> in your program itself. Given that the segfault is in your process, can 
>>>>> you use gdb to look at the core file and see where and why it fails?
>>>>>
>>>>> On Jul 7, 2010, at 10:17 AM, Grzegorz Maj wrote:
>>>>>
>>>>>> 2010/7/7 Ralph Castain :
>>>>>>>
>>>>>>> On Jul 6, 2010, at 8:48 AM, Grzegorz Maj wrote:
>>>>>>>
>>>>>>>> Hi Ralph,
>>>>>>>> sorry for the late response, but I couldn't find free time to play
>>>>>>>> with this. Finally I've applied the patch you prepared. I've launched
>>>>>>>> my processes in the way you've described and I think it's working as
>>>>>>>> you expected. None of my processes runs the orted daemon and they can
>>>>>>>> perform MPI operations. Unfortunately I'm still hitting the 65
>>>>>>>> processes issue :(
>>>>>>>> Maybe I'm doing something wrong.
>>>>>>>> I attach m

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-20 Thread Grzegorz Maj
My start script looks almost exactly the same as the one published by
Edgar, i.e. the processes are started one by one with no delay.

2010/7/20 Ralph Castain :
> Grzegorz: something occurred to me. When you start all these processes, how 
> are you staggering their wireup? Are they flooding us, or are you 
> time-shifting them a little?
>
>
> On Jul 19, 2010, at 10:32 AM, Edgar Gabriel wrote:
>
>> Hm, so I am not sure how to approach this. First of all, the test case
>> works for me. I used up to 80 clients, and for both optimized and
>> non-optimized compilation. I ran the tests with trunk (not with 1.4
>> series, but the communicator code is identical in both cases). Clearly,
>> the patch from Ralph is necessary to make it work.
>>
>> Additionally, I went through the communicator creation code for dynamic
>> communicators trying to find spots that could create problems. The only
>> place that I found the number 64 appear is the fortran-to-c mapping
>> arrays (e.g. for communicators), where the initial size of the table is
>> 64. I looked twice over the pointer-array code to see whether we could
>> have a problem their (since it is a key-piece of the cid allocation code
>> for communicators), but I am fairly confident that it is correct.
>>
>> Note that we have other (non-dynamic) tests, where comm_set is called
>> 100,000 times, and the code per se does not seem to have a problem due
>> to being called too often. So I am not sure what else to look at.
>>
>> Edgar
>>
>>
>>
>> On 7/13/2010 8:42 PM, Ralph Castain wrote:
>>> As far as I can tell, it appears the problem is somewhere in our 
>>> communicator setup. The people knowledgeable on that area are going to look 
>>> into it later this week.
>>>
>>> I'm creating a ticket to track the problem and will copy you on it.
>>>
>>>
>>> On Jul 13, 2010, at 6:57 AM, Ralph Castain wrote:
>>>
>>>>
>>>> On Jul 13, 2010, at 3:36 AM, Grzegorz Maj wrote:
>>>>
>>>>> Bad news..
>>>>> I've tried the latest patch with and without the prior one, but it
>>>>> hasn't changed anything. I've also tried using the old code but with
>>>>> the OMPI_DPM_BASE_MAXJOBIDS constant changed to 80, but it also didn't
>>>>> help.
>>>>> While looking through the sources of openmpi-1.4.2 I couldn't find any
>>>>> call of the function ompi_dpm_base_mark_dyncomm.
>>>>
>>>> It isn't directly called - it shows in ompi_comm_set as 
>>>> ompi_dpm.mark_dyncomm. You were definitely overrunning that array, but I 
>>>> guess something else is also being hit. Have to look further...
>>>>
>>>>
>>>>>
>>>>>
>>>>> 2010/7/12 Ralph Castain :
>>>>>> Just so you don't have to wait for 1.4.3 release, here is the patch 
>>>>>> (doesn't include the prior patch).
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Jul 12, 2010, at 12:13 PM, Grzegorz Maj wrote:
>>>>>>
>>>>>>> 2010/7/12 Ralph Castain :
>>>>>>>> Dug around a bit and found the problem!!
>>>>>>>>
>>>>>>>> I have no idea who or why this was done, but somebody set a limit of 
>>>>>>>> 64 separate jobids in the dynamic init called by ompi_comm_set, which 
>>>>>>>> builds the intercommunicator. Unfortunately, they hard-wired the array 
>>>>>>>> size, but never check that size before adding to it.
>>>>>>>>
>>>>>>>> So after 64 calls to connect_accept, you are overwriting other areas 
>>>>>>>> of the code. As you found, hitting 66 causes it to segfault.
>>>>>>>>
>>>>>>>> I'll fix this on the developer's trunk (I'll also add that original 
>>>>>>>> patch to it). Rather than my searching this thread in detail, can you 
>>>>>>>> remind me what version you are using so I can patch it too?
>>>>>>>
>>>>>>> I'm using 1.4.2
>>>>>>> Thanks a lot and I'm looking forward for the patch.
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for your patience with this!
>>>>>>>> Ralph
>>>>>>>>
>>&

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-26 Thread Grzegorz Maj
Hi,
I'm very sorry, but the problem was on my side. My installation
process was not always picking up the newest openmpi sources, so in this
case it hadn't installed the version with the latest patch. Now I
think everything works fine - I could run over 130 processes with no
problems.
I'm sorry again for wasting your time, and thank you for the patch.

2010/7/21 Ralph Castain :
> We're having some problem replicating this once my patches are applied. Can 
> you send us your configure cmd? Just the output from "head config.log" will 
> do for now.
>
> Thanks!
>
> On Jul 20, 2010, at 9:09 AM, Grzegorz Maj wrote:
>
>> My start script looks almost exactly the same as the one published by
>> Edgar, ie. the processes are starting one by one with no delay.
>>
>> 2010/7/20 Ralph Castain :
>>> Grzegorz: something occurred to me. When you start all these processes, how 
>>> are you staggering their wireup? Are they flooding us, or are you 
>>> time-shifting them a little?
>>>
>>>
>>> On Jul 19, 2010, at 10:32 AM, Edgar Gabriel wrote:
>>>
>>>> Hm, so I am not sure how to approach this. First of all, the test case
>>>> works for me. I used up to 80 clients, and for both optimized and
>>>> non-optimized compilation. I ran the tests with trunk (not with 1.4
>>>> series, but the communicator code is identical in both cases). Clearly,
>>>> the patch from Ralph is necessary to make it work.
>>>>
>>>> Additionally, I went through the communicator creation code for dynamic
>>>> communicators trying to find spots that could create problems. The only
>>>> place that I found the number 64 appear is the fortran-to-c mapping
>>>> arrays (e.g. for communicators), where the initial size of the table is
>>>> 64. I looked twice over the pointer-array code to see whether we could
>>>> have a problem there (since it is a key piece of the cid allocation code
>>>> for communicators), but I am fairly confident that it is correct.
>>>>
>>>> Note that we have other (non-dynamic) tests, where comm_set is called
>>>> 100,000 times, and the code per se does not seem to have a problem due
>>>> to being called too often. So I am not sure what else to look at.
>>>>
>>>> Edgar
>>>>
>>>>
>>>>
>>>> On 7/13/2010 8:42 PM, Ralph Castain wrote:
>>>>> As far as I can tell, it appears the problem is somewhere in our 
>>>>> communicator setup. The people knowledgeable on that area are going to 
>>>>> look into it later this week.
>>>>>
>>>>> I'm creating a ticket to track the problem and will copy you on it.
>>>>>
>>>>>
>>>>> On Jul 13, 2010, at 6:57 AM, Ralph Castain wrote:
>>>>>
>>>>>>
>>>>>> On Jul 13, 2010, at 3:36 AM, Grzegorz Maj wrote:
>>>>>>
>>>>>>> Bad news..
>>>>>>> I've tried the latest patch with and without the prior one, but it
>>>>>>> hasn't changed anything. I've also tried using the old code but with
>>>>>>> the OMPI_DPM_BASE_MAXJOBIDS constant changed to 80, but it also didn't
>>>>>>> help.
>>>>>>> While looking through the sources of openmpi-1.4.2 I couldn't find any
>>>>>>> call of the function ompi_dpm_base_mark_dyncomm.
>>>>>>
>>>>>> It isn't directly called - it shows in ompi_comm_set as 
>>>>>> ompi_dpm.mark_dyncomm. You were definitely overrunning that array, but I 
>>>>>> guess something else is also being hit. Have to look further...
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2010/7/12 Ralph Castain :
>>>>>>>> Just so you don't have to wait for 1.4.3 release, here is the patch 
>>>>>>>> (doesn't include the prior patch).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jul 12, 2010, at 12:13 PM, Grzegorz Maj wrote:
>>>>>>>>
>>>>>>>>> 2010/7/12 Ralph Castain :
>>>>>>>>>> Dug around a bit and found the problem!!
>>>>>>>>>>
>>>>>>>>>> I have no idea who or why this was d

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-27 Thread Grzegorz Maj
So now I have a new question.
When I run my server and a lot of clients on the same machine,
everything looks fine.

But when I try to run the clients on several machines, the most
frequent scenario is:
* the server is started on machine A
* X (= 1, 4, 10, ...) clients are started on machine B and they connect
successfully
* the first client started on machine C connects successfully to the
server, but the whole grid hangs in MPI_Intercomm_merge (all the
processes from the intercommunicator get there).

As I said, this is the most frequent scenario. Sometimes I can connect the
clients from several machines. Sometimes it hangs (always in
MPI_Intercomm_merge) when connecting the clients from machine B.
The interesting thing is that if, before MPI_Intercomm_merge, I send a
dummy message on the intercommunicator from rank 0 in one group to
rank 0 in the other one, it does not hang in MPI_Intercomm_merge.
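
To make that workaround concrete, here is a sketch of the sequence on the
connecting side (the port handling and names are illustrative, not my
actual code):

#include <mpi.h>

/* Connect to the server as a singleton, exchange a dummy message on the
   intercommunicator, then merge. Without the dummy exchange the merge
   sometimes hangs, as described above. */
static MPI_Comm connect_and_merge(char *port)
{
    MPI_Comm intercomm, merged;
    int dummy = 0;

    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm);

    /* rank 0 of the local group sends to rank 0 of the remote group,
       which has to post a matching MPI_Recv on the intercommunicator */
    MPI_Send(&dummy, 1, MPI_INT, 0, 0, intercomm);

    /* high = 1 orders the newcomer after the existing processes in the
       merged intracommunicator */
    MPI_Intercomm_merge(intercomm, 1, &merged);

    MPI_Comm_disconnect(&intercomm);
    return merged;
}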

I've tried both versions with and without the first patch (ompi-server
as orted) but it doesn't change the behavior.

I've attached gdb to my server, this is bt:
#0  0xe410 in __kernel_vsyscall ()
#1  0x00637afc in sched_yield () from /lib/libc.so.6
#2  0xf7e8ce31 in opal_progress () at ../../opal/runtime/opal_progress.c:220
#3  0xf7f60ad4 in opal_condition_wait (c=0xf7fd7dc0, m=0xf7fd7e00) at
../../opal/threads/condition.h:99
#4  0xf7f60dee in ompi_request_default_wait_all (count=2,
requests=0xff8d7754, statuses=0x0) at
../../ompi/request/req_wait.c:262
#5  0xf7d3e221 in mca_coll_inter_allgatherv_inter (sbuf=0xff8d7824,
scount=1, sdtype=0x8049200, rbuf=0xff8d77e0, rcounts=0x9783df8,
disps=0x9755520, rdtype=0x8049200, comm=0x978c2a8, module=0x9794b08)
at ../../../../../ompi/mca/coll/inter/coll_inter_allgatherv.c:127
#6  0xf7f4c615 in ompi_comm_determine_first (intercomm=0x978c2a8,
high=0) at ../../ompi/communicator/comm.c:1199
#7  0xf7f8d1d9 in PMPI_Intercomm_merge (intercomm=0x978c2a8, high=0,
newcomm=0xff8d78c0) at pintercomm_merge.c:84
#8  0x0804893c in main (argc=Cannot access memory at address 0xf
) at server.c:50

And this is bt from one of the clients:
#0  0xe410 in __kernel_vsyscall ()
#1  0x0064993b in poll () from /lib/libc.so.6
#2  0xf7de027f in poll_dispatch (base=0x8643fb8, arg=0x86442d8,
tv=0xff82299c) at ../../../opal/event/poll.c:168
#3  0xf7dde4b2 in opal_event_base_loop (base=0x8643fb8, flags=2) at
../../../opal/event/event.c:807
#4  0xf7dde34f in opal_event_loop (flags=2) at ../../../opal/event/event.c:730
#5  0xf7dcfc77 in opal_progress () at ../../opal/runtime/opal_progress.c:189
#6  0xf7ea80b8 in opal_condition_wait (c=0xf7f25160, m=0xf7f251a0) at
../../opal/threads/condition.h:99
#7  0xf7ea7ff3 in ompi_request_wait_completion (req=0x8686680) at
../../ompi/request/request.h:375
#8  0xf7ea7ef1 in ompi_request_default_wait (req_ptr=0xff822ae8,
status=0x0) at ../../ompi/request/req_wait.c:37
#9  0xf7c663a6 in ompi_coll_tuned_bcast_intra_generic
(buffer=0xff822d20, original_count=1, datatype=0x868bd00, root=0,
comm=0x86aa7f8, module=0x868b700, count_by_segment=1, tree=0x868b3d8)
at ../../../../../ompi/mca/coll/tuned/coll_tuned_bcast.c:237
#10 0xf7c668ea in ompi_coll_tuned_bcast_intra_binomial
(buffer=0xff822d20, count=1, datatype=0x868bd00, root=0,
comm=0x86aa7f8, module=0x868b700, segsize=0)
at ../../../../../ompi/mca/coll/tuned/coll_tuned_bcast.c:368
#11 0xf7c5af12 in ompi_coll_tuned_bcast_intra_dec_fixed
(buff=0xff822d20, count=1, datatype=0x868bd00, root=0, comm=0x86aa7f8,
module=0x868b700)
at ../../../../../ompi/mca/coll/tuned/coll_tuned_decision_fixed.c:256
#12 0xf7c73269 in mca_coll_sync_bcast (buff=0xff822d20, count=1,
datatype=0x868bd00, root=0, comm=0x86aa7f8, module=0x86aaa28) at
../../../../../ompi/mca/coll/sync/coll_sync_bcast.c:44
#13 0xf7c80381 in mca_coll_inter_allgatherv_inter (sbuf=0xff822d64,
scount=0, sdtype=0x8049400, rbuf=0xff822d20, rcounts=0x868a188,
disps=0x868abb8, rdtype=0x8049400, comm=0x86aa300,
module=0x86aae18) at
../../../../../ompi/mca/coll/inter/coll_inter_allgatherv.c:134
#14 0xf7e9398f in ompi_comm_determine_first (intercomm=0x86aa300,
high=0) at ../../ompi/communicator/comm.c:1199
#15 0xf7ed7833 in PMPI_Intercomm_merge (intercomm=0x86aa300, high=0,
newcomm=0xff8241d0) at pintercomm_merge.c:84
#16 0x08048afd in main (argc=943274038, argv=0x33393133) at client.c:47



What do you think may cause the problem?


2010/7/26 Ralph Castain :
> No problem at all - glad it works!
>
> On Jul 26, 2010, at 7:58 AM, Grzegorz Maj wrote:
>
>> Hi,
>> I'm very sorry, but the problem was on my side. My installation
>> process was not always taking the newest sources of openmpi. In this
>> case it hasn't installed the version with the latest patch. Now I
>> think everything works fine - I could run over 130 processes with no
>> problems.
>> I'm sorry again that I've wasted your time. And thank you for the patch.
>>
>> 2010/7/21 Ralph Castain :
>>> We

Re: [OMPI users] Dynamic processes connection and segfault on MPI_Comm_accept

2010-07-28 Thread Grzegorz Maj
I've attached gdb to the client which has just connected to the grid.
Its bt is almost exactly the same as the server's one:
#0  0x428066d7 in sched_yield () from /lib/libc.so.6
#1  0x00933cbf in opal_progress () at ../../opal/runtime/opal_progress.c:220
#2  0x00d460b8 in opal_condition_wait (c=0xdc3160, m=0xdc31a0) at
../../opal/threads/condition.h:99
#3  0x00d463cc in ompi_request_default_wait_all (count=2,
requests=0xff8a36d0, statuses=0x0) at
../../ompi/request/req_wait.c:262
#4  0x00a1431f in mca_coll_inter_allgatherv_inter (sbuf=0xff8a3794,
scount=1, sdtype=0x8049400, rbuf=0xff8a3750, rcounts=0x80948e0,
disps=0x8093938, rdtype=0x8049400, comm=0x8094fb8, module=0x80954a0)
at ../../../../../ompi/mca/coll/inter/coll_inter_allgatherv.c:127
#5  0x00d3198f in ompi_comm_determine_first (intercomm=0x8094fb8,
high=1) at ../../ompi/communicator/comm.c:1199
#6  0x00d75833 in PMPI_Intercomm_merge (intercomm=0x8094fb8, high=1,
newcomm=0xff8a4c00) at pintercomm_merge.c:84
#7  0x08048a16 in main (argc=892352312, argv=0x32323038) at client.c:28

I've tried both of the scenarios described: the hang when a client connects
from machine B and the hang when one connects from machine C. In both cases
the bt looks the same.
What do you make of it?
Shall I repost this under a different subject, as Ralph suggested?

Regards,
Grzegorz



2010/7/27 Edgar Gabriel :
> based on your output shown here, there is absolutely nothing wrong
> (yet). Both processes are in the same function and do what they are
> supposed to do.
>
> However, I am fairly sure that the client process bt that you show is
> already part of current_intracomm. Could you try to create a bt of the
> process that is not yet part of current_intracomm (If I understand your
> code correctly, the intercommunicator is n-1 configuration, with each
> client process being part of n after the intercomm_merge). It would be
> interesting to see where that process is...
>
> Thanks
> Edgar
>
> On 7/27/2010 1:42 PM, Ralph Castain wrote:
>> This slides outside of my purview - I would suggest you post this question 
>> with a different subject line specifically mentioning failure of 
>> intercomm_merge to work so it attracts the attention of those with knowledge 
>> of that area.
>>
>>
>> On Jul 27, 2010, at 9:30 AM, Grzegorz Maj wrote:
>>
>>> So now I have a new question.
>>> When I run my server and a lot of clients on the same machine,
>>> everything looks fine.
>>>
>>> But when I try to run the clients on several machines the most
>>> frequent scenario is:
>>> * server is stared on machine A
>>> * X (= 1, 4, 10, ..) clients are started on machine B and they connect
>>> successfully
>>> * the first client starting on machine C connects successfully to the
>>> server, but the whole grid hangs on MPI_Comm_merge (all the processes
>>> from intercommunicator get there).
>>>
>>> As I said it's the most frequent scenario. Sometimes I can connect the
>>> clients from several machines. Sometimes it hangs (always on
>>> MPI_Comm_merge) when connecting the clients from machine B.
>>> The interesting thing is, that if before MPI_Comm_merge I send a dummy
>>> message on the intercommunicator from process rank 0 in one group to
>>> process rank 0 in the other one, it will not hang on MPI_Comm_merge.
>>>
>>> I've tried both versions with and without the first patch (ompi-server
>>> as orted) but it doesn't change the behavior.
>>>
>>> I've attached gdb to my server, this is bt:
>>> #0  0xe410 in __kernel_vsyscall ()
>>> #1  0x00637afc in sched_yield () from /lib/libc.so.6
>>> #2  0xf7e8ce31 in opal_progress () at ../../opal/runtime/opal_progress.c:220
>>> #3  0xf7f60ad4 in opal_condition_wait (c=0xf7fd7dc0, m=0xf7fd7e00) at
>>> ../../opal/threads/condition.h:99
>>> #4  0xf7f60dee in ompi_request_default_wait_all (count=2,
>>> requests=0xff8d7754, statuses=0x0) at
>>> ../../ompi/request/req_wait.c:262
>>> #5  0xf7d3e221 in mca_coll_inter_allgatherv_inter (sbuf=0xff8d7824,
>>> scount=1, sdtype=0x8049200, rbuf=0xff8d77e0, rcounts=0x9783df8,
>>> disps=0x9755520, rdtype=0x8049200, comm=0x978c2a8, module=0x9794b08)
>>>    at ../../../../../ompi/mca/coll/inter/coll_inter_allgatherv.c:127
>>> #6  0xf7f4c615 in ompi_comm_determine_first (intercomm=0x978c2a8,
>>> high=0) at ../../ompi/communicator/comm.c:1199
>>> #7  0xf7f8d1d9 in PMPI_Intercomm_merge (intercomm=0x978c2a8, high=0,
>>> newcomm=0xff8d78c0) at pintercomm_merge.c:84
>>> #8  0x0804893c in main (argc=Cannot access memory at address 0xf
>>> ) at server.c:50
>>>
>>> An