Hi,

today I tested some small Java programs with openmpi-dev-178-ga16c1e4.
One program throws an ArrayIndexOutOfBoundsException in Comm.recv. The
program worked fine with older Open MPI versions, e.g., openmpi-1.8.2a1r31804.
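
For reference, the program is roughly equivalent to the following
simplified sketch (this is not the attached MsgSendRecvMain.java itself;
buffer size and variable names here are only illustrative). Every process
except rank 0 sends its host name as a char array to rank 0, which receives
it into a fixed-size buffer and queries the Status object for source, tag,
and message length.

import mpi.*;
import java.net.InetAddress;

/* Simplified sketch of the send/recv pattern; not the attached program. */
public class MsgSendRecvMain
{
  static final int TAG      = 3;
  static final int BUF_SIZE = 80;   /* illustrative buffer size */

  public static void main (String[] args) throws Exception
  {
    MPI.Init (args);
    int rank = MPI.COMM_WORLD.getRank ();

    if (rank != 0) {
      /* send the host name to process 0 */
      char[] msg = InetAddress.getLocalHost ().getHostName ().toCharArray ();
      MPI.COMM_WORLD.send (msg, msg.length, MPI.CHAR, 0, TAG);
    } else {
      int numProcs = MPI.COMM_WORLD.getSize ();
      char[] buf   = new char[BUF_SIZE];
      System.out.println ("\nNow " + (numProcs - 1) +
                          " process sends its greetings.\n");
      for (int i = 1; i < numProcs; ++i) {
        /* receive into the fixed-size buffer and inspect the status object */
        Status status = MPI.COMM_WORLD.recv (buf, BUF_SIZE, MPI.CHAR,
                                             MPI.ANY_SOURCE, TAG);
        int length = status.getCount (MPI.CHAR);
        System.out.println ("Greetings from process " + status.getSource () +
                            ":\n  message tag:    " + status.getTag () +
                            "\n  message length: " + length +
                            "\n  message:        " +
                            new String (buf, 0, length));
      }
    }
    MPI.Finalize ();
  }
}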


tyr java 138 mpiexec -np 2 java MsgSendRecvMain

Now 1 process sends its greetings.

Greetings from process 1:
  message tag:    3
  message length: 26
  message:        
tyr.informatik.hs-fulda.de???????????????????????????????????????????????????????????????????????????????
??????????????????????????????????????????????????????????????????????????????

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.recv(Native Method)
        at mpi.Comm.recv(Comm.java:391)
        at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
...



The exception also occurs on my Linux box.

linpc1 java 102 mpijavac MsgSendRecvMain.java 
linpc1 java 103 mpiexec -np 2 java MsgSendRecvMain

Now 1 process sends its greetings.

Greetings from process 1:
  message tag:    3
  message length: 6
  message:        linpc1?????%???%?????%?f?%?%???$??????????

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.recv(Native Method)
        at mpi.Comm.recv(Comm.java:391)
        at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
...



tyr java 139 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
...
(gdb) run -np 2 java MsgSendRecvMain
Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 2 java MsgSendRecvMain
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]

Now 1 process sends its greetings.

Greetings from process 1:
  message tag:    3
  message length: 26
  message:        tyr.informatik.hs-fulda.de

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
        at mpi.Comm.recv(Native Method)
        at mpi.Comm.recv(Comm.java:391)
        at MsgSendRecvMain.main(MsgSendRecvMain.java:92)
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[61564,1],1]
  Exit code:    1
--------------------------------------------------------------------------
[LWP    2         exited]
[New Thread 2        ]
[Switching to Thread 1 (LWP 1)]
sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy query
(gdb) bt
#0  0xffffffff7f6173d0 in rtld_db_dlactivity () from /usr/lib/sparcv9/ld.so.1
#1  0xffffffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
#2  0xffffffff7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
#3  0xffffffff7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
#4  0xffffffff7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
#5  0xffffffff7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
#6  0xffffffff7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
#7  0xffffffff7f61db0c in dlclose () from /usr/lib/sparcv9/ld.so.1
#8  0xffffffff7ec87ca0 in vm_close ()
   from /usr/local/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0
#9  0xffffffff7ec85274 in lt_dlclose ()
   from /usr/local/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0
#10 0xffffffff7ecaa5dc in ri_destructor (obj=0x100187b70)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_component_repository.c:382
#11 0xffffffff7eca8fd8 in opal_obj_run_destructors (object=0x100187b70)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/class/opal_object.h:446
#12 0xffffffff7eca9eac in mca_base_component_repository_release (
    component=0xffffffff7b1236f0 <mca_oob_tcp_component>)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_component_repository.c:240
#13 0xffffffff7ecac17c in mca_base_component_unload (
    component=0xffffffff7b1236f0 <mca_oob_tcp_component>, output_id=-1)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:47
#14 0xffffffff7ecac210 in mca_base_component_close (
    component=0xffffffff7b1236f0 <mca_oob_tcp_component>, output_id=-1)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:60
#15 0xffffffff7ecac2e4 in mca_base_components_close (output_id=-1, 
    components=0xffffffff7f14bc58 <orte_oob_base_framework+80>, skip=0x0)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:86
#16 0xffffffff7ecac24c in mca_base_framework_components_close (
    framework=0xffffffff7f14bc08 <orte_oob_base_framework>, skip=0x0)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_components_close.c:66
#17 0xffffffff7efcaf80 in orte_oob_base_close ()
    at ../../../../openmpi-dev-178-ga16c1e4/orte/mca/oob/base/oob_base_frame.c:112
#18 0xffffffff7ecc0d74 in mca_base_framework_close (
    framework=0xffffffff7f14bc08 <orte_oob_base_framework>)
    at ../../../../openmpi-dev-178-ga16c1e4/opal/mca/base/mca_base_framework.c:187
#19 0xffffffff7be07858 in rte_finalize ()
    at ../../../../../openmpi-dev-178-ga16c1e4/orte/mca/ess/hnp/ess_hnp_module.c:857
#20 0xffffffff7ef338bc in orte_finalize ()
    at ../../openmpi-dev-178-ga16c1e4/orte/runtime/orte_finalize.c:66
#21 0x000000010000723c in orterun (argc=5, argv=0xffffffff7fffe0d8)
    at ../../../../openmpi-dev-178-ga16c1e4/orte/tools/orterun/orterun.c:1103
#22 0x0000000100003e80 in main (argc=5, argv=0xffffffff7fffe0d8)
    at ../../../../openmpi-dev-178-ga16c1e4/orte/tools/orterun/main.c:13
(gdb) 


Hopefully the problem has nothing to do with my program.
I would be grateful if somebody (Oscar?) could fix the problem.
Thank you very much in advance for any help.


Kind regards

Siegmar

Attachment: MsgSendRecvMain.java
