On 04/25/2012 02:57 PM, Ralph Castain wrote:
> Strange that your code didn't generate any symbols - is that a mosix thing?
> Have you tried just adding opal_output (so it goes to a special diagnostic
> output channel) statements in your code to see where the segfault is occurring?
> It looks like you are getting thru orte_init. You could add -mca
> grpcomm_base_verbose 5 to see if you are getting in/thru the modex - if so,
> then you are probably failing in add_procs.
I guess the missing symbols are a mosix thing, but it should still show
some sort of segmentation fault trace, no? Maybe only the assembly opcode...
It seems that the SEGV is detected rather than caught. This may also be
related to mosix - I'll check it with the mosix developer.
I added the parameter you suggested and appended the output. Modex seems
to be working because I use it to exchange the IP and PID, and as you
can see at the bottom these are received OK. I'll try debug printouts
specifically in add_procs. Thanks for the advice!
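For the debug printouts, something along these lines is what I have in mind. It is a standalone sketch, not Open MPI code: trace_format and TRACE are hypothetical names, and inside the Open MPI tree the same job would be done with opal_output(0, ...) as suggested. The point is that each message goes to stderr and is flushed immediately, so the last line printed before the crash shows how far add_procs got.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of Open MPI): format one trace line as
 * "[pid] file:line: message" into buf. Returns the length of the formatted
 * line, or a negative value on a formatting error. */
int trace_format(char *buf, size_t len, const char *file, int line,
                 const char *fmt, ...)
{
    va_list ap;
    int n, m;

    assert(buf != NULL && fmt != NULL);

    n = snprintf(buf, len, "[%ld] %s:%d: ", (long)getpid(), file, line);
    if (n < 0 || (size_t)n >= len) {
        return n; /* error, or the prefix alone already filled the buffer */
    }

    va_start(ap, fmt);
    m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
    va_end(ap);

    return (m < 0) ? m : n + m;
}

/* Print a trace line to stderr and flush it right away, so that it is not
 * lost if the process segfaults on the next statement. */
#define TRACE(...)                                                    \
    do {                                                              \
        char trace_buf_[256];                                         \
        trace_format(trace_buf_, sizeof trace_buf_,                   \
                     __FILE__, __LINE__, __VA_ARGS__);                \
        fprintf(stderr, "%s\n", trace_buf_);                          \
        fflush(stderr);                                               \
    } while (0)
```

Usage would be a line such as TRACE("add_procs: peer %d of %d", i, nprocs); before and after each step in add_procs, where i and nprocs stand for whatever loop variables the function actually uses.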
alex@singularity:~/huji/benchmarks/mpi/npb$ mpirun -mca grpcomm_base_verbose 5 -mca btl self,mosix -mca btl_base_verbose 100 -n 4 ft.S.4
[singularity:08915] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08915] mca:base:select:(grpcomm) Query of component [bad]
set priority to 10
[singularity:08915] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08915] [[37778,0],0] grpcomm:base:receive start comm
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job
[37778,0] tag 1
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] grpcomm:base:xcast updating nidmap
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient
list is empty!
[singularity:08916] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08916] mca:base:select:(grpcomm) Query of component [bad]
set priority to 10
[singularity:08916] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08916] [[37778,1],0] grpcomm:base:receive start comm
[singularity:08919] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08919] mca:base:select:(grpcomm) Query of component [bad]
set priority to 10
[singularity:08919] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08919] [[37778,1],2] grpcomm:base:receive start comm
[singularity:08917] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08917] mca:base:select:(grpcomm) Query of component [bad]
set priority to 10
[singularity:08917] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08917] [[37778,1],1] grpcomm:base:receive start comm
[singularity:08921] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08921] mca:base:select:(grpcomm) Query of component [bad]
set priority to 10
[singularity:08921] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08921] [[37778,1],3] grpcomm:base:receive start comm
[singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting
attribute MPI_THREAD_LEVEL data size 1
[singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting
attribute OMPI_ARCH data size 11
[singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting
attribute MPI_THREAD_LEVEL data size 1
[singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting
attribute OMPI_ARCH data size 11
[singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting
attribute MPI_THREAD_LEVEL data size 1
[singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting
attribute OMPI_ARCH data size 11
[singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting
attribute MPI_THREAD_LEVEL data size 1
[singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting
attribute OMPI_ARCH data size 11
[singularity:08916] mca: base: components_open: Looking for btl components
[singularity:08916] mca: base: components_open: opening btl components
[singularity:08916] mca: base: components_open: found loaded component mosix
[singularity:08916] mca: base: components_open: component mosix register
function successful
[singularity:08916] mca: base: components_open: component mosix open
function successful
[singularity:08916] mca: base: components_open: found loaded component self
[singularity:08916] mca: base: components_open: component self has no
register function
[singularity:08916] mca: base: components_open: component self open
function successful
[singularity:08919] mca: base: components_open: Looking for btl components
[singularity:08917] mca: base: components_open: Looking for btl components
[singularity:08919] mca: base: components_open: opening btl components
[singularity:08919] mca: base: components_open: found loaded component mosix
[singularity:08919] mca: base: components_open: component mosix register
function successful
[singularity:08919] mca: base: components_open: component mosix open
function successful
[singularity:08919] mca: base: components_open: found loaded component self
[singularity:08919] mca: base: components_open: component self has no
register function
[singularity:08919] mca: base: components_open: component self open
function successful
[singularity:08921] mca: base: components_open: Looking for btl components
[singularity:08917] mca: base: components_open: opening btl components
[singularity:08917] mca: base: components_open: found loaded component mosix
[singularity:08917] mca: base: components_open: component mosix register
function successful
[singularity:08917] mca: base: components_open: component mosix open
function successful
[singularity:08917] mca: base: components_open: found loaded component self
[singularity:08917] mca: base: components_open: component self has no
register function
[singularity:08917] mca: base: components_open: component self open
function successful
[singularity:08921] mca: base: components_open: opening btl components
[singularity:08921] mca: base: components_open: found loaded component mosix
[singularity:08921] mca: base: components_open: component mosix register
function successful
[singularity:08921] mca: base: components_open: component mosix open
function successful
[singularity:08921] mca: base: components_open: found loaded component self
[singularity:08921] mca: base: components_open: component self has no
register function
[singularity:08921] mca: base: components_open: component self open
function successful
[singularity:08916] select: initializing btl component mosix
[singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting
attribute btl.mosix.1.7 data size 20
[singularity:08919] select: initializing btl component mosix
[singularity:08916] select: init of component mosix returned success
[singularity:08916] select: initializing btl component self
[singularity:08916] select: init of component self returned success
[singularity:08916] [[37778,1],0] grpcomm:base:modex: performing modex
[singularity:08916] [[37778,1],0] grpcomm:base:pack_modex: reporting 3
entries
[singularity:08916] [[37778,1],0] grpcomm:base:full:modex: executing
allgather
[singularity:08916] [[37778,1],0] grpcomm:bad entering allgather
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],0]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] ADDING [[37778,1],WILDCARD] TO
PARTICIPANTS
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 0
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08916] [[37778,1],0] grpcomm:bad allgather underway
[singularity:08916] [[37778,1],0] grpcomm:base:modex: modex posted
[singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting
attribute btl.mosix.1.7 data size 20
[singularity:08917] select: initializing btl component mosix
[singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting
attribute btl.mosix.1.7 data size 20
[singularity:08921] select: initializing btl component mosix
[singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting
attribute btl.mosix.1.7 data size 20
[singularity:08919] select: init of component mosix returned success
[singularity:08919] select: initializing btl component self
[singularity:08919] select: init of component self returned success
[singularity:08919] [[37778,1],2] grpcomm:base:modex: performing modex
[singularity:08919] [[37778,1],2] grpcomm:base:pack_modex: reporting 3
entries
[singularity:08919] [[37778,1],2] grpcomm:base:full:modex: executing
allgather
[singularity:08919] [[37778,1],2] grpcomm:bad entering allgather
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],2]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 0
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08919] [[37778,1],2] grpcomm:bad allgather underway
[singularity:08919] [[37778,1],2] grpcomm:base:modex: modex posted
[singularity:08917] select: init of component mosix returned success
[singularity:08917] select: initializing btl component self
[singularity:08917] select: init of component self returned success
[singularity:08917] [[37778,1],1] grpcomm:base:modex: performing modex
[singularity:08917] [[37778,1],1] grpcomm:base:pack_modex: reporting 3
entries
[singularity:08917] [[37778,1],1] grpcomm:base:full:modex: executing
allgather
[singularity:08917] [[37778,1],1] grpcomm:bad entering allgather
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],1]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 0
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08917] [[37778,1],1] grpcomm:bad allgather underway
[singularity:08917] [[37778,1],1] grpcomm:base:modex: modex posted
[singularity:08921] select: init of component mosix returned success
[singularity:08921] select: initializing btl component self
[singularity:08921] select: init of component self returned success
[singularity:08921] [[37778,1],3] grpcomm:base:modex: performing modex
[singularity:08921] [[37778,1],3] grpcomm:base:pack_modex: reporting 3
entries
[singularity:08921] [[37778,1],3] grpcomm:base:full:modex: executing
allgather
[singularity:08921] [[37778,1],3] grpcomm:bad entering allgather
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],3]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 0
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08915] [[37778,0],0] COLLECTIVE 0 LOCALLY COMPLETE -
SENDING TO GLOBAL COLLECTIVE
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: daemon
collective recvd from [[37778,0],0]
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: WORKING
COLLECTIVE 0
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 4
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job
[37778,1] tag 30
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient
list is empty!
[singularity:08921] [[37778,1],3] grpcomm:bad allgather underway
[singularity:08921] [[37778,1],3] grpcomm:base:modex: modex posted
[singularity:08921] [[37778,1],3] grpcomm:base:receive processing
collective return for id 0
[singularity:08921] [[37778,1],3] CHECKING COLL id 0
[singularity:08921] [[37778,1],3] STORING MODEX DATA
[singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:base:receive processing
collective return for id 0
[singularity:08916] [[37778,1],0] grpcomm:base:receive processing
collective return for id 0
[singularity:08916] [[37778,1],0] CHECKING COLL id 0
[singularity:08917] [[37778,1],1] CHECKING COLL id 0
[singularity:08916] [[37778,1],0] STORING MODEX DATA
[singularity:08917] [[37778,1],1] STORING MODEX DATA
[singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],3]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] ADDING [[37778,1],WILDCARD] TO
PARTICIPANTS
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 1
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:bad entering barrier
[singularity:08921] [[37778,1],3] grpcomm:bad barrier underway
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:bad entering barrier
[singularity:08917] [[37778,1],1] grpcomm:bad entering barrier
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],0]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 1
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],1]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 1
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08917] [[37778,1],1] grpcomm:bad barrier underway
[singularity:08916] [[37778,1],0] grpcomm:bad barrier underway
[singularity:08919] [[37778,1],2] grpcomm:base:receive processing
collective return for id 0
[singularity:08919] [[37778,1],2] CHECKING COLL id 0
[singularity:08919] [[37778,1],2] STORING MODEX DATA
[singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex
entry for proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries:
adding 3 entries for proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes
for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes
for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:bad entering barrier
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],2]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 1
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08915] [[37778,0],0] COLLECTIVE 1 LOCALLY COMPLETE -
SENDING TO GLOBAL COLLECTIVE
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: daemon
collective recvd from [[37778,0],0]
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: WORKING
COLLECTIVE 1
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 4
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job
[37778,1] tag 30
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient
list is empty!
[singularity:08919] [[37778,1],2] grpcomm:bad barrier underway
[singularity:08916] [[37778,1],0] grpcomm:base:receive processing
collective return for id 1
[singularity:08916] [[37778,1],0] CHECKING COLL id 1
[singularity:08917] [[37778,1],1] grpcomm:base:receive processing
collective return for id 1
[singularity:08921] [[37778,1],3] grpcomm:base:receive processing
collective return for id 1
[singularity:08921] [[37778,1],3] CHECKING COLL id 1
[singularity:08917] [[37778,1],1] CHECKING COLL id 1
[singularity:08919] [[37778,1],2] grpcomm:base:receive processing
collective return for id 1
[singularity:08919] [[37778,1],2] CHECKING COLL id 1
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for
attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes
for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
NAS Parallel Benchmarks 3.3 -- FT Benchmark
No input file inputft.data. Using compiled defaults
Size : 64x 64x 64
Iterations : 6
Number of processes : 4
Processor array : 1x 4
Layout type : 1D
[singularity:08916] btl: mosix: Establishind TCP link to address
127.0.0.1 and PID #8917
[singularity:08917] btl: mosix: Establishind TCP link to address
127.0.0.1 and PID #8921
[singularity:08916] btl: mosix: Establishind TCP link to address
127.0.0.1 and PID #8919
[singularity:08919] btl: mosix: Establishind TCP link to address
127.0.0.1 and PID #8921
[singularity:08921] btl: mosix: Establishind TCP link to address
127.0.0.1 and PID #8919
[singularity:08917] btl: mosix: Establishind TCP link to address
127.0.0.1 and PID #8916
[singularity:08921] btl: mosix: Establishind TCP link to address
127.0.0.1 and PID #8917
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job
[37778,0] tag 1
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient
list is empty!
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 8919 on node singularity
exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job
[37778,0] tag 1
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient
list is empty!
alex@singularity:~/huji/benchmarks/mpi/npb$