On 04/25/2012 02:57 PM, Ralph Castain wrote:
Strange that your code didn't generate any symbols - is that a mosix thing? 
Have you tried just adding opal_output (so it goes to a special diagnostic 
output channel) statements in your code to see where the segfault is occurring?

It looks like you are getting thru orte_init. You could add -mca 
grpcomm_base_verbose 5 to see if you are getting in/thru the modex - if so, 
then you are probably failing in add_procs.

I guess the missing symbols are a mosix thing, but it should still show some sort of segmentation fault trace, no? Maybe only the assembly opcodes... It seems that the SEGV is detected rather than caught. This may also be related to mosix; I'll check with the mosix developer.

I added the parameter you suggested and appended the output below. The modex seems to be working: I use it to exchange the IP and PID, and as you can see at the bottom, these are received OK. I'll try adding debug printouts specifically in add_procs. Thanks for the advice!
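One way to double-check that the modex data really arrived everywhere is to scan the verbose output for the `grpcomm:get_proc_attr: found ...` entries. The following is only a minimal Python sketch, assuming the log format shown in this message; the function name `modex_coverage` is made up for illustration and is not part of Open MPI.

```python
import re
from collections import defaultdict

# Matches entries such as:
#   [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],0]
# \s+ is used between words so that an entry wrapped across two lines still matches.
FOUND = re.compile(
    r"\[\[\d+,\d+\],(\d+)\]\s+grpcomm:get_proc_attr:\s+found\s+(\d+)\s+bytes\s+"
    r"for\s+attr\s+(\S+)\s+on\s+proc\s+\[\[\d+,\d+\],(\d+)\]"
)


def modex_coverage(log_text, attr):
    """Map each receiving rank to the set of peer ranks whose `attr` it found.

    Hypothetical helper: it only reads the text of the log and knows nothing
    about Open MPI internals.
    """
    seen = defaultdict(set)
    # finditer runs over the whole text, so several entries fused onto one
    # line are all picked up.
    for match in FOUND.finditer(log_text):
        receiver, _size, name, peer = match.groups()
        if name == attr:
            seen[int(receiver)].add(int(peer))
    return dict(seen)


if __name__ == "__main__":
    import sys

    coverage = modex_coverage(sys.stdin.read(), "btl.mosix.1.7")
    for rank in sorted(coverage):
        print(rank, sorted(coverage[rank]))
```

For a 4-process run, each rank would be expected to list peers 0 to 3 if the exchange completed. A rank that is missing, or that lists fewer peers, would point at the modex rather than at add_procs.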

alex@singularity:~/huji/benchmarks/mpi/npb$ mpirun -mca grpcomm_base_verbose 5 -mca btl self,mosix -mca btl_base_verbose 100 -n 4 ft.S.4
[singularity:08915] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08915] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[singularity:08915] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08915] [[37778,0],0] grpcomm:base:receive start comm
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,0] tag 1
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] grpcomm:base:xcast updating nidmap
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is empty!
[singularity:08916] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08916] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[singularity:08916] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08916] [[37778,1],0] grpcomm:base:receive start comm
[singularity:08919] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08919] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[singularity:08919] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08919] [[37778,1],2] grpcomm:base:receive start comm
[singularity:08917] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08917] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[singularity:08917] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08917] [[37778,1],1] grpcomm:base:receive start comm
[singularity:08921] mca:base:select:(grpcomm) Querying component [bad]
[singularity:08921] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[singularity:08921] mca:base:select:(grpcomm) Selected component [bad]
[singularity:08921] [[37778,1],3] grpcomm:base:receive start comm
[singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting attribute MPI_THREAD_LEVEL data size 1
[singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting attribute OMPI_ARCH data size 11
[singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting attribute MPI_THREAD_LEVEL data size 1
[singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting attribute OMPI_ARCH data size 11
[singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting attribute MPI_THREAD_LEVEL data size 1
[singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting attribute OMPI_ARCH data size 11
[singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting attribute MPI_THREAD_LEVEL data size 1
[singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting attribute OMPI_ARCH data size 11
[singularity:08916] mca: base: components_open: Looking for btl components
[singularity:08916] mca: base: components_open: opening btl components
[singularity:08916] mca: base: components_open: found loaded component mosix
[singularity:08916] mca: base: components_open: component mosix register function successful
[singularity:08916] mca: base: components_open: component mosix open function successful
[singularity:08916] mca: base: components_open: found loaded component self
[singularity:08916] mca: base: components_open: component self has no register function
[singularity:08916] mca: base: components_open: component self open function successful
[singularity:08919] mca: base: components_open: Looking for btl components
[singularity:08917] mca: base: components_open: Looking for btl components
[singularity:08919] mca: base: components_open: opening btl components
[singularity:08919] mca: base: components_open: found loaded component mosix
[singularity:08919] mca: base: components_open: component mosix register function successful
[singularity:08919] mca: base: components_open: component mosix open function successful
[singularity:08919] mca: base: components_open: found loaded component self
[singularity:08919] mca: base: components_open: component self has no register function
[singularity:08919] mca: base: components_open: component self open function successful
[singularity:08921] mca: base: components_open: Looking for btl components
[singularity:08917] mca: base: components_open: opening btl components
[singularity:08917] mca: base: components_open: found loaded component mosix
[singularity:08917] mca: base: components_open: component mosix register function successful
[singularity:08917] mca: base: components_open: component mosix open function successful
[singularity:08917] mca: base: components_open: found loaded component self
[singularity:08917] mca: base: components_open: component self has no register function
[singularity:08917] mca: base: components_open: component self open function successful
[singularity:08921] mca: base: components_open: opening btl components
[singularity:08921] mca: base: components_open: found loaded component mosix
[singularity:08921] mca: base: components_open: component mosix register function successful
[singularity:08921] mca: base: components_open: component mosix open function successful
[singularity:08921] mca: base: components_open: found loaded component self
[singularity:08921] mca: base: components_open: component self has no register function
[singularity:08921] mca: base: components_open: component self open function successful
[singularity:08916] select: initializing btl component mosix
[singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting attribute btl.mosix.1.7 data size 20
[singularity:08919] select: initializing btl component mosix
[singularity:08916] select: init of component mosix returned success
[singularity:08916] select: initializing btl component self
[singularity:08916] select: init of component self returned success
[singularity:08916] [[37778,1],0] grpcomm:base:modex: performing modex
[singularity:08916] [[37778,1],0] grpcomm:base:pack_modex: reporting 3 entries
[singularity:08916] [[37778,1],0] grpcomm:base:full:modex: executing allgather
[singularity:08916] [[37778,1],0] grpcomm:bad entering allgather
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],0]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] ADDING [[37778,1],WILDCARD] TO PARTICIPANTS
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 0
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08916] [[37778,1],0] grpcomm:bad allgather underway
[singularity:08916] [[37778,1],0] grpcomm:base:modex: modex posted
[singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting attribute btl.mosix.1.7 data size 20
[singularity:08917] select: initializing btl component mosix
[singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting attribute btl.mosix.1.7 data size 20
[singularity:08921] select: initializing btl component mosix
[singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting attribute btl.mosix.1.7 data size 20
[singularity:08919] select: init of component mosix returned success
[singularity:08919] select: initializing btl component self
[singularity:08919] select: init of component self returned success
[singularity:08919] [[37778,1],2] grpcomm:base:modex: performing modex
[singularity:08919] [[37778,1],2] grpcomm:base:pack_modex: reporting 3 entries
[singularity:08919] [[37778,1],2] grpcomm:base:full:modex: executing allgather
[singularity:08919] [[37778,1],2] grpcomm:bad entering allgather
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],2]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 0
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08919] [[37778,1],2] grpcomm:bad allgather underway
[singularity:08919] [[37778,1],2] grpcomm:base:modex: modex posted
[singularity:08917] select: init of component mosix returned success
[singularity:08917] select: initializing btl component self
[singularity:08917] select: init of component self returned success
[singularity:08917] [[37778,1],1] grpcomm:base:modex: performing modex
[singularity:08917] [[37778,1],1] grpcomm:base:pack_modex: reporting 3 entries
[singularity:08917] [[37778,1],1] grpcomm:base:full:modex: executing allgather
[singularity:08917] [[37778,1],1] grpcomm:bad entering allgather
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],1]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 0
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08917] [[37778,1],1] grpcomm:bad allgather underway
[singularity:08917] [[37778,1],1] grpcomm:base:modex: modex posted
[singularity:08921] select: init of component mosix returned success
[singularity:08921] select: initializing btl component self
[singularity:08921] select: init of component self returned success
[singularity:08921] [[37778,1],3] grpcomm:base:modex: performing modex
[singularity:08921] [[37778,1],3] grpcomm:base:pack_modex: reporting 3 entries
[singularity:08921] [[37778,1],3] grpcomm:base:full:modex: executing allgather
[singularity:08921] [[37778,1],3] grpcomm:bad entering allgather
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],3]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 0
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 0
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08915] [[37778,0],0] COLLECTIVE 0 LOCALLY COMPLETE - SENDING TO GLOBAL COLLECTIVE
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: daemon collective recvd from [[37778,0],0]
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: WORKING COLLECTIVE 0
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 4
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,1] tag 30
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is empty!
[singularity:08921] [[37778,1],3] grpcomm:bad allgather underway
[singularity:08921] [[37778,1],3] grpcomm:base:modex: modex posted
[singularity:08921] [[37778,1],3] grpcomm:base:receive processing collective return for id 0
[singularity:08921] [[37778,1],3] CHECKING COLL id 0
[singularity:08921] [[37778,1],3] STORING MODEX DATA
[singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex entry for proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex entry for proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:base:receive processing collective return for id 0
[singularity:08916] [[37778,1],0] grpcomm:base:receive processing collective return for id 0
[singularity:08916] [[37778,1],0] CHECKING COLL id 0
[singularity:08917] [[37778,1],1] CHECKING COLL id 0
[singularity:08916] [[37778,1],0] STORING MODEX DATA
[singularity:08917] [[37778,1],1] STORING MODEX DATA
[singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex entry for proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex entry for proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex entry for proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex entry for proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex entry for proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex entry for proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex entry for proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:base:store_modex adding modex entry for proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:base:store_modex adding modex entry for proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:base:store_modex adding modex entry for proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],3]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] ADDING [[37778,1],WILDCARD] TO PARTICIPANTS
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 1
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:bad entering barrier
[singularity:08921] [[37778,1],3] grpcomm:bad barrier underway
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:bad entering barrier
[singularity:08917] [[37778,1],1] grpcomm:bad entering barrier
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],0]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 1
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],1]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 1
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08917] [[37778,1],1] grpcomm:bad barrier underway
[singularity:08916] [[37778,1],0] grpcomm:bad barrier underway
[singularity:08919] [[37778,1],2] grpcomm:base:receive processing collective return for id 0
[singularity:08919] [[37778,1],2] CHECKING COLL id 0
[singularity:08919] [[37778,1],2] STORING MODEX DATA
[singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex entry for proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex entry for proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex entry for proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:base:store_modex adding modex entry for proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:base:update_modex_entries: adding 3 entries for proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 11 bytes for attr OMPI_ARCH on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 20 bytes for attr btl.mosix.1.7 on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:bad entering barrier
[singularity:08915] [[37778,0],0] COLLECTIVE RECVD FROM [[37778,1],2]
[singularity:08915] [[37778,0],0] WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLLECTIVE 1
[singularity:08915] [[37778,0],0] PROGRESSING COLL id 1
[singularity:08915] [[37778,0],0] ALL LOCAL PROCS CONTRIBUTE 4
[singularity:08915] [[37778,0],0] COLLECTIVE 1 LOCALLY COMPLETE - SENDING TO GLOBAL COLLECTIVE
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: daemon collective recvd from [[37778,0],0]
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: WORKING COLLECTIVE 1
[singularity:08915] [[37778,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 4
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,1] tag 30
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is empty!
[singularity:08919] [[37778,1],2] grpcomm:bad barrier underway
[singularity:08916] [[37778,1],0] grpcomm:base:receive processing collective return for id 1
[singularity:08916] [[37778,1],0] CHECKING COLL id 1
[singularity:08917] [[37778,1],1] grpcomm:base:receive processing collective return for id 1
[singularity:08921] [[37778,1],3] grpcomm:base:receive processing collective return for id 1
[singularity:08921] [[37778,1],3] CHECKING COLL id 1
[singularity:08917] [[37778,1],1] CHECKING COLL id 1
[singularity:08919] [[37778,1],2] grpcomm:base:receive processing collective return for id 1
[singularity:08919] [[37778,1],2] CHECKING COLL id 1
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08919] [[37778,1],2] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08921] [[37778,1],3] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08917] [[37778,1],1] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],0]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],1]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],2]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: searching for attr MPI_THREAD_LEVEL on proc [[37778,1],3]
[singularity:08916] [[37778,1],0] grpcomm:get_proc_attr: found 1 bytes for attr MPI_THREAD_LEVEL on proc [[37778,1],3]


 NAS Parallel Benchmarks 3.3 -- FT Benchmark

 No input file inputft.data. Using compiled defaults
 Size                :   64x  64x  64
 Iterations          :              6
 Number of processes :              4
 Processor array     :         1x   4
 Layout type         :             1D
[singularity:08916] btl: mosix: Establishind TCP link to address 127.0.0.1 and PID #8917
[singularity:08917] btl: mosix: Establishind TCP link to address 127.0.0.1 and PID #8921
[singularity:08916] btl: mosix: Establishind TCP link to address 127.0.0.1 and PID #8919
[singularity:08919] btl: mosix: Establishind TCP link to address 127.0.0.1 and PID #8921
[singularity:08921] btl: mosix: Establishind TCP link to address 127.0.0.1 and PID #8919
[singularity:08917] btl: mosix: Establishind TCP link to address 127.0.0.1 and PID #8916
[singularity:08921] btl: mosix: Establishind TCP link to address 127.0.0.1 and PID #8917
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,0] tag 1
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is empty!
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 8919 on node singularity exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,0] tag 1
[singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
[singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is empty!
alex@singularity:~/huji/benchmarks/mpi/npb$
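
Since the processes interleave their output, it can help to pull out the last entry printed by the process that mpirun reports as crashed. The following is a rough Python sketch, assuming the `[host:pid]` prefix and the "mpirun noticed ..." wording seen in the output above; `last_entry_per_pid` and `crashed_process` are hypothetical helper names, not part of any Open MPI tooling.

```python
import re

# An entry starts with a "[host:pid]" tag, e.g. "[singularity:08919]".
TAG = re.compile(r"\[([\w.-]+):(\d+)\]")
# Wording taken from the mpirun message in the log above.
CRASH = re.compile(
    r"process rank (\d+) with PID (\d+) on node (\S+) exited on signal (\d+)"
)


def last_entry_per_pid(log_text):
    """Return {pid: text of the last entry tagged with that pid}.

    Entries are cut at the next tag or at the end of the line, so an entry
    that was wrapped onto a second line is truncated at the line break.
    """
    last = {}
    for line in log_text.splitlines():
        tags = list(TAG.finditer(line))
        for i, match in enumerate(tags):
            end = tags[i + 1].start() if i + 1 < len(tags) else len(line)
            last[int(match.group(2))] = line[match.end():end].strip()
    return last


def crashed_process(log_text):
    """Parse the mpirun crash notice, or return None if there is none."""
    match = CRASH.search(log_text)
    if match is None:
        return None
    rank, pid, node, sig = match.groups()
    return {"rank": int(rank), "pid": int(pid), "node": node, "signal": int(sig)}


if __name__ == "__main__":
    import sys

    text = sys.stdin.read()
    crash = crashed_process(text)
    if crash is not None:
        print(crash, "->", last_entry_per_pid(text).get(crash["pid"]))
```

The last entry only gives a lower bound on how far the process got: anything still sitting in an output buffer when the signal arrived is lost, so debug printouts inside add_procs are still needed to pin down the exact spot.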
