Hi all,

I would like to write a multi-threaded parallel program using pthreads together with MPI. Basically, I want to create two background threads in addition to the main thread of each process. For example, if I use "-np 4", the program should have 4 main processes on four processors and two background threads for each main process, so there should be 8 background threads in total. I wrote a test program, and it behaves unpredictably: sometimes I get the result I want, but sometimes the program crashes with a segmentation fault. I used MPI_Isend and MPI_Irecv for sending and receiving. I do not know why this happens. The error message is as follows:

[cheetah:29780] *** Process received signal ***
[cheetah:29780] Signal: Segmentation fault (11)
[cheetah:29780] Signal code: Address not mapped (1)
[cheetah:29780] Failing at address: 0x10
[cheetah:29779] *** Process received signal ***
[cheetah:29779] Signal: Segmentation fault (11)
[cheetah:29779] Signal code: Address not mapped (1)
[cheetah:29779] Failing at address: 0x10
[cheetah:29780] [ 0] /lib64/libpthread.so.0 [0x334b00de70]
[cheetah:29780] [ 1] /act/openmpi/gnu/lib/openmpi/mca_btl_sm.so [0x2b90e1227940]
[cheetah:29780] [ 2] /act/openmpi/gnu/lib/openmpi/mca_pml_ob1.so [0x2b90e05d61ca]
[cheetah:29780] [ 3] /act/openmpi/gnu/lib/openmpi/mca_pml_ob1.so [0x2b90e05cac86]
[cheetah:29780] [ 4] /act/openmpi/gnu/lib/libmpi.so.0(PMPI_Send+0x13d) [0x2b90dde7271d]
[cheetah:29780] [ 5] pt_muti(_Z6backIDPv+0x29b) [0x409929]
[cheetah:29780] [ 6] /lib64/libpthread.so.0 [0x334b0062f7]
[cheetah:29780] [ 7] /lib64/libc.so.6(clone+0x6d) [0x334a4d1e3d]
[cheetah:29780] *** End of error message ***
[cheetah:29779] [ 0] /lib64/libpthread.so.0 [0x334b00de70]
[cheetah:29779] [ 1] /act/openmpi/gnu/lib/openmpi/mca_btl_sm.so [0x2b39785c0940]
[cheetah:29779] [ 2] /act/openmpi/gnu/lib/openmpi/mca_pml_ob1.so [0x2b397796f1ca]
[cheetah:29779] [ 3] /act/openmpi/gnu/lib/openmpi/mca_pml_ob1.so [0x2b3977963c86]
[cheetah:29779] [ 4] /act/openmpi/gnu/lib/libmpi.so.0(PMPI_Send+0x13d) [0x2b397520b71d]
[cheetah:29779] [ 5] pt_muti(_Z6backIDPv+0x29b) [0x409929]
[cheetah:29779] [ 6] /lib64/libpthread.so.0 [0x334b0062f7]
[cheetah:29779] [ 7] /lib64/libc.so.6(clone+0x6d) [0x334a4d1e3d]
[cheetah:29779] *** End of error message ***

I used gdb to "bt" the error and I got:

Program terminated with signal 11, Segmentation fault.
#0  0x00002b90e1227940 in mca_btl_sm_alloc () from /act/openmpi/gnu/lib/openmpi/mca_btl_sm.so
(gdb) bt
#0  0x00002b90e1227940 in mca_btl_sm_alloc () from /act/openmpi/gnu/lib/openmpi/mca_btl_sm.so
#1  0x00002b90e05d61ca in mca_pml_ob1_send_request_start_copy () from /act/openmpi/gnu/lib/openmpi/mca_pml_ob1.so
#2  0x00002b90e05cac86 in mca_pml_ob1_send () from /act/openmpi/gnu/lib/openmpi/mca_pml_ob1.so
#3  0x00002b90dde7271d in PMPI_Send () from /act/openmpi/gnu/lib/libmpi.so.0
#4  0x0000000000409929 in backID (arg=0x0) at pt_muti.cpp:50
#5  0x000000334b0062f7 in start_thread () from /lib64/libpthread.so.0
#6  0x000000334a4d1e3d in clone () from /lib64/libc.so.6

Can anyone give me some suggestions or advice?

Thanks very much.
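
To make the thread layout described above concrete, here is a minimal sketch of one process starting two background threads and waiting for them. It is not the actual pt_muti.cpp: the names WorkerArg, background and run_background_threads are made up for illustration, and all MPI calls are left out, so it shows only the pthread structure and not the sending and receiving.

```cpp
#include <pthread.h>

// Hypothetical per-thread argument; the real program's data is not shown.
struct WorkerArg {
    int id;    // which background thread this is (0 or 1)
    int done;  // set to 1 by the thread when it finishes
};

// Thread entry point. pthreads requires the signature void* (*)(void*).
static void* background(void* p) {
    WorkerArg* arg = static_cast<WorkerArg*>(p);
    // The real work (and any MPI calls) would go here.
    arg->done = 1;
    return nullptr;
}

// Starts two background threads, waits for both, and returns how many
// reported completion. Returns -1 if a thread cannot be created or joined.
int run_background_threads() {
    const int kThreads = 2;
    pthread_t tid[kThreads];
    WorkerArg args[kThreads];

    for (int i = 0; i < kThreads; ++i) {
        args[i].id = i;
        args[i].done = 0;
        // Each thread gets a pointer to its own argument slot.
        if (pthread_create(&tid[i], nullptr, background, &args[i]) != 0) {
            // Wait for the threads that did start before giving up.
            for (int j = 0; j < i; ++j) pthread_join(tid[j], nullptr);
            return -1;
        }
    }

    int finished = 0;
    for (int i = 0; i < kThreads; ++i) {
        // After a successful join it is safe to read what the thread wrote.
        if (pthread_join(tid[i], nullptr) != 0) return -1;
        finished += args[i].done;
    }
    return finished;
}
```

Calling run_background_threads() from main() in each of the 4 processes would give the 4 x 2 = 8 background threads mentioned above. The args array lives on the caller's stack, which is safe here only because both threads are joined before the function returns.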