On 4/2/2013 10:59 AM, Corinna Vinschen wrote:
Hi Marco,
On Apr 1 17:11, marco atzeri wrote:
I am building and testing openmpi-1.7.0rc9 on
CYGWIN_NT-6.1 1.7.18(0.263/5/3) 2013-03-28 22:07 x86_64 Cygwin
Everything looks fine, except that when all the processes on several cores
end and should return to the launching program, something goes wrong
(of course, on 32-bit everything is OK).
Attached stackdump.
Not all 64 bit issues are Cygwin's fault. It's not clear in your
case, but it is pretty clear that this happens in a DLL other than
the Cygwin DLL. Look at the stack trace:
Stack trace:
Frame Function Args
0000022A630 00488F23380 (004CCD00004, 0060008C820, 00600082970, 0000022CCF0)
00600012500 00488F1C48C (00000000000, 0000022AB10, 00180134344, 00000000000)
0000022A900 0010040234B (0000022AB10, 00000000000, 00000000000, 0000022AB80)
0000022AAC0 001004010F3 (0000022AB10, 00000000000, 00000000030, 30001000000FF00)
0000022AB80 001800478A7 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000 0018004576B (00000000000, 00000000000, 00000000000, 00000000000)
00000000000 0018004592F (00000000000, 00000000000, 00000000000, 00000000000)
00000000000 00100407CB1 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000 00100401010 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000 0007710652D (00000000000, 00000000000, 00000000000, 00077189300)
00000000000 0007733C521 (00000000000, 00000000000, 00000000000, 00077189300)
End of stack trace
The function addresses starting with 0x4 are distro DLLs. So the crash
occurs in a DLL at address 0x4:88F23380. The easiest way to find out
which DLL is the culprit is this: call `rebase -i /bin/*.dll' and see
which DLL covers the 0x4:88Fxxxxx addresses.
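If the list of DLLs is long, the lookup can be scripted. The following is only a rough sketch: it assumes each line of the `rebase -i` output has the form "<path> base 0x... size 0x...", and the DLL names and values in the sample are made up for illustration, not taken from a real installation. The helper name find_dll is likewise hypothetical.

```python
import re

# Assumed line format of `rebase -i` output: "<path> base 0x<hex> size 0x<hex>".
# Lines that do not match this pattern are skipped.
LINE_RE = re.compile(
    r"^(?P<path>\S+)\s+base\s+(?P<base>0x[0-9a-fA-F]+)"
    r"\s+size\s+(?P<size>0x[0-9a-fA-F]+)"
)

def find_dll(rebase_output, address):
    """Return the path of the DLL whose [base, base + size) range covers address."""
    for line in rebase_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        base = int(match.group("base"), 16)
        size = int(match.group("size"), 16)
        if base <= address < base + size:
            return match.group("path")
    return None

# Hypothetical sample output; real names and addresses will differ.
sample = """\
/usr/bin/cygexample-1.dll base 0x000488f00000 size 0x00050000
/usr/bin/cygother-1.dll   base 0x0004ccd00000 size 0x00010000
"""

# 0x4:88F23380 from the stack trace, written as one 64-bit value.
print(find_dll(sample, 0x488F23380))
```

On a real system the text would come from saving the output of `rebase -i /bin/*.dll' to a file and passing its contents as rebase_output.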
It is one of the Open MPI DLLs.
The next addresses on the stack are application addresses (0x1:0xxxxxxx).
The Cygwin DLL itself only shows up in the 5th frame (0x1:8xxxxxxx).
This is apparently the DLL entry point which ultimately calls the
application's main function.
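The same reading of the trace can be written down as a small rule of thumb. This is a sketch only: it encodes just the three address patterns named in this mail (0x4:..., 0x1:0..., 0x1:8...), not the full memory layout, and the function name classify and the "unclassified" label are invented for the example.

```python
def classify(address):
    """Classify a 64-bit return address using the patterns described in this mail."""
    high = address >> 32             # part before the ':' in 0x4:88F23380
    low = address & 0xFFFFFFFF       # part after the ':'
    top_nibble = low >> 28           # first hex digit of the low 32 bits
    if high == 0x4:
        return "distro DLL"
    if high == 0x1 and top_nibble == 0x0:
        return "application"
    if high == 0x1 and top_nibble == 0x8:
        return "Cygwin DLL"
    # Anything else is outside the patterns mentioned here.
    return "unclassified"

# Function addresses from the stack trace, top frame first.
frames = [0x488F23380, 0x488F1C48C, 0x10040234B, 0x1004010F3, 0x1800478A7]
for addr in frames:
    print(f"{addr:011X} {classify(addr)}")
```

Applied to the first frames of the trace, this marks the two topmost as distro DLL addresses, the next two as application addresses, and 001800478A7 as the first address inside the Cygwin DLL.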
Unfortunately, GDB freezes before hitting breakpoints in the portion of
the code where the segfault is supposed to be.
So something is fishy, and GDB seems unable to catch it.
Did you see my mail describing the memory layout on the
cygwin-developers list? This is a bit helpful I hope:
http://cygwin.com/ml/cygwin-developers/2013-02/msg00027.html
Corinna
Thanks
Marco