Re: [OMPI devel] How to debug segv

2012-04-25 Thread George Bosilca
On Apr 25, 2012, at 13:59, Alex Margolin wrote:
> I guess you are right.
>
> I started looking into the communication passing between processes and I may
> have found a problem with the way I handle "reserved" data requested at
> prepare_src()... I've tried to write pretty much the same as TC

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Shamis, Pavel
Alex,

+1 vote for core. It is a good starting point.

* If you can't (for some reason) generate the core file, you may drop a while (1) loop somewhere in the init code and attach gdb later.
* If you are looking for a more user-friendly experience, you may try Allinea DDT (they have a 30-day trial version)
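A minimal sketch of the "while (1) and attach later" idea, for illustration only. The names `debugger_attached` and `wait_for_debugger` are hypothetical and not part of Open MPI; the timeout parameter is an added assumption so the process cannot hang forever if nobody attaches.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical flag: set it to a nonzero value from the debugger to
 * release the process. volatile keeps the compiler from caching it. */
volatile int debugger_attached = 0;

/* Print the PID, then spin until a debugger sets debugger_attached or
 * until max_seconds have elapsed (a negative value waits forever).
 * Returns the number of seconds spent waiting. */
int wait_for_debugger(int max_seconds)
{
    int waited = 0;

    fprintf(stderr, "PID %d waiting for debugger\n", (int)getpid());
    fflush(stderr);

    while (!debugger_attached && (max_seconds < 0 || waited < max_seconds)) {
        sleep(1);
        waited++;
    }
    return waited;
}
```

One way to use it would be to call `wait_for_debugger(-1)` early in the init path of the code under test, attach with `gdb -p <pid>` using the printed PID, run `set var debugger_attached = 1`, and then `continue` until the segv is hit.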

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Alex Margolin
I guess you are right. I started looking into the communication passing between processes and I may have found a problem with the way I handle "reserved" data requested at prepare_src()... I've tried to write pretty much the same as TCP (the relevant code is around "if(opal_convertor_need_buff

Re: [OMPI devel] How to debug segv

2012-04-25 Thread George Bosilca
Alex,

You got the banner of the FT benchmark, so I guess at least the rank 0 successfully completed the MPI_Init call. This is a hint that you should investigate more into the point-to-point logic of your mosix BTL.

george.

On Apr 25, 2012, at 09:30, Alex Margolin wrote:
> NAS Parallel Ben

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Jeffrey Squyres
Another thing to try is to load up the core file in gdb and see if that gives you a valid stack trace of where exactly the segv occurred.

On Apr 25, 2012, at 9:30 AM, Alex Margolin wrote:
> On 04/25/2012 02:57 PM, Ralph Castain wrote:
>> Strange that your code didn't generate any symbols - is t
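A core file is only written if the process's core-size limit allows it. As a rough sketch (the helper name `enable_core_dumps` is hypothetical, and this is not something the thread itself prescribes), a test program could raise its own soft limit to the hard limit with the POSIX `getrlimit`/`setrlimit` calls before the crash happens:

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit so the kernel is allowed to write a core file on a segv.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return -1;
    }
    lim.rlim_cur = lim.rlim_max;  /* soft limit may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &lim);
}
```

If a core file is produced, it can be loaded with `gdb <executable> <corefile>` and inspected with `bt`. Where the core file is written and how it is named depends on the system configuration, and a hard limit of zero would still prevent it from being created.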

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Alex Margolin
On 04/25/2012 02:57 PM, Ralph Castain wrote: Strange that your code didn't generate any symbols - is that a mosix thing? Have you tried just adding opal_output (so it goes to a special diagnostic output channel) statements in your code to see where the segfault is occurring? It looks like you

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Ralph Castain
Strange that your code didn't generate any symbols - is that a mosix thing? Have you tried just adding opal_output (so it goes to a special diagnostic output channel) statements in your code to see where the segfault is occurring? It looks like you are getting thru orte_init. You could add -mca

[OMPI devel] How to debug segv

2012-04-25 Thread Alex Margolin
Hi, I'm getting a segv error off my build of the trunk. I know that my BTL module is responsible ("-mca btl self,tcp" works, "-mca btl self,mosix" fails). Smaller/simpler test applications pass, NPB doesn't. Can anyone suggest how to proceed with debugging this? My attempts include some debug