Re: [OMPI devel] How to debug segv

2012-04-25 Thread George Bosilca
On Apr 25, 2012, at 13:59 , Alex Margolin wrote: > I guess you are right. > > I started looking into the communication passing between processes and I may > have found a problem with the way I handle "reserved" data requested at > prepare_src()... I've tried to write pretty much the same as TC

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Shamis, Pavel
Alex, +1 vote for core. It is a good starting point. * If you can't (for some reason) generate the core file, you may drop a while (1) somewhere in the init code and attach gdb later. * If you are looking for a more user-friendly experience, you may try Allinea DDT (they have a 30-day trial version)
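
A minimal sketch of the "drop a while (1) and attach later" idea from the message above, in case it helps. The helper name wait_for_debugger, the hold flag, and the max_polls cap are illustrative assumptions and not part of Open MPI; the cap exists only so a forgotten hook cannot hang a job forever.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not an Open MPI function).
 * Prints the pid, then polls once per second while *hold is non-zero.
 * A debugger attached to the process can clear the flag to let the
 * process continue. Gives up after max_polls polls.
 * Returns the number of polls spent waiting. */
int wait_for_debugger(volatile int *hold, int max_polls)
{
    int polls = 0;

    fprintf(stderr, "pid %ld waiting for debugger\n", (long)getpid());
    while (*hold && polls < max_polls) {
        sleep(1);   /* avoid burning a core while waiting */
        polls++;
    }
    return polls;
}
```

One possible use: call it early in the module's init path with a flag set to 1, attach with gdb -p using the printed pid, select the wait_for_debugger frame, run set var *hold = 0, and then continue to the crash.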

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Alex Margolin
I guess you are right. I started looking into the communication passing between processes and I may have found a problem with the way I handle "reserved" data requested at prepare_src()... I've tried to write pretty much the same as TCP (the relevant code is around "if(opal_convertor_need_buff

Re: [OMPI devel] libevent socket code

2012-04-25 Thread Nathan Hjelm
Let me take a look. The code in question is in evutil.c and bufferevent_sock.c . If there is no option we might be able to get away with just removing these files from the Makefile.am. -Nathan On Wed, 25 Apr 2012, Jeff Squyres wrote: On Apr 25, 2012, at 12:50 PM, Ralph Castain wrote: Can't

Re: [OMPI devel] libevent socket code

2012-04-25 Thread Jeff Squyres
On Apr 25, 2012, at 12:50 PM, Ralph Castain wrote: > Can't it be done with configuring --without-libevent-sockets or some such > thing? I really hate munging the code directly as it creates lots of support > issues and makes it harder to upgrade. If there's a libevent configure option we should

Re: [OMPI devel] libevent socket code

2012-04-25 Thread Ralph Castain
Can't it be done with configuring --without-libevent-sockets or some such thing? I really hate munging the code directly as it creates lots of support issues and makes it harder to upgrade. On Apr 25, 2012, at 10:45 AM, Nathan Hjelm wrote: > Anyone object if I #if 0 out all the socket code in

[OMPI devel] libevent socket code

2012-04-25 Thread Nathan Hjelm
Anyone object if I #if 0 out all the socket code in libevent? We see lots of static compilation warnings because of that code and nothing in openmpi uses it. -Nathan
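
For readers unfamiliar with the two approaches discussed in this thread, here is a rough sketch of compiling code out behind a build-time macro instead of a hard-coded #if 0. The macro HYPOTHETICAL_ENABLE_SOCKETS and the function socket_support are made up for illustration; they are not libevent or Open MPI names, and whether libevent offers a comparable configure option is exactly the open question in the thread.

```c
/* HYPOTHETICAL_ENABLE_SOCKETS is an illustrative macro that a build
 * system could define (e.g. -DHYPOTHETICAL_ENABLE_SOCKETS=1).
 * It defaults to 0, which has the same effect as "#if 0" but leaves
 * the source file unmodified. */
#ifndef HYPOTHETICAL_ENABLE_SOCKETS
#define HYPOTHETICAL_ENABLE_SOCKETS 0
#endif

#if HYPOTHETICAL_ENABLE_SOCKETS
/* Stand-in for the socket code that would normally be compiled. */
int socket_support(void) { return 1; }
#else
/* Socket code compiled out: nothing here for the analyzer to warn about. */
int socket_support(void) { return 0; }
#endif
```

The design point is the one raised in the replies: a switch controlled from the build keeps the imported sources untouched, which should make later upgrades easier than editing the code directly.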

[OMPI devel] 1.6rc1 has been released

2012-04-25 Thread Jeff Squyres
Note that Open MPI 1.6 is the evolution of the 1.5 series -- it is not a new branch from the SVN trunk. Hence, 1.6 is essentially a bunch of bug fixes on top of 1.5.5. Please test: http://www.open-mpi.org/software/ompi/v1.6/ (note that the 1.6 page is not linked to from anywhere on the OM

[OMPI devel] Fwd: GNU autoconf 2.69 released [stable]

2012-04-25 Thread Jeffrey Squyres
There are a number of new Autoconf macros that would be useful for OMPI's Fortran configury. Meaning: we have klugearounds in our existing configury, but the new AC 2.69 macros are Better. How would people feel about upgrading the autoconf requirement on the trunk to AC 2.69? (Terry: please a

Re: [OMPI devel] How to debug segv

2012-04-25 Thread George Bosilca
Alex, You got the banner of the FT benchmark, so I guess at least the rank 0 successfully completed the MPI_Init call. This is a hint that you should investigate more into the point-to-point logic of your mosix BTL. george. On Apr 25, 2012, at 09:30 , Alex Margolin wrote: > NAS Parallel Ben

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Jeffrey Squyres
Another thing to try is to load up the core file in gdb and see if that gives you a valid stack trace of where exactly the segv occurred. On Apr 25, 2012, at 9:30 AM, Alex Margolin wrote: > On 04/25/2012 02:57 PM, Ralph Castain wrote: >> Strange that your code didn't generate any symbols - is t

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Alex Margolin
On 04/25/2012 02:57 PM, Ralph Castain wrote: Strange that your code didn't generate any symbols - is that a mosix thing? Have you tried just adding opal_output (so it goes to a special diagnostic output channel) statements in your code to see where the segfault is occurring? It looks like you

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Ralph Castain
Strange that your code didn't generate any symbols - is that a mosix thing? Have you tried just adding opal_output (so it goes to a special diagnostic output channel) statements in your code to see where the segfault is occurring? It looks like you are getting thru orte_init. You could add -mca

[OMPI devel] How to debug segv

2012-04-25 Thread Alex Margolin
Hi, I'm getting a segv error off my build of the trunk. I know that my BTL module is responsible ("-mca btl self,tcp" works, "-mca btl self,mosix" fails). Smaller/simpler test applications pass, NPB doesn't. Can anyone suggest how to proceed with debugging this? my attempts include some debug