[BUG] Memory Corruption (was: RE: [Q] SIGSEGV After fork())
Dear mod_perl experts:

> Collectively, we've been at this for more than two weeks and have
> searched various mod_perl archives, all to no avail.
>
> Symptom:
> ===
> SIGSEGV after fork(). Very reproducible. Memory corruption gets
> moved around if the codebase changes.
>
> [ SNIP ]

The above is the key: moved around. Therefore, I need Purify or
similar tool. I'm going to have to go this route, since nobody has any
ideas. Go go gadget purchasing! :(

The only other way I can think of to solve this is to send my module
list to this audience. Please find it, attached, with home-grown
modules deleted.

More info:

In speaking with Ged (who is very knowledgeable, thanks!), I was led
down a path that caused my server to start (setting
PERL_DESTRUCT_LEVEL to 0), but it doesn't solve the memory corruption
that perl_destruct ends up stumbling on, only hides it. For some
reason, in my case, the address of the PL_sv_undef symbol ends up
being the target of my Perl_safesysfree, below (the xpv_pv address
was, for some reason, 0x4046cc18, and that is the address of the
PL_sv_undef symbol).

Stack Trace:
===
#0  __pthread_mutex_lock (mutex=0x8bf04999) at mutex.c:99
#1  0x401b9cc8 in __libc_free (mem=0x4046cc18) at malloc.c:3152
#2  0x403ce028 in Perl_safesysfree (where=0x4046cc18) at util.c:158
#3  0x403f20d8 in Perl_sv_clear (sv=0x8198f60) at sv.c:3827
#4  0x403f2473 in Perl_sv_free (sv=0x8198f60) at sv.c:3950
#5  0x403f80e1 in do_clean_all (sv=0x8198f60) at sv.c:8411
#6  0x403e9c5e in S_visit (f=0x403f8094 <do_clean_all>) at sv.c:162
#7  0x403e9ce2 in Perl_sv_clean_all () at sv.c:193
#8  0x4038594a in perl_destruct (my_perl=0x809a9a8) at perl.c:665
#9  0x4035629c in perl_shutdown (s=0x0, p=0x0) at mod_perl.c:294
#10 0x40356be6 in mp_dso_unload (data=0x808e714) at mod_perl.c:489
#11 0x08050f34 in run_cleanups (c=0x809c8ac) at alloc.c:1713
#12 0x0804f5fa in ap_clear_pool (a=0x808e714) at alloc.c:538
#13 0x08062128 in standalone_main (argc=7, argv=0xb294) at http_main.c:5014
#14 0x08062cb2 in main (argc=7, argv=0xb294) at http_main.c:5401
#15 0x40155627 in __libc_start_main (main=0x80627d4 <main>, argc=7,
    ubp_av=0xb294, init=0x804e3e4 <_init>, fini=0x807aa40 <_fini>,
    rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xb28c)
    at ../sysdeps/generic/libc-start.c:129

***NOTE*** the following gdb session was gleaned from sv.c and refers
to the freed memory location (0x4046cc18) above:

(gdb) p *((XPV*)(sv)->sv_any)
$13 = {xpv_pv = 0x4046cc18, xpv_cur = 135562488, xpv_len = 135617180}

[ SNIP ]

--
\_/}  Mark P. Fister   Java, Java, everywhere, and all     \_/}
\_/}  eBay, Inc.       the cups did shrink; Java, Java     \_/}
\_/}  Austin, TX       everywhere, nor any drop to drink!  \_/}

Attachment: module_list_ulist.txt
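The "SIGSEGV after fork()" symptom can be observed from the parent's side by decoding the child's wait status. What follows is only a minimal sketch in plain Perl, outside Apache: child_status() is a hypothetical helper, and the child here deliberately sends itself SIGSEGV to stand in for the real crash. It shows how the low bits of $? carry the terminating signal and the core-dump flag.

```perl
use strict;
use warnings;
use Config;
use POSIX ();

# Map signal numbers to names using perl's build configuration (see perlipc).
my @sig_name = split ' ', $Config{sig_name};

# Fork a child, run $code in it, and report how the child ended.
# child_status() is a hypothetical helper, not part of Apache or mod_perl.
sub child_status {
    my ($code) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        $code->();
        POSIX::_exit(0);    # skip END blocks and destructors in the child
    }
    waitpid($pid, 0);
    my $status = $?;
    my $signum = $status & 127;    # low 7 bits: terminating signal, 0 if none
    return {
        signal => $signum ? $sig_name[$signum] : undef,
        core   => ($status & 128) ? 1 : 0,    # set when a core was dumped
        exit   => $status >> 8,
    };
}

# Simulate the crash: the child sends itself SIGSEGV.
my $status = child_status(sub { kill 'SEGV', $$ });
print "child died on SIG$status->{signal}\n" if defined $status->{signal};
```

This only reports that a child died on a signal; it says nothing about where the memory was corrupted, which is the harder question in this thread.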
Re: [BUG] Memory Corruption (was: RE: [Q] SIGSEGV After fork())
> The only other way I can think of to solve this is to send my module
> list to this audience. Please find it, attached, with home-grown
> modules deleted.

Have you tried debugging the old-fashioned way, i.e. remove things
until it works? That's your best bet. I suspect you will find that you
have some module doing something with XS or sockets or filehandles
that can't deal with being forked.

- Perrin
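One way to shorten the list of candidates for the remove-things-until-it-works approach is to ask the running interpreter which modules have actually loaded compiled (XS) code. The sketch below assumes it is run after all modules are loaded (under mod_perl, for example from a startup file or a handler). The helper names are hypothetical, and List::Util is loaded only so the demonstration prints something.

```perl
use strict;
use warnings;
use DynaLoader ();
use List::Util ();    # an XS-backed core module, loaded only for demonstration

# Names of modules whose compiled (XS) half has been loaded so far.
# DynaLoader records them in @dl_modules; XSLoader appends to the same array.
sub loaded_xs_modules {
    my @modules = sort @DynaLoader::dl_modules;
    return @modules;
}

# Paths of the shared objects that were loaded for those modules.
sub loaded_shared_objects {
    my @objects = sort @DynaLoader::dl_shared_objects;
    return @objects;
}

print "XS module: $_\n" for loaded_xs_modules();
print "object:    $_\n" for loaded_shared_objects();
```

Modules that never appear in this list are pure Perl and are less likely to corrupt memory on their own, so the XS ones are a reasonable place to start removing things.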
Re: [BUG] Memory Corruption (was: RE: [Q] SIGSEGV After fork())
At 11:44 AM -0600 2/15/02, Fister, Mark wrote:
> Dear mod_perl experts:
>
> > Collectively, we've been at this for more than two weeks and have
> > searched various mod_perl archives, all to no avail.
> >
> > Symptom:
> > ===
> > SIGSEGV after fork(). Very reproducible. Memory corruption gets
> > moved around if the codebase changes.
> >
> > [ SNIP ]
>
> The above is the key: moved around. Therefore, I need Purify or
> similar tool. I'm going to have to go this route, since nobody has
> any ideas. Go go gadget purchasing! :(

Are you running any XS stuff created with SWIG? I had a very similar
problem some time ago (RH 5.1, I think) with SWIG creating strange XS
files that corrupted memory when used under mod_perl... There was no
corruption when running as a perl script or cgi. I eventually scrapped
SWIG (a little too complicated for what I was doing) in favor of h2xs.
Just for fun, I tried using it with mod_perl and it worked perfectly.

After reviewing it with my father (who's a die-hard C guy), he found a
potential problem with

    $var = undef;
    call_xs_sub_to_populate($var);

which might hose undef. He suggested instead

    $var = "\0" x SIZE_OF_POPULATED_DATA;
    call_xs_sub_to_populate($var);

which I never actually tried. Since I'm not a C guy, I don't really
run into too many segfaults. Unfortunately I no longer have the code I
was testing this with or I'd give it another shot...

Hope that helps some...

Rob

--
When I used a Mac, they laughed because I had no command prompt.
When I used Linux, they laughed because I had no GUI.
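The preallocation idea can be sketched in runnable form, assuming the suggestion means a string of NUL bytes ("\0" x N) sized for the data the XS routine will write. The routine below is a pure-Perl stand-in, not real XS, and the size constant is a hypothetical value. It only shows the calling pattern: the caller owns a buffer of the right length and the callee overwrites it in place without growing it.

```perl
use strict;
use warnings;

use constant SIZE_OF_POPULATED_DATA => 16;    # hypothetical record size

# Pure-Perl stand-in for the XS routine (hypothetical). It overwrites the
# caller's existing buffer in place, the way a C function would write
# through a pointer, and never grows it.
sub call_xs_sub_to_populate {
    my $len = length $_[0];    # $_[0] aliases the caller's variable
    substr($_[0], 0, $len, 'x' x $len);
    return $len;
}

# Preallocate a string of NUL bytes so the callee has real storage to fill.
my $var = "\0" x SIZE_OF_POPULATED_DATA;
my $written = call_xs_sub_to_populate($var);

print "wrote $written bytes\n";
```

A real XS routine that writes through SvPV without checking the scalar's type or allocated length is where the hazard would come from; this stand-in cannot reproduce that, it only illustrates the safer call shape.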
Re: [BUG] Memory Corruption (was: RE: [Q] SIGSEGV After fork())
On Fri, Feb 15, 2002 at 11:44:03AM -0600, Fister, Mark wrote:
> Dear mod_perl experts:
>
> > Collectively, we've been at this for more than two weeks and have
> > searched various mod_perl archives, all to no avail.
> >
> > Symptom:
> > ===
> > SIGSEGV after fork(). Very reproducible. Memory corruption gets
> > moved around if the codebase changes.
> >
> > [ SNIP ]
>
> The above is the key: moved around. Therefore, I need Purify or
> similar tool. I'm going to have to go this route, since nobody has
> any ideas. Go go gadget purchasing! :(
>
> The only other way I can think of to solve this is to send my module
> list to this audience. Please find it, attached, with home-grown
> modules deleted.

To further diagnose this problem you might consider using the sigtrap
module and paying careful attention to your logs... This at least led
me to the portion of my perl that was causing the problem. A simple

    use sigtrap;

is enough. The default signal handler used in this module gives you a
stack trace before the core dump.

--
Paul Lindner [EMAIL PROTECTED]

mod_perl Developer's Cookbook   http://www.modperlcookbook.org/
Human Rights Declaration        http://www.unhchr.ch/udhr/index.htm
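For readers who have not used sigtrap, here is a minimal sketch of how it installs a handler. It assumes a harmless signal (USR1) and a hypothetical handler, on_signal(), so it can be run safely; the bare "use sigtrap;" form recommended above instead installs the module's own stack-trace handler for its default signal list.

```perl
use strict;
use warnings;

my @caught;

# Record the signal name instead of dumping a stack trace.
# on_signal() is a hypothetical handler name.
sub on_signal {
    my ($signame) = @_;
    push @caught, $signame;
}

# Install on_signal() for USR1 only.
use sigtrap 'handler' => \&on_signal, 'USR1';

kill 'USR1', $$;    # send the signal to ourselves
print "caught: @caught\n";
```

A handler written in Perl still runs inside the same interpreter, so if the interpreter's memory is already corrupted the handler may not be reliable.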
RE: [BUG] Memory Corruption (was: RE: [Q] SIGSEGV After fork())
> > The only other way I can think of to solve this is to send my
> > module list to this audience. Please find it, attached, with
> > home-grown modules deleted.
>
> Have you tried debugging the old-fashioned way, i.e. remove things
> until it works? That's your best bet. I suspect you will find that
> you have some module doing something with XS or sockets or
> filehandles that can't deal with being forked.

That's just the thing with memory corruption: Adding or removing
random code causes the SIGSEGV signature to change (or causes the
server to suddenly start working). I am nearly certain that the memory
corruption is happening BEFORE the fork, anyway, which is why I
modified the subject line of this thread.

Thank you VERY, VERY much for your ideas, though - I will keep looking
while I wait for the powers that be to get my bounds checking
software!

> - Perrin

--
\_/}  Mark P. Fister   Java, Java, everywhere, and all     \_/}
\_/}  eBay, Inc.       the cups did shrink; Java, Java     \_/}
\_/}  Austin, TX       everywhere, nor any drop to drink!  \_/}
RE: [BUG] Memory Corruption (was: RE: [Q] SIGSEGV After fork())
On Fri, Feb 15, 2002 at 12:17:07PM -0800, Paul Lindner wrote:
> On Fri, Feb 15, 2002 at 11:44:03AM -0600, Fister, Mark wrote:
> > Dear mod_perl experts:
> >
> > > Collectively, we've been at this for more than two weeks and
> > > have searched various mod_perl archives, all to no avail.
> > >
> > > Symptom:
> > > ===
> > > SIGSEGV after fork(). Very reproducible. Memory corruption gets
> > > moved around if the codebase changes.
> > >
> > > [ SNIP ]
> >
> > The above is the key: moved around. Therefore, I need Purify or
> > similar tool. I'm going to have to go this route, since nobody
> > has any ideas. Go go gadget purchasing! :(
> >
> > The only other way I can think of to solve this is to send my
> > module list to this audience. Please find it, attached, with
> > home-grown modules deleted.
>
> To further diagnose this problem you might consider using the
> sigtrap module and paying careful attention to your logs... This at
> least led me to the portion of my perl that was causing the problem.
> A simple
>
>     use sigtrap;
>
> The default signal handler used in this module gives you a stack
> trace before the core dump..

Unless use sigtrap; itself causes a SIGSEGV, which invokes the signal
handler, which causes a SIGSEGV, which invokes... and all of a sudden
your httpd process goes to many thousands of stack levels deep and
consumes 1GB of memory... ;)

--
\_/}  Mark P. Fister   Java, Java, everywhere, and all     \_/}
\_/}  eBay, Inc.       the cups did shrink; Java, Java     \_/}
\_/}  Austin, TX       everywhere, nor any drop to drink!  \_/}
RE: [Q] SIGSEGV After fork()
On Thu, Feb 07, 2002 at 01:03:29AM +, Ged Haywood wrote:
> Hi there,

Hi! Thank you SOOO much for the reply!

[SNIP]

> You might try usemymalloc.

Tried that. Note: you also tried to help a fellow back in November of
2001 on this VERY same stack trace.

http://groups.yahoo.com/group/modperl/message/39560

> > Compiler:
> > optimize='-g',
>
> H...

See below.

> > ccversion='', gccversion='2.96 2731 (Red Hat Linux 7.1 2.96-85)',
> > gccosandvers=''
>
> You've obviously read the docs, so I take it the same compiler built
> Apache, mod_perl and Perl. Have you tried this on RH6.2 with the
> compiler that came with that?

Yes. Note also: the problem didn't use to happen with perl 5.00404,
mod_perl 1.08 and apache 1.3b5 (with exactly the same codebase). See
below.

> > Stack Trace:
> > ===
> > #0  __pthread_mutex_lock (mutex=0x8bf04999) at mutex.c:99
> > #1  0x401b9cc8 in __libc_free (mem=0x4046cc18) at malloc.c:3152
> > #2  0x403ce028 in Perl_safesysfree (where=0x4046cc18) at util.c:158
> > #3  0x403f20d8 in Perl_sv_clear (sv=0x8198f60) at sv.c:3827
> > #4  0x403f2473 in Perl_sv_free (sv=0x8198f60) at sv.c:3950
> > #5  0x403f80e1 in do_clean_all (sv=0x8198f60) at sv.c:8411
> > #6  0x403e9c5e in S_visit (f=0x403f8094 <do_clean_all>) at sv.c:162
> > #7  0x403e9ce2 in Perl_sv_clean_all () at sv.c:193
> > #8  0x4038594a in perl_destruct (my_perl=0x809a9a8) at perl.c:665
> > #9  0x4035629c in perl_shutdown (s=0x0, p=0x0) at mod_perl.c:294
> > #10 0x40356be6 in mp_dso_unload (data=0x808e714) at mod_perl.c:489
>
> Have you tried a statically linked mod_perl?

Yes. See below.

> 73,
> Ged.

Here's what HAS been tried:

- -O2 vs. -O3 vs. -g
- Perl's malloc vs. system malloc
- static vs. dynamic loading of httpd modules
- Different Berkeley db in case there were discrepancies with that
- Different compilers, RedHat releases, glibc releases
- --enable-rule=EXPAT vs. --disable-rule=EXPAT

All failed.

--
\_/}  Mark P. Fister   Java, Java, everywhere, and all     \_/}
\_/}  eBay, Inc.       the cups did shrink; Java, Java     \_/}
\_/}  Austin, TX       everywhere, nor any drop to drink!  \_/}
RE: [Q] SIGSEGV After fork()
Hi there,

On Thu, 7 Feb 2002, Fister, Mark wrote:
> Tried that. Note: you also tried to help a fellow back in November
> of 2001 on this VERY same stack trace.
>
> http://groups.yahoo.com/group/modperl/message/39560

Heh, didn't get very far with Lynx on that URL... does anybody know
what happened to that one?

> > You've obviously read the docs, so I take it the same compiler
> > built Apache, mod_perl and Perl. Have you tried this on RH6.2
> > with the compiler that came with that?
>
> Yes. Note also: the problem didn't use to happen with perl 5.00404,
> mod_perl 1.08 and apache 1.3b5 (with exactly the same codebase).

5.00404 ?? 1.08 !?!... Ah. Now we're getting somewhere. Maybe.

Why not try Perl 5.7.2? I'm using it in development, did some pretty
heavy stuff with 5.7.0 and it was fine, then I ran into SIGSEGVs and
things trying to do some simple profiling with Devel::DProf on some
simple code (heavy data:) which went away when I installed 5.7.2.
(BTW thanks Stas!:)

73,
Ged.
RE: [Q] SIGSEGV After fork()
On Thu, Feb 07, 2002 at 09:35:18PM +, Ged Haywood wrote:
> Hi there,
>
> On Thu, 7 Feb 2002, Fister, Mark wrote:
> > Tried that. Note: you also tried to help a fellow back in
> > November of 2001 on this VERY same stack trace.
> >
> > http://groups.yahoo.com/group/modperl/message/39560
>
> Heh, didn't get very far with Lynx on that URL... does anybody know
> what happened to that one?
>
> > > You've obviously read the docs, so I take it the same compiler
> > > built Apache, mod_perl and Perl. Have you tried this on RH6.2
> > > with the compiler that came with that?
> >
> > Yes. Note also: the problem didn't use to happen with perl
> > 5.00404, mod_perl 1.08 and apache 1.3b5 (with exactly the same
> > codebase).
>
> 5.00404 ?? 1.08 !?!... Ah. Now we're getting somewhere. Maybe.
>
> Why not try Perl 5.7.2? I'm using it in development, did some
> pretty heavy stuff with 5.7.0 and it was fine, then I ran into
> SIGSEGVs and things trying to do some simple profiling with
> Devel::DProf on some simple code (heavy data:) which went away when
> I installed 5.7.2. (BTW thanks Stas!:)

Tried 5.7.2. I still have core dumps. This is why I decided to try the
mod_perl list instead of p5p. I'm definitely nearly in tears. :(

NOTE: some of our Apache-based servers have no problem with the same
Apache/Perl/mod_perl that we're running. However, others do... and
trying to do module list comparisons between the ones that do and
don't doesn't come up with anything definitive.

> 73,
> Ged.

--
\_/}  Mark P. Fister   Java, Java, everywhere, and all     \_/}
\_/}  eBay, Inc.       the cups did shrink; Java, Java     \_/}
\_/}  Austin, TX       everywhere, nor any drop to drink!  \_/}
Re: [Q] SIGSEGV After fork()
Hi there,

On Wed, 6 Feb 2002, Mark P. Fister wrote:
> Collectively, we've been at this for more than two weeks and have
> searched various mod_perl archives, all to no avail. :(
>
> SIGSEGV after fork(). Very reproducible. Memory corruption gets
> moved around if the codebase changes.
> [snip]
> Perl:
> [snip]
> config_args='-des -Doptimize=-g -Dusedevel -Uinstallusrbinperl
>   -Ubincompat5005 -Uusemymalloc -Dcc=gcc -pipe -g
>   -I/usr/include/db3 -Dcccdlflags=-fPIC
>   -Dinstallprefix=/home/mfister/vendor/perl-5.6.1
>   -Dprefix=/home/mfister/vendor/perl-5.6.1
>   [EMAIL PROTECTED]
>   -Dinstallman1dir=/home/mfister/vendor/perl-5.6.1/man/man1
>   -Dinstallman3dir=/home/mfister/vendor/perl-5.6.1/man/man3
>   -Dman1dir=/home/mfister/vendor/perl-5.6.1/man/man1
>   -Dman3dir=/home/mfister/vendor/perl-5.6.1/man/man3
>   -Dd_dosuid=n -Dd_semctl_semun -Di_db -Di_ndbm -Di_gdbm
>   -Di_shadow -Di_syslog -Dman3ext=3pm -Uuselargefiles'

You might try usemymalloc.

> Compiler:
> optimize='-g',

H...

> ccversion='', gccversion='2.96 2731 (Red Hat Linux 7.1 2.96-85)',
> gccosandvers=''

You've obviously read the docs, so I take it the same compiler built
Apache, mod_perl and Perl. Have you tried this on RH6.2 with the
compiler that came with that?

> Stack Trace:
> ===
> #0  __pthread_mutex_lock (mutex=0x8bf04999) at mutex.c:99
> #1  0x401b9cc8 in __libc_free (mem=0x4046cc18) at malloc.c:3152
> #2  0x403ce028 in Perl_safesysfree (where=0x4046cc18) at util.c:158
> #3  0x403f20d8 in Perl_sv_clear (sv=0x8198f60) at sv.c:3827
> #4  0x403f2473 in Perl_sv_free (sv=0x8198f60) at sv.c:3950
> #5  0x403f80e1 in do_clean_all (sv=0x8198f60) at sv.c:8411
> #6  0x403e9c5e in S_visit (f=0x403f8094 <do_clean_all>) at sv.c:162
> #7  0x403e9ce2 in Perl_sv_clean_all () at sv.c:193
> #8  0x4038594a in perl_destruct (my_perl=0x809a9a8) at perl.c:665
> #9  0x4035629c in perl_shutdown (s=0x0, p=0x0) at mod_perl.c:294
> #10 0x40356be6 in mp_dso_unload (data=0x808e714) at mod_perl.c:489

Have you tried a statically linked mod_perl?

73,
Ged.