Hi there people,

I have inherited a web server running mod_perl and I am experiencing a
somewhat critical problem: http processes sometimes get into an infinite
loop, using 100% cpu time, and given enough time bring the machine to a
halt.

I've done a lot of testing, and there isn't a specific http request that
triggers the behaviour, eventhough it always happens after a request. It
seems to happen every few hours: the httpd process simply starts hogging
up the CPU, and won't let go of it. After a while, I have a few of these
processes running, and the machine's load average skyrockets. Sometimes
it's bad enough I'm not even able to log in via console.

I'll upgrade all the software to new versions, but apparently this
problem has been ocurring for a while, and survived a couple of
hardware/software upgrades. I'll also be rewriting the perl code running
there to see if it stops the problem (the code isn't too clean - lots of
global variables, not written under strict, etc, but "it works").
However, it would be cool if someone could enlighten me on what's going
on, and possibly suggest a fix :).

Thanks a lot!

signed,
gustavo

<begin debugging info>

=> our OS is:

[root@blueland /root]# uname -a
Linux blueland 2.2.14-5.0 #4 Wed Apr 12 20:28:28 MDT 2000 i586 unknown

=> Apache:

Server Version: Apache/1.3.6 (Unix) mod_perl/1.19 mod_ssl/2.2.8
OpenSSL/0.9.2b

=> let's look into one of the monster processes:

497 ?        R    288:06 /usr/local/apache_1.3.6/bin/httpd

=> (nice cpu time there...)
=> now for gdb...

[root@blueland /root]# gdb /usr/local/apache/bin/httpd 497
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.

Attaching to program: /usr/local/apache/bin/httpd, Pid 497
Reading symbols from /lib/libNoVersion.so.1...done.
Reading symbols from /lib/libm.so.6...done.
Reading symbols from /lib/libcrypt.so.1...done.
Reading symbols from
/usr/lib/perl5/5.00502/i586-linux/CORE/libperl.so...done.
Reading symbols from /lib/libnsl.so.1...done.
Reading symbols from /lib/libdl.so.2...done.
Reading symbols from /lib/libc.so.6...done.
Reading symbols from /lib/ld-linux.so.2...done.
Reading symbols from /lib/libnss_files.so.2...done.
Reading symbols from
/usr/lib/perl5/site_perl/5.005/i586-linux/auto/Sybase/DBlib/DBlib.none...done.

Reading symbols from /opt/sybase/lib/libsybdb.so...done.
Reading symbols from /opt/sybase/lib/libinsck.so...done.
Reading symbols from /lib/libnss_nisplus.so.2...done.
Reading symbols from /lib/libnss_nis.so.2...done.
0x80a7407 in free_blocks ()

=> let's see what the stack tells us...

(gdb) i s
#0  0x80a7407 in free_blocks ()
#1  0x80a7660 in ap_clear_pool ()
#2  0x80a76a1 in ap_destroy_pool ()
#3  0x80be71a in ap_destroy_sub_req ()
#4  0x8070d8a in XS_Apache__SubRequest_DESTROY ()
#5  0x400d0f45 in Perl_pp_entersub () at pp_hot.c:1965
#6  0x4007aa8d in perl_call_sv () at perl.c:1008
#7  0x400d9d4c in Perl_sv_clear () at sv.c:2418
#8  0x400da451 in Perl_sv_free () at sv.c:2418
#9  0x400d24ab in do_clean_objs (sv=0x8385744) at sv.c:338
#10 0x400d237c in visit (f=0x400d2410 <do_clean_objs>) at sv.c:306
#11 0x400d263f in Perl_sv_clean_objs () at sv.c:359
#12 0x40077fa5 in perl_destruct () at perl.c:1008
#13 0x80635a3 in perl_shutdown ()
#14 0x80645c5 in perl_child_exit ()
#15 0x8064430 in perl_child_exit_cleanup ()
#16 0x80a8dd6 in run_cleanups ()
#17 0x80a762c in ap_clear_pool ()
#18 0x80a76a1 in ap_destroy_pool ()
#19 0x80b3b03 in clean_child_exit ()
#20 0x80b6773 in child_main ()
#21 0x80b6e14 in make_child ()
#22 0x80b716e in perform_idle_server_maintenance ()
#23 0x80b7665 in standalone_main ()
#24 0x80b7cd3 in main ()

=> registers have:

(gdb) i r
eax            0x8a3e408        144958472
ecx            0x8608414        140542996
edx            0x8a3e408        144958472
ebx            0x8239208        136548872
esp            0xbffff90c       0xbffff90c
ebp            0xbffff910       0xbffff910
esi            0x1      1
edi            0xffffffff       -1
eip            0x80a7407        0x80a7407
eflags         0x202    514
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x0      0
gs             0x0      0

=> this is a dump of the running function

(gdb) disas
Dump of assembler code for function free_blocks:
0x80a73e0 <free_blocks>:        push   %ebp
0x80a73e1 <free_blocks+1>:      mov    %esp,%ebp
0x80a73e3 <free_blocks+3>:      sub    $0x4,%esp
0x80a73e6 <free_blocks+6>:      cmpl   $0x0,0x8(%ebp)
0x80a73ea <free_blocks+10>:     jne    0x80a73f0 <free_blocks+16>
0x80a73ec <free_blocks+12>:     jmp    0x80a7439 <free_blocks+89>
0x80a73ee <free_blocks+14>:     mov    %esi,%esi
0x80a73f0 <free_blocks+16>:     mov    0x8146b88,%eax
0x80a73f5 <free_blocks+21>:     mov    %eax,0xfffffffc(%ebp)
0x80a73f8 <free_blocks+24>:     mov    0x8(%ebp),%eax
0x80a73fb <free_blocks+27>:     mov    %eax,0x8146b88
0x80a7400 <free_blocks+32>:     mov    0x8(%ebp),%eax
0x80a7403 <free_blocks+35>:     cmpl   $0x0,0x4(%eax)
0x80a7407 <free_blocks+39>:     jne    0x80a740c <free_blocks+44>
0x80a7409 <free_blocks+41>:     jmp    0x80a7424 <free_blocks+68>
0x80a740b <free_blocks+43>:     nop
0x80a740c <free_blocks+44>:     mov    0x8(%ebp),%eax
0x80a740f <free_blocks+47>:     mov    0x8(%ebp),%ecx
0x80a7412 <free_blocks+50>:     add    $0xc,%ecx
0x80a7415 <free_blocks+53>:     mov    %ecx,0x8(%eax)
0x80a7418 <free_blocks+56>:     mov    0x8(%ebp),%eax
0x80a741b <free_blocks+59>:     mov    0x4(%eax),%edx
0x80a741e <free_blocks+62>:     mov    %edx,0x8(%ebp)
0x80a7421 <free_blocks+65>:     jmp    0x80a7400 <free_blocks+32>
0x80a7423 <free_blocks+67>:     nop
0x80a7424 <free_blocks+68>:     mov    0x8(%ebp),%eax
0x80a7427 <free_blocks+71>:     mov    0x8(%ebp),%ecx
0x80a742a <free_blocks+74>:     add    $0xc,%ecx
0x80a742d <free_blocks+77>:     mov    %ecx,0x8(%eax)
0x80a7430 <free_blocks+80>:     mov    0x8(%ebp),%eax
0x80a7433 <free_blocks+83>:     mov    0xfffffffc(%ebp),%edx
0x80a7436 <free_blocks+86>:     mov    %edx,0x4(%eax)
0x80a7439 <free_blocks+89>:     leave
0x80a743a <free_blocks+90>:     ret

=> strace doesn't show any system calls going on, which makes sense,
since we don't ever leave free_blocks()

.EOF.

Reply via email to