**********************
**********************
My question: how can I fix this bug, avoid this bug, or get a new
release of libapreq2 that solves this problem? Thank you in advance.
Details follow. We're willing to add new debugging to our code to
help out in debugging this. Let me know if I can provide more info.
**********************
**********************
The problem:
With apache 2.2.8 with worker_mpm on linux 64 bit AMD (more OS details
dumped below):
libapreq2, checked out of subversion on May 6, 2008, and dating back
to at least a version from early 2007, has a consistent problem. The
problem is that occasionally, the following code will cause apache to
segfault:
_________
static int do_post(request_rec *r)
{
apreq_handle_t *handle;
const apreq_module_t *module;
apreq_param_t *filebody, *testfilebody;
char *loopfilename, *myip, *disk, *tmpfile, *newfilename,
*extensionfilename, *lastdot;
EVP_MD_CTX mdctx;
const EVP_MD *md;
unsigned char md_value[EVP_MAX_MD_SIZE];
char *localorigfilename_raw, *localorigfilename;
char null = '\0';
apr_file_t *f = NULL;
unsigned int md_len, copies, filesize = 0, any_processed = 0;
apr_size_t bytes;
int i, lresult = 0, isactivex = 0, isflash = 0, tmplen = 0;
modbittybe_config *s_cfg =
ap_get_module_config(r->server->module_config, &bittybe_module);
myip = s_cfg->myip;
copies = s_cfg->copies;
connect_to_fsdb_t(&m, s_cfg->dbip, s_cfg->dbport);
handle = (apreq_handle_t *)apreq_handle_apache2(r);
fprintf(stderr, "++++++++apache handle (3: %#lx)++++++++++\n", handle);
fflush(stderr);
module = handle->module;
_____________
At the last line, apache segfaults approximately .05% of the time (3
times a day or so on a thousand plus POST uploads). You can download
the core dump here: http://abra.ms/apreq/core.5174.bz2. Here is what
the core dump shows:
-------------- GDB SESSION ------------
[EMAIL PROTECTED]:~$ gdb /usr/local/apache20/bin/httpd
/tmp/apache2-gdb-dump/core.5174
.. symbol loading removed ...
(gdb) thread apply all bt full
.. many threads ignored -- failing thread shows ..
Thread 7 (process 5201):
#0 0x00002ada07239409 in kill () from /lib/libc.so.6
No symbol table info available.
#1 <signal handler called>
No symbol table info available.
#2 0x00002ada09aab7ea in do_post_put (r=0x2aaaaac089a8) at mod_bittybe.c:838
retval = <value optimized out>
ua = 0x2aaaaac0a1ac "Shockwave Flash"
lenp = <value optimized out>
clen = 3679200
#3 0x00000000004348c9 in ap_run_handler (r=0x2aaaaac089a8) at config.c:157
n = 9
rv = 121909408
#4 0x00000000004379ac in ap_invoke_handler (r=0x2aaaaac089a8) at config.c:372
handler = 0x5ce950 "?VE"
result = 0
old_handler = 0x62f298 "bittyfs-be"
ignore = <value optimized out>
#5 0x0000000000458318 in ap_process_request (r=0x2aaaaac089a8) at
http_request.c:258
access_status = -1430251192
#6 0x00000000004557bc in ap_process_http_connection (c=0x6cc080) at
http_core.c:190
r = (request_rec *) 0x2aaaaac089a8
csd = (apr_socket_t *) 0x0
#7 0x000000000043b561 in ap_run_process_connection (c=0x6cc080) at
connection.c:43
n = 0
rv = 121909408
#8 0x000000000046e5c6 in worker_thread (thd=0x637800, dummy=<value
optimized out>) at worker.c:544
process_slot = 3
thread_slot = 25
csd = (apr_socket_t *) 0x6cbe68
bucket_alloc = (apr_bucket_alloc_t *) 0x2aaaaac0a948
last_ptrans = <value optimized out>
ptrans = (apr_pool_t *) 0x6cbde8
rv = <value optimized out>
is_idle = <value optimized out>
#9 0x00002ada06ff6f1a in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#10 0x00002ada072d35d2 in clone () from /lib/libc.so.6
No symbol table info available.
#11 0x0000000000000000 in ?? ()
No symbol table info available.
## line 838 is the last line from above: module = handle->module;
.. more threads we don't care about ..
(gdb) thread 7
[Switching to thread 7 (process 5201)]#0 0x00002ada07239409 in kill
() from /lib/libc.so.6
(gdb) where
#0 0x00002ada07239409 in kill () from /lib/libc.so.6
#1 <signal handler called>
#2 0x00002ada09aab7ea in do_post_put (r=0x2aaaaac089a8) at mod_bittybe.c:838
#3 0x00000000004348c9 in ap_run_handler (r=0x2aaaaac089a8) at config.c:157
#4 0x00000000004379ac in ap_invoke_handler (r=0x2aaaaac089a8) at config.c:372
#5 0x0000000000458318 in ap_process_request (r=0x2aaaaac089a8) at
http_request.c:258
#6 0x00000000004557bc in ap_process_http_connection (c=0x6cc080) at
http_core.c:190
#7 0x000000000043b561 in ap_run_process_connection (c=0x6cc080) at
connection.c:43
#8 0x000000000046e5c6 in worker_thread (thd=0x637800, dummy=<value
optimized out>) at worker.c:544
#9 0x00002ada06ff6f1a in start_thread () from /lib/libpthread.so.0
#10 0x00002ada072d35d2 in clone () from /lib/libc.so.6
#11 0x0000000000000000 in ?? ()
## line 838 is the last line from above: module = handle->module;
(gdb) up
#1 <signal handler called>
(gdb) up
#2 0x00002ada09aab7ea in do_post_put (r=0x2aaaaac089a8) at mod_bittybe.c:838
838 module = handle->module;
(gdb) info registers
rax 0xffffffffaac01948 -1430251192
rbx 0x3823e0 3679200
rcx 0x0 0
rdx 0x2ada074430a0 47115913146528
rsi 0x0 0
rdi 0x0 0
rbp 0x619170 0x619170
rsp 0x4d819eb0 0x4d819eb0
r8 0xffffffff 4294967295
r9 0xffffffff 4294967295
r10 0x2aaaaac01988 46912497523080
r11 0x202 514
r12 0x62f298 6484632
r13 0x2aaaaac09fc8 46912497557448
r14 0x2aaaaac089a8 46912497551784
r15 0x2aaaaac0a1ac 46912497557932
rip 0x2ada09aab7ea 0x2ada09aab7ea <do_post_put+762>
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
-------------- GDB SESSION END ------------
Note that RAX has a weird negative number. That turns out to be the
memory location listed for handle. Note the fprintf and fflush
debugging lines just before the problem? Here are the relevant log
lines from error.log:
___ error.log ___
{UPLOAD-POST-(5174, 1300343136) (POST)
(/save/DEFAULT)(appserver=www.tabblo.com&djsession=upload.665586)}
++++++++connect to mysql in do_post (1)++++++++++
++++++++connected! (2)++++++++++
++++++++apache handle (3: 0xffffffffaac01948)++++++++++
[Tue May 06 23:42:26 2008] [notice] child pid 5174 exit signal
Segmentation fault (11), possible coredump in /tmp/apache2-gdb-dump
______________
Normally, handle has a fairly low memory address, e.g.
__error.log__
{UPLOAD-POST-(5207, 1208023392) (POST) (/save/DEFAULT)}
++++++++connect to mysql in do_post (1)++++++++++
++++++++connected! (2)++++++++++
++++++++apache handle (3: 0x6bcb88)++++++++++
__________
The segfault appears to be due to handle having an illegal-to-read
memory address by the time its module member is referenced. I do not
appear to be doing anything untoward with respect to initializing
apreq, and so my current, defeasible, assumption is that the bug is
somewhere in the apreq library.
**********************
**********************
Some data:
$ uname -a
Linux g1t0241 2.6.18-6-amd64 #1 SMP Sun Feb 10 17:50:19 UTC 2008
x86_64 GNU/Linux
$ free
total used free shared buffers cached
Mem: 8179412 8119592 59820 0 1741940 1647724
-/+ buffers/cache: 4729928 3449484
Swap: 15976600 212 15976388
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/localhost-root
30512072 13434752 15527396 47% /
tmpfs 4089704 0 4089704 0% /lib/init/rw
udev 10240 192 10048 2% /dev
tmpfs 4089704 16 4089688 1% /dev/shm
/dev/cciss/c0d0p1 93307 13783 74707 16% /boot
/dev/mapper/localhost-home
8063408 4442740 3211068 59% /home
/dev/mapper/localhost-opt
8063408 380668 7273140 5% /opt
/dev/mapper/localhost-tmp
4031680 440880 3386000 12% /tmp
/dev/mapper/localhost-var
4031680 533076 3293804 14% /var
/dev/mapper/disk1 2112755696 740531332 1264902612 37% /mnt/disks/disk1
/dev/mapper/disk2 2112755696 779444608 1333311088 37% /mnt/disks/disk2
/dev/mapper/disk3 2112755696 778142276 1334613420 37% /mnt/disks/disk3
/dev/mapper/disk4 2112755696 782172644 1330583052 38% /mnt/disks/disk4