Re: segfault in apr_bucket_delete
On Sun, 23 May 2004, Noah Misch wrote:

> Perhaps a ``configure'' option for bucket debugging is in order? For that
> matter, why not make it the default; most folks who build their own APU do so
> for a development project, so intuitive failure modes will trump efficiency in
> early use. I can roll a patch, if appropriate.

I feel like I went to implement that at some point and ran into trouble...
but at this point I don't remember what that would have been. Maybe
something to do with certain flags being shared between apr and apr-util.
Anyway if you want to do it, feel free. :)

--Cliff
Re: segfault in apr_bucket_delete
On Sat, May 22, 2004 at 04:21:02PM -0400, Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
>
> > I understand all that, but I guess I fail to pass the point across. It is
> > not a problem that I encounter in my code. On the contrary I'm writing
> > tests that exercise, both valid and invalid ways the API can be called.
> > API that hangs when called in invalid way is a problem. Don't you think?
> >
> > APR_BUCKET_INSERT_BEFORE(fb, db);
>
> The thing is, it would not be this macro that hangs. All this macro can
> do is segfault (if one of the pointers is null, meaning the brigade was
> previously corrupted), or do what it's supposed to do (though in doing so
> it could potentially corrupt some other brigade, which is what happens
> here -- if the bucket being inserted is still in a brigade, as db is, then
> that brigade will be corrupted by this operation). The only way to detect
> that such corruption will occur is to check the entire ring... that's a
> linear time checking operation tacked on to a constant time insertion
> operation... not acceptable. :) However, if you compile with bucket
> debugging turned on, those validity checks WILL be done.

Perhaps a ``configure'' option for bucket debugging is in order? For that
matter, why not make it the default; most folks who build their own APU do so
for a development project, so intuitive failure modes will trump efficiency in
early use. I can roll a patch, if appropriate.
Re: segfault in apr_bucket_delete
On Sat, 22 May 2004, Stas Bekman wrote:

> > that brigade will be corrupted by this operation).
>
> Do you suggest that the sample program that I posted doesn't hang in that
> macro, but after it?

That should be correct, yes. You'll end up creating a loop in the brigade,
and walking through the brigade will thus hang. The first time you walk
through the brigade in that code is when you clean it up.

> I guess that works for me. If in the future someone reports a problem, I can
> suggest to them what you've prescribed above. It's just that there could be
> other reasons for the hanging, which is usually hard to figure out w/o being
> in the user's shoes.
>
> Thanks Cliff and Joe.

You're welcome. :)

--Cliff
Re: segfault in apr_bucket_delete
Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
>
> > I understand all that, but I guess I fail to pass the point across. It is
> > not a problem that I encounter in my code. On the contrary I'm writing
> > tests that exercise, both valid and invalid ways the API can be called.
> > API that hangs when called in invalid way is a problem. Don't you think?
> >
> > APR_BUCKET_INSERT_BEFORE(fb, db);
>
> The thing is, it would not be this macro that hangs. All this macro can
> do is segfault (if one of the pointers is null, meaning the brigade was
> previously corrupted), or do what it's supposed to do (though in doing so
> it could potentially corrupt some other brigade, which is what happens
> here -- if the bucket being inserted is still in a brigade, as db is, then
> that brigade will be corrupted by this operation).

Do you suggest that the sample program that I posted doesn't hang in that
macro, but after it? I didn't step through to check, just saw that when I
remove it or fix the order things work just fine, so it could be just so. I
need to check that.

> The only way to detect that such corruption will occur is to check the
> entire ring... that's a linear time checking operation tacked on to a
> constant time insertion operation... not acceptable. :)

Absolutely!

> However, if you compile with bucket debugging turned on, those validity
> checks WILL be done.

I guess that works for me. If in the future someone reports a problem, I can
suggest to them what you've prescribed above. It's just that there could be
other reasons for the hanging, which is usually hard to figure out w/o being
in the user's shoes.

Thanks Cliff and Joe.

--
Stas Bekman            JAm_pH --> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com
Re: segfault in apr_bucket_delete
On Fri, 21 May 2004, Stas Bekman wrote:

> I understand all that, but I guess I fail to pass the point across. It is not
> a problem that I encounter in my code. On the contrary I'm writing tests that
> exercise, both valid and invalid ways the API can be called. API that hangs
> when called in invalid way is a problem. Don't you think?
>
> APR_BUCKET_INSERT_BEFORE(fb, db);

The thing is, it would not be this macro that hangs. All this macro can
do is segfault (if one of the pointers is null, meaning the brigade was
previously corrupted), or do what it's supposed to do (though in doing so
it could potentially corrupt some other brigade, which is what happens
here -- if the bucket being inserted is still in a brigade, as db is, then
that brigade will be corrupted by this operation). The only way to detect
that such corruption will occur is to check the entire ring... that's a
linear time checking operation tacked on to a constant time insertion
operation... not acceptable. :) However, if you compile with bucket
debugging turned on, those validity checks WILL be done.

--Cliff
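To make the cost argument concrete, here is a rough sketch of what a whole-ring validity check looks like. It is not APR's debugging code: `node` and `ring_is_consistent` are hypothetical names, and `node` models only the two ring links of a bucket. The point is that the check has to visit every element, so it is linear in the brigade length, while the insertion it would guard is four pointer assignments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket: only the ring links are modelled. */
typedef struct node {
    struct node *next;
    struct node *prev;
} node;

/* Walk the whole ring starting at the sentinel. Return 1 if every forward
 * link is mirrored by the matching back link and the walk gets back to the
 * sentinel after at most max_elems elements; return 0 otherwise.
 * The loop visits every element, so the cost is O(n). */
static int ring_is_consistent(const node *sentinel, size_t max_elems)
{
    const node *e = sentinel;
    size_t steps = 0;

    do {
        if (e->next == NULL || e->next->prev != e)
            return 0;               /* broken or mismatched link */
        e = e->next;
        if (++steps > max_elems + 1)
            return 0;               /* never came back: the ring has a loop */
    } while (e != sentinel);

    return 1;
}
```

Running a check like this before and after each insertion is affordable in a debug build, which is the trade-off described above; in a production build it would turn every constant-time insert into a linear-time one.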
Re: segfault in apr_bucket_delete
Cliff Woolley wrote:
> On Fri, 21 May 2004, Stas Bekman wrote:
>
> > Joe Orton wrote:
> > > On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
> > >
> > > > fb = apr_bucket_flush_create(ba);
> > > > db = apr_bucket_transient_create("aaa", 3, ba);
> > > > APR_BRIGADE_INSERT_HEAD(bb, db);
> > > > APR_BUCKET_INSERT_BEFORE(fb, db);
> > >
> > > The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
> > > for me with the arguments switched.
> >
> > right, but why does it hang when reversed.
>
> APR_BUCKET_INSERT_BEFORE(fb, db) expands to something like:
>
>   APR_BUCKET_NEXT(db) = fb;
>   APR_BUCKET_PREV(db) = APR_BUCKET_PREV(fb);
>   APR_BUCKET_NEXT(APR_BUCKET_PREV(fb)) = db;
>   APR_BUCKET_PREV(fb) = db;
>
> Obviously for this to work, all that has to happen is that fb's prev
> pointer and the next pointer of that bucket must correctly point to each
> other. Everything else is arbitrarily overwritten.
>
> Did you try running this with bucket debugging turned on like I suggested?
> If you do that, then a bunch of ring consistency checks will be run for
> you at strategic times that might help you discern when it is that your
> brigade gets corrupted.
>
> > Shouldn't it work both ways? If
> > not, then it should produce an error and not hang.
>
> No... it's just a macro manipulating some pointers. Error handling would
> be difficult (given the number of layers of macros) and expensive.

I understand all that, but I guess I fail to pass the point across. It is not
a problem that I encounter in my code. On the contrary I'm writing tests that
exercise, both valid and invalid ways the API can be called. API that hangs
when called in invalid way is a problem. Don't you think?

  APR_BUCKET_INSERT_BEFORE(fb, db);

is not the most intuitive API, and it's very easy to mix the arguments (since
both are of the same type). I have to pause every time and think hard to see
whether I've got it right. Granted, if I was passing NULL or a corrupted
reference and getting a segfault, then it'll be my problem.

But how do you suggest that we protect users from doing mistakes and more
important how do we point out those mistakes in the error message and not
having each user submit a bug report, us waste hours trying to understand what
the problem is, just to discover that the user got the arguments in the wrong
order.

I suppose if APR doesn't do validation, we will be forced to write wrappers
which will do the validation :( I understand that this validation may slow
things down and therefore an undesired thing. I'm not sure what's the happy
compromise here.

--
Stas Bekman
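One possible shape for such a wrapper, as a sketch only: `node`, `checked_insert_before` and the `INSERT_*` codes are hypothetical names and nothing here is APR API. It relies on the assumption that an element which is in no ring links to itself. That should hold for a freshly initialised element, but an element that was linked and later removed may still carry stale links, in which case this check would reject it. The check looks only at immediate neighbours, so it stays constant time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket: only the ring links are modelled. */
typedef struct node {
    struct node *next;
    struct node *prev;
} node;

enum { INSERT_OK = 0, INSERT_BAD_ELEM = 1, INSERT_BAD_TARGET = 2 };

/* Insert nep before lep, refusing calls that look invalid. */
static int checked_insert_before(node *lep, node *nep)
{
    /* Assumption: an unlinked element points at itself in both directions.
     * Anything else is treated as "still in some ring". */
    if (nep == lep || nep->next != nep || nep->prev != nep)
        return INSERT_BAD_ELEM;

    /* The target's back link must be mirrored by its predecessor. */
    if (lep->prev == NULL || lep->prev->next != lep)
        return INSERT_BAD_TARGET;

    /* The same four assignments the unchecked insertion performs. */
    nep->next = lep;
    nep->prev = lep->prev;
    lep->prev->next = nep;
    lep->prev = nep;
    return INSERT_OK;
}
```

With the arguments in the order from the test program, the element being inserted is the one already sitting in the brigade, so a wrapper along these lines would return `INSERT_BAD_ELEM` instead of silently relinking it.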
Re: segfault in apr_bucket_delete
On Fri, 21 May 2004, Stas Bekman wrote:

> Joe Orton wrote:
> > On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
> >
> > > fb = apr_bucket_flush_create(ba);
> > > db = apr_bucket_transient_create("aaa", 3, ba);
> > > APR_BRIGADE_INSERT_HEAD(bb, db);
> > > APR_BUCKET_INSERT_BEFORE(fb, db);
> >
> > The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
> > for me with the arguments switched.
>
> right, but why does it hang when reversed.

APR_BUCKET_INSERT_BEFORE(fb, db) expands to something like:

  APR_BUCKET_NEXT(db) = fb;
  APR_BUCKET_PREV(db) = APR_BUCKET_PREV(fb);
  APR_BUCKET_NEXT(APR_BUCKET_PREV(fb)) = db;
  APR_BUCKET_PREV(fb) = db;

Obviously for this to work, all that has to happen is that fb's prev
pointer and the next pointer of that bucket must correctly point to each
other. Everything else is arbitrarily overwritten.

Did you try running this with bucket debugging turned on like I suggested?
If you do that, then a bunch of ring consistency checks will be run for
you at strategic times that might help you discern when it is that your
brigade gets corrupted.

> Shouldn't it work both ways? If
> not, then it should produce an error and not hang.

No... it's just a macro manipulating some pointers. Error handling would
be difficult (given the number of layers of macros) and expensive.

--Cliff
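The effect of those four assignments can be reproduced without APR. The sketch below models only the ring links; `node`, `insert_head`, `insert_before`, `walk` and `demo` are hypothetical names, and it assumes a freshly created element links to itself. With the arguments reversed, db is pulled out of the brigade's ring into a two-element ring with fb, while the brigade's sentinel still points at db, so a walk from the sentinel never gets back to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket: only the ring links are modelled. */
typedef struct node {
    struct node *next;
    struct node *prev;
} node;

/* A fresh element, or the sentinel of an empty ring, links to itself. */
static void node_init(node *n) { n->next = n; n->prev = n; }

/* Insert e right after the sentinel (the role of APR_BRIGADE_INSERT_HEAD). */
static void insert_head(node *sentinel, node *e)
{
    e->next = sentinel->next;
    e->prev = sentinel;
    sentinel->next->prev = e;
    sentinel->next = e;
}

/* The four assignments quoted above: insert nep before lep. */
static void insert_before(node *lep, node *nep)
{
    nep->next = lep;
    nep->prev = lep->prev;
    lep->prev->next = nep;
    lep->prev = nep;
}

/* Walk from the sentinel. Return the element count, or -1 if the sentinel
 * is not reached within limit steps (the ring has a loop). */
static int walk(const node *sentinel, int limit)
{
    int n = 0;
    const node *e;

    for (e = sentinel->next; e != sentinel; e = e->next) {
        if (++n > limit)
            return -1;
    }
    return n;
}

/* Build sentinel <-> db, then insert using the given argument order. */
static int demo(int reversed)
{
    node bb, db, fb;

    node_init(&bb);
    node_init(&db);
    node_init(&fb);
    insert_head(&bb, &db);
    if (reversed)
        insert_before(&fb, &db);   /* relinks db, which is already in bb */
    else
        insert_before(&db, &fb);   /* puts the fresh fb in front of db */
    return walk(&bb, 100);
}
```

Here `demo(0)` walks two elements and stops, while `demo(1)` only terminates because of the step limit. An unbounded walk, such as the one done during brigade cleanup, would spin forever.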
Re: segfault in apr_bucket_delete
Joe Orton wrote:
> On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
>
> > fb = apr_bucket_flush_create(ba);
> > db = apr_bucket_transient_create("aaa", 3, ba);
> > APR_BRIGADE_INSERT_HEAD(bb, db);
> > APR_BUCKET_INSERT_BEFORE(fb, db);
>
> The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
> for me with the arguments switched.

right, but why does it hang when reversed. Shouldn't it work both ways? If
not, then it should produce an error and not hang.

--
Stas Bekman
Re: segfault in apr_bucket_delete
On Thu, May 20, 2004 at 03:54:58PM -0700, Stas Bekman wrote:
> fb = apr_bucket_flush_create(ba);
> db = apr_bucket_transient_create("aaa", 3, ba);
> APR_BRIGADE_INSERT_HEAD(bb, db);
> APR_BUCKET_INSERT_BEFORE(fb, db);

The arguments to APR_BUCKET_INSERT_BEFORE are reversed, right? It works
for me with the arguments switched.

joe
Re: segfault in apr_bucket_delete
Ok, here is a mod_perl handler that reliably segfaults:

  sub handler {
      my $r = shift;

      my $ba = $r->connection->bucket_alloc;
      my $d1 = APR::Bucket->new("d1");
      my $f1 = APR::Bucket::flush_create($ba);
      my $bb = APR::Brigade->new($r->pool, $ba);

      $bb->insert_head($d1);
      # d1->f1
      $f1->insert_before($d1);

      0;
  }

I'm writing all kinds of tests to exercise various insertion techniques and
make sure it works or fails with a useful error message, rather than segfault.
In this case I create a bucket brigade, one data bucket and one flush bucket.
Now I insert the data bucket into the head of bb, and then try to insert that
data bucket before the flush bucket, thus I think linking bb->db->fb. It
segfaults as reported before (though the circumstances are right this time).

Though when I try to convert it to an equivalent C program, it hangs. Here is
a small program I've used to try to reproduce the problem. It's not exactly
the same as a perl case, where a custom bucket type is used. But I can't get
the C one to run and hopefully give you a test case:

  #include <stdio.h>
  #include <stdlib.h>   /* exit() */
  #include "apr_general.h"
  #include "apr_hooks.h"
  #include "apr_buckets.h"
  #include "apr_pools.h"

  int main(void)
  {
      apr_status_t rv;

      apr_initialize();

      /* make sure there is a global pool to parent the test pool */
      if (apr_hook_global_pool == NULL) {
          apr_pool_t *global_pool;
          rv = apr_pool_create(&global_pool, NULL);
          if (rv != APR_SUCCESS) {
              fprintf(stderr, "failed to create pool");
              exit(1);
          }
          apr_hook_global_pool = global_pool;
      }

      {
          apr_pool_t *pool;
          apr_bucket_alloc_t *ba;
          apr_bucket_brigade *bb;
          apr_bucket *fb, *db;

          rv = apr_pool_create(&pool, apr_hook_global_pool);
          if (rv != APR_SUCCESS) {
              fprintf(stderr, "failed to create pool");
              exit(1);
          }

          ba = apr_bucket_alloc_create(pool);
          bb = apr_brigade_create(pool, ba);
          fb = apr_bucket_flush_create(ba);
          db = apr_bucket_transient_create("aaa", 3, ba);

          APR_BRIGADE_INSERT_HEAD(bb, db);
          APR_BUCKET_INSERT_BEFORE(fb, db);

          /* clearing the pool runs the brigade cleanup */
          apr_pool_clear(pool);
      }

      apr_terminate();
      exit(0);
  }

I've built it as:

  gcc -I/home/stas/httpd/prefork/include -Wall \
      -L/home/stas/httpd/prefork/lib -lapr-0 -lrt -lm -lcrypt -lnsl \
      -lpthread -ldl -laprutil-0 -lgdbm -ldb-4.0 -lexpat bb.c -o bb

It hangs in APR_BUCKET_INSERT_BEFORE(fb, db);

  ...
  set_thread_area({entry_number:-1 -> 6, base_addr:0x4030ea20,
      limit:1048575, seg_32bit:1, contents:0, read_exec_only:0,
      limit_in_pages:1, seg_not_present:0, useable:1}) = 0
  munmap(0x40018000, 86655) = 0
  set_tid_address(0x4030ea68) = 28339
  rt_sigaction(SIGRTMIN, {0x400cb650, [], SA_RESTORER|SA_SIGINFO,
      0x400d2210}, NULL, 8) = 0
  rt_sigprocmask(SIG_UNBLOCK, [RTMIN], NULL, 8) = 0
  getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
  brk(0) = 0x804a000
  brk(0x806b000) = 0x806b000
  brk(0) = 0x806b000

Any idea why?

--
Stas Bekman
Re: segfault in apr_bucket_delete
Trying to reproduce it in a standalone C program didn't work; it's more
complex than I thought. I will keep you posted when I get a reproducible
test case.

--
Stas Bekman
Re: segfault in apr_bucket_delete
Joe Orton wrote:
> On Thu, May 20, 2004 at 02:41:43AM -0700, Stas Bekman wrote:
> > Stas Bekman wrote:
> > > Doing just:
> > >
> > >   apr_brigade_create(p, ba);
> > >
> > > and leaving here segfaults:
> > [...]
>
> With what 'p' and 'ba'?

  r->pool
  r->connection->bucket_alloc

> Can you post the complete test case?

Well, it's coming from mod_perl. Is there something similar to mod_example.c
that I can throw the C code in to give you a reproducible case? I guess I
could use mod_example for that :)

If you have mod_perl 2 around just write this simple handler:

  use APR::Brigade ();
  use Apache::Connection ();
  use Apache::RequestRec ();

  sub handler {
      my $r = shift;
      my $bb = APR::Brigade->new($r->pool, $r->connection->bucket_alloc);
      0;
  }

it translates to:

  apr_brigade_create(r->pool, r->connection->bucket_alloc)

I think the segfault happens when r->pool's cleanup is run.

--
Stas Bekman
Re: segfault in apr_bucket_delete
On Thu, 20 May 2004, Stas Bekman wrote:

> #0  0x4037db83 in mallopt () from /lib/tls/libc.so.6
> #1  0x4037b8ba in free () from /lib/tls/libc.so.6

A segfault in free() more or less always means heap corruption has
previously occurred. You might try enabling bucket debugging at compile
time -- that will help check for double-frees and so forth. Or just do
what Joe said and post the whole test case. :)

Okay, I'm hittin the road. Talk to you guys from Corbin, KY.
Re: segfault in apr_bucket_delete
On Thu, May 20, 2004 at 02:41:43AM -0700, Stas Bekman wrote:
> Stas Bekman wrote:
> > Doing just:
> >
> >   apr_brigade_create(p, ba);
> >
> > and leaving here segfaults:
> [...]

With what 'p' and 'ba'? Can you post the complete test case?

joe
Re: segfault in apr_bucket_delete
Stas Bekman wrote:
> Doing just:
>
>   apr_brigade_create(p, ba);
>
> and leaving here segfaults:
[...]

I think my manual expansion of multiple nested macros went wrong somewhere,
the real backtrace for the cvs version is:

  #0  0x4037db83 in mallopt () from /lib/tls/libc.so.6
  #1  0x4037b8ba in free () from /lib/tls/libc.so.6
  #2  0x4017a847 in apr_brigade_cleanup (data=0x93e6ff8) at apr_brigade.c:50
  #3  0x4017a7d6 in brigade_cleanup (data=0x93e6ff8) at apr_brigade.c:33
  #4  0x40277e4e in run_cleanups (cref=0x93f88f0) at apr_pools.c:1997
  #5  0x402775eb in apr_pool_destroy (pool=0x93d90f0) at apr_pools.c:763
  #6  0x402775ad in apr_pool_clear (pool=0x93d30d8) at apr_pools.c:723
  #7  0x080d9a76 in child_main (child_num_arg=1) at prefork.c:528
  #8  0x080d9dbb in make_child (s=0x81420c0, slot=1) at prefork.c:703
  #9  0x080d9e30 in startup_children (number_to_start=1) at prefork.c:721
  #10 0x080da235 in ap_mpm_run (_pconf=0x813d0a8, plog=0x81851c8,
      s=0x81420c0) at prefork.c:940
  #11 0x080e0ea9 in main (argc=9, argv=0xb264) at main.c:619

So I guess it segfaults elsewhere inside the multiple nested macros.

I hope you have a clear head, I don't, I'm heading to sleep. Just write that
one liner from above to reproduce.

Thanks!

--
Stas Bekman