On 31/12/08 05:39, Tupshin Harper wrote:
Some comments inline, but the really useful stuff at the bottom.

Same here, keep reading ;-)

Thank you very much for your help.

-Tupshin

Philippe M. Chiasson wrote:
On 29/12/08 18:56, Tupshin Harper wrote:
1. Problem Description:

I'm attempting to upgrade one of the largest (measured both by users and
lines of code, I suspect) mod_perl sites from mod_perl 1 to mod_perl 2,
and also from 32 bit OS to 64 bit at the same time. I converted our
calls to use the new API, and basic functionality started working.
However, I am experiencing frequent segfaults in APR::Table (stack trace
below) when loading pages.
Just out of curiosity, are you handling APR::Table objects directly?
The only places where APR is ever mentioned are:
http://code.livejournal.org/trac/livejournal/browser/branches/modernize/cgi-bin/LJ/Request/Apache2.pm
(comment line 76)
and
http://code.livejournal.org/trac/bml/browser/branches/modernize/lib/Apache/BML.pm
(the two "use" statements, but those appear to actually be unused).

Sometimes, when you call something in mod_perl land, you might get an 
APR::Table back, say:

my $o = $r->headers_out;

#$o is an APR::Table now, and you need to
use APR::Table;

#before you can call
$o->get([...])

So no.

Okay, so this probably means you are hitting a mod_perl bug, not something
evil your code is doing with APR::Table objects.

Somewhere between 1 in 2 and 1 in 4 page
loads will cause it. Identical problem occurs on:
64 bit Debian Lenny with stock mod_perl 2.0.4
64 bit Debian Lenny with hand-built mod_perl 2.0.5-dev from latest
source.
64 bit Centos 5.2 with stock mod_perl 2.0.2.

Let me know if there is any other information you need.
See below. Of course, a shorter, reproducible test case would be the
ideal.
Agreed, but the complexity of the entire system and the fact that
we are using a home-brewed templating system (bml) make it quite
difficult. I'll work on that if nothing else proves fruitful.

I know, it's sometimes tricky to boil down problems that only appear sometimes
in the wild and in a large system.

I have not yet
tried it with mod_perl 2 on a 32-bit OS.

[...]

Method it crashes in:

/* Try to shortcut apr_table_get by fetching the key using the current
   * iterator (unless it's inactive or points at different key).
   */
static MP_INLINE const char *mpxs_APR__Table_FETCH(pTHX_ SV *tsv,
                                                     const char *key)
{
      SV* rv = modperl_hash_tied_object_rv(aTHX_ "APR::Table", tsv);
      const int i = mpxs_apr_table_iterix(rv);
      apr_table_t *t = INT2PTR(apr_table_t *, SvIVX(SvRV(rv)));
Possibly smells like a 64 bit issue to me.
My next step will be to confirm this theory by bringing it up on a 32
bit instance.

Any changes/updates on that?

      const apr_array_header_t *arr = apr_table_elts(t);
      apr_table_entry_t *elts = (apr_table_entry_t *)arr->elts; <--- crashing line 186
Can you get a little more information out of the current local variables?

i.e. I'd be interested in seeing the value of:

i
*t
*arr

Which you can easily do from within gdb with

(gdb) display *t
(gdb) display *arr

"i" is never anything but zero in the cases I'm looking at
a typical value for "t" is "(apr_table_t *) 0x956bfa0"
but printing *t always generates <incomplete type>
however, there is useful wrongness in "arr" and "elts".

A quick addendum to my previous report:

Sometimes it crashes directly on line 186, and in those cases, arr =
0x4f5349203a746573 (or something similar), and printing *arr reasonably
says "Cannot access memory at address 0x4f5349203a746573"

In other cases, it crashes within the apr_table_get(t, key) call on line
192. In those cases, "arr" is more reasonable, e.g.
(const apr_array_header_t *) 0x956bfa0
but *arr is:
  {pool = 0x636f6c2f7273752f, elt_size = 1932487777, nelts = 980314466,
   nalloc = 1920169263,
   elts = 0x2f3a6e69622f6c61 <Address 0x2f3a6e69622f6c61 out of bounds>}
elts is:
(apr_table_entry_t *) 0x2f3a6e69622f6c61
and *elts is:
Cannot access memory at address 0x2f3a6e69622f6c61

So, to summarize, when it crashes on line 186, arr itself is a bad
pointer, and when it crashes in the apr_table_get call from line 192,
arr can be dereferenced but elts is a bad pointer.

Starting to smell more and more like bad pointer mangling on 64-bit.
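
As a side note on the values quoted above: read as little-endian bytes,
they all come out as printable ASCII. The sketch below decodes them. It
assumes a little-endian x86-64 host (the reports above only say "64 bit"),
and decode_le is a hypothetical helper written for this illustration, not
anything from APR or mod_perl.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for this illustration (not part of APR or mod_perl).
 * Writes the low `nbytes` bytes of `v`, least significant byte first, into
 * `out` as characters, replacing non-printable bytes with '.'.
 * `out` must have room for nbytes + 1 chars; the result is NUL-terminated. */
static void decode_le(uint64_t v, size_t nbytes, char *out)
{
    size_t k;

    assert(out != NULL && nbytes <= 8);
    memset(out, 0, nbytes + 1);

    for (k = 0; k < nbytes; k++) {
        /* Byte k of a little-endian value is bits 8k .. 8k+7. */
        unsigned char c = (unsigned char)((v >> (8 * k)) & 0xff);
        out[k] = isprint(c) ? (char)c : '.';
    }
}
```

Called with 8 bytes on 0x4f5349203a746573 (the bad arr from the line 186
case) this gives "set: ISO". For the line 192 case, pool, elt_size (4 bytes),
nelts (4 bytes) and nalloc (4 bytes) give "/usr/loc", "al/s", "bin:" and
"/usr", and elts gives "al/bin:/". That reads like a fragment of a
PATH-style string, which may mean the table pointer refers to memory that
now holds string data. It does not by itself say how the pointer got there.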

Forgot to ask, but can you dump the SV *rv and *tsv like so:

(gdb) call sv_dump(rv)
(gdb) call sv_dump(tsv)

Thanks, in the meantime, I am trying to visualize why this might be
happening. Everything so far looks like it's using the correct macros
to safely convert between IV and pointer. Hrm.

One way to dig into this further would be to add extra debugging to
 modperl_hash_tied_object
 modperl_hash_tie

when classname=="APR::Table"

and see how the void * is converted back and forth.
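
To picture the round trip in isolation, here is a minimal standalone sketch,
outside of mod_perl. demo_iv, DEMO_PTR2IV and DEMO_INT2PTR are hypothetical
stand-ins made up for this example. They are not perl's real IV type or its
PTR2IV / INT2PTR macros, and this is not the modperl_hash_tie code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; these are NOT perl's
 * real IV type or its PTR2IV / INT2PTR macros. */
typedef intptr_t demo_iv;
#define DEMO_PTR2IV(p)        ((demo_iv)(intptr_t)(p))
#define DEMO_INT2PTR(type, i) ((type)(intptr_t)(i))

/* Store a pointer in a pointer-sized integer and recover it.
 * Returns 1 if the recovered pointer equals the original. */
static int roundtrip_ok(void *p)
{
    demo_iv iv;

    /* The integer type must be at least as wide as a pointer. */
    assert(sizeof(demo_iv) >= sizeof(void *));

    iv = DEMO_PTR2IV(p);
    return DEMO_INT2PTR(void *, iv) == p;
}

/* Returns 1 if an address value survives being squeezed through a
 * 32-bit integer, 0 if the upper bits are lost on the way. */
static int survives_32bit(uint64_t addr)
{
    uint32_t narrow = (uint32_t)addr;   /* what a stray int/U32 cast does */
    return (uint64_t)narrow == addr;
}
```

roundtrip_ok(&some_local) should return 1 on any platform with a
pointer-sized integer. survives_32bit returns 0 only for addresses above
4 GB. An address like the 0x956bfa0 quoted earlier fits in 32 bits, so a
truncating cast alone would leave it unchanged.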

--
Philippe M. Chiasson     GPG: F9BFE0C2480E7680 1AE53631CB32A107 88C3A5A5
http://gozer.ectoplasm.org/       m/gozer\@(apache|cpan|ectoplasm)\.org/

