On Fri, 2008-03-07 at 16:36 -0600, Scott T. Hildreth wrote:
> This seems to be a threading error with the linux kernel version.
> I am running this process on newer kernels (2.6.22.x) and the error
> never occurs.  We are also experiencing a lot of the "Futex WAIT" issues
> with Oracle and the 2.6.20 kernels.

The kernel upgrade didn't solve the problem.  Since the process crashed
on some of our servers and not on others, I narrowed down the
differences between them.  I concluded that all the SuSE Enterprise
10 servers had the problem, and that the crash only occurred when the
execute_array() method was used.  All of our servers have Oracle 10.2.0.3,
so there wasn't a difference there, and it didn't seem to matter whether
DBD::Oracle was 1.19 or 1.20.  We basically decided that this was a race
condition with the threads, especially since it was an intermittent problem.
I decided to compile DBD::Oracle with debugging symbols, hoping I would get
better info from the core file and gdb.  When I was running perl Makefile.PL,
a message appeared that I had often ignored:

WARNING: If you have problems you may need to rebuild perl with threading 
enabled.

I build our own Perl in /usr/local/ and leave the vendor Perl alone.  I
never compile with threads, since we have not found a need for them yet.
So I used /usr/bin/perl, which is always compiled with threading, and
the process stopped crashing.  So the WARNING never applied until now.  I
guess I will build a threaded Perl on our SuSE Enterprise servers
from now on.  This seems to have fixed the problem (knocking on wood).  I
thought I would share my findings, just in case someone else runs into this
same situation.
Save yourself time, read the WARNINGS. :-)
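
For anyone wanting to check whether a given perl was built with threads
before building DBD::Oracle against it, here is a minimal sketch.  It relies
only on the core Config module; the helper name perl_is_threaded is made up
for illustration and is not part of DBI or DBD::Oracle.

```perl
use strict;
use warnings;
use Config;   # core module: exposes the options this perl was built with

# Hypothetical helper (not part of DBI or DBD::Oracle).
# Returns 1 if this interpreter was built with ithreads, 0 otherwise.
sub perl_is_threaded {
    return $Config{useithreads} ? 1 : 0;
}

# $^X is the path of the perl binary that is running this script.
printf "%s (%s): %s\n", $^X, $Config{archname},
    perl_is_threaded() ? "built with threads" : "built WITHOUT threads";
```

Running it once with /usr/local/bin/perl and once with /usr/bin/perl should
show the difference between the two builds.  From the shell,
perl -V:useithreads reports the same setting.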

Any ideas on why array processing would cause this to occur?  Did I just get
lucky and hit the right scenario for this to happen?  Just curious.

                  Thanks.

> 
>                   Thanks for listening.
> 
> On Fri, 2008-03-07 at 11:25 -0600, Scott T. Hildreth wrote:
> > Should have posted to users not dev.  This is really a bizarre problem.
> > I can get it to fail about every fifth iteration; otherwise the process
> > works.  I ran it from another server connected to the same database and
> > it will intermittently fail.  I ran it from a third server and I can't
> > get it to core dump.  All 3 servers have the same kernel & Perl
> > versions.  I've tried recompiling Perl, DBI, DBD::Oracle, still no luck.
> > I created a test case, which uses execute_array and of course I can't
> > get it to core dump.  If anyone has any ideas on what might be going on
> > here,  I would love to hear them!
> > 
> >                           Thanks
> >                              STH
> > 
> > On Wed, 2008-03-05 at 15:21 -0600, Scott T. Hildreth wrote:
> > > I am not sure how to describe this.  My co-worker will run his process
> > > and get a core dump (I pasted the back trace below) and then run the
> > > process again with no core dumps.  Sometimes it will core dump several
> > > times in a row and then the next run it finishes fine.  I ran the
> > > process with DBI_TRACE=9 and this is what shows up at the end of the log,
> > > 
> > > 
> > > > 1   -> execute_for_fetch for DBD::Oracle::st (DBI::st=HASH(0xN)~INNER CODE(0xN) undef)
> > > >     ora_st_execute_array UPDATE count=10 (ARRAY(0xN) undef undef)...
> > > >         OCIBindByName(112df38,1132188,10a9138,":p1",placeh_len=3,value_p=0,value_sz=-1517274788,dty=1,indp=0,alenp=0,rcodep=0,maxarr_len=0,curelep=0 (*=0),mode=2)=ERROR
> > > >         OCIErrorGet(10a9138,1,"<NULL>",7fff058d684c,"ORA-02005: implicit (-1) length not valid for this bind or define datatype
> > > > ",1024,2)=SUCCESS
> > > >     OCIErrorGet after OCIBindByName (er1:ok): -1, 2005: ORA-02005: implicit (-1) length not valid for this bind or define datatype
> > > > 
> > > >         OCIErrorGet(10a9138,2,"<NULL>",7fff058d684c,"ORA-02005: implicit (-1) length not valid for this bind or define datatype
> > > > ",1024,2)=NO_DATA
> > > 
> > > > At first I thought it was a 32bit library with a 64bit Perl problem,
> > > > but Oracle.so & Perl are both linked with the correct 64 bit libs.
> > > > The Oracle client is 10.2.0.3 and DBI versions are,
> > > 
> > >   Perl            : 5.008008    (x86_64-linux)
> > >   OS              : linux       (2.6.20.19)
> > >   DBI             : 1.602
> > >   DBD::mysql      : 4.005
> > >   DBD::Sponge     : 12.010002
> > >   DBD::SQLite     : 1.13
> > >   DBD::Proxy      : 0.2004
> > >   DBD::Oracle     : 1.20
> > >   DBD::Multiplex  : 2.04
> > >   DBD::Gofer      : 0.010103
> > >   DBD::File       : 0.35
> > >   DBD::ExampleP   : 12.010007
> > >   DBD::DBM        : 0.03
> > > 
> > > > I am going to try to isolate a small test case, but right now I wanted
> > > > to post what I have found so far.
> > > 
> > >                              Thanks,
> > >                                STH
> > > 
> > > > ############## Back Trace #############################################################
> > > 
> > > (gdb) bt
> > > #0  0x00002b66ec7d9b95 in raise () from /lib64/libc.so.6
> > > #1  0x00002b66ec7daf90 in abort () from /lib64/libc.so.6
> > > #2  0x00002b66ec81035b in __libc_message () from /lib64/libc.so.6
> > > #3  0x00002b66ec81534e in malloc_printerr () from /lib64/libc.so.6
> > > #4  0x00002b66ec81695c in free () from /lib64/libc.so.6
> > > > #5  0x00002b66ef0ac102 in ora_st_execute_array () from /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBD/Oracle/Oracle.so
> > > > #6  0x00002b66ef0a62bf in XS_DBD__Oracle__st_ora_execute_array ()
> > > >    from /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBD/Oracle/Oracle.so
> > > > #7  0x000000000046bc47 in Perl_pp_entersub ()
> > > > #8  0x000000000046a29e in Perl_runops_standard ()
> > > > #9  0x000000000041e82d in Perl_call_sv ()
> > > > #10 0x00002b66ec9ee038 in XS_DBI_dispatch () from /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBI/DBI.so
> > > > #11 0x000000000046bc47 in Perl_pp_entersub ()
> > > > #12 0x000000000046a29e in Perl_runops_standard ()
> > > > #13 0x000000000041e82d in Perl_call_sv ()
> > > > #14 0x00002b66ec9ee038 in XS_DBI_dispatch () from /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBI/DBI.so
> > > #15 0x000000000046bc47 in Perl_pp_entersub ()
> > > #16 0x000000000046a29e in Perl_runops_standard ()
> > > #17 0x000000000041f1d1 in perl_run ()
> > > #18 0x000000000041ba2c in main ()
> > > 
> > > ########################################################################################
> > > 
> > > > *** glibc detected *** /usr/local/bin/perl: double free or corruption (!prev): 0x0000000001163e10 ***
> > > ======= Backtrace: =========
> > > /lib64/libc.so.6[0x2ad39584e34e]
> > > /lib64/libc.so.6(__libc_free+0x6c)[0x2ad39584f95c]
> > > /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBD/Oracle/Oracle.so(ora_st_execute_array+0xfa4)[0x2ad3980e6e94]
> > > /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBD/Oracle/Oracle.so(XS_DBD__Oracle__st_ora_execute_array+0xef)[0x2ad3980e0f9f]
> > > /usr/local/bin/perl(Perl_pp_entersub+0x6b7)[0x46bae7]
> > > /usr/local/bin/perl(Perl_runops_standard+0xe)[0x46a13e]
> > > /usr/local/bin/perl(Perl_call_sv+0x49d)[0x41e80d]
> > > /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBI/DBI.so(XS_DBI_dispatch+0x7a8)[0x2ad395a27068]
> > > /usr/local/bin/perl(Perl_pp_entersub+0x6b7)[0x46bae7]
> > > /usr/local/bin/perl(Perl_runops_standard+0xe)[0x46a13e]
> > > /usr/local/bin/perl(Perl_call_sv+0x49d)[0x41e80d]
> > > /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBI/DBI.so(XS_DBI_dispatch+0x7a8)[0x2ad395a27068]
> > > /usr/local/bin/perl(Perl_pp_entersub+0x6b7)[0x46bae7]
> > > /usr/local/bin/perl(Perl_runops_standard+0xe)[0x46a13e]
> > > /usr/local/bin/perl(perl_run+0x2c1)[0x41f1b1]
> > > /usr/local/bin/perl(main+0xac)[0x41ba2c]
> > > /lib64/libc.so.6(__libc_start_main+0xf4)[0x2ad395800154]
> > > /usr/local/bin/perl[0x41b8e9]
> > > ======= Memory map: ========
> > > > 00400000-004fc000 r-xp 00000000 08:02 427095  /usr/local/perl-5.8.8/bin/perl
> > > > 005fb000-00601000 rw-p 000fb000 08:02 427095  /usr/local/perl-5.8.8/bin/perl
> > > > 00601000-01183000 rw-p 00601000 00:00 0  [heap]
> > > > 2ad39511b000-2ad395136000 r-xp 00000000 08:02 4295  /lib64/ld-2.4.so
> > > > 2ad395136000-2ad395137000 rw-p 2ad395136000 00:00 0 
> > > > 2ad395147000-2ad395148000 rw-p 2ad395147000 00:00 0 
> > > > 2ad395235000-2ad395237000 rw-p 0001a000 08:02 4295  /lib64/ld-2.4.so
> > > > 2ad395237000-2ad39524a000 r-xp 00000000 08:02 4168  /lib64/libnsl-2.4.so
> > > > 2ad39524a000-2ad395349000 ---p 00013000 08:02 4168  /lib64/libnsl-2.4.so
> > > > 2ad395349000-2ad39534b000 rw-p 00012000 08:02 4168  /lib64/libnsl-2.4.so
> > > > 2ad39534b000-2ad39534d000 rw-p 2ad39534b000 00:00 0 
> > > > 2ad39534d000-2ad39534f000 r-xp 00000000 08:02 4163  /lib64/libdl-2.4.so
> > > > 2ad39534f000-2ad39544f000 ---p 00002000 08:02 4163  /lib64/libdl-2.4.so
> > > > 2ad39544f000-2ad395451000 rw-p 00002000 08:02 4163  /lib64/libdl-2.4.so
> > > > 2ad395451000-2ad3954a5000 r-xp 00000000 08:02 4165  /lib64/libm-2.4.so
> > > > 2ad3954a5000-2ad3955a4000 ---p 00054000 08:02 4165  /lib64/libm-2.4.so
> > > > 2ad3955a4000-2ad3955a6000 rw-p 00053000 08:02 4165  /lib64/libm-2.4.so
> > > > 2ad3955a6000-2ad3955a7000 rw-p 2ad3955a6000 00:00 0 
> > > > 2ad3955a7000-2ad3955b0000 r-xp 00000000 08:02 4161  /lib64/libcrypt-2.4.so
> > > > 2ad3955b0000-2ad3956af000 ---p 00009000 08:02 4161  /lib64/libcrypt-2.4.so
> > > > 2ad3956af000-2ad3956b2000 rw-p 00008000 08:02 4161  /lib64/libcrypt-2.4.so
> > > > 2ad3956b2000-2ad3956e0000 rw-p 2ad3956b2000 00:00 0 
> > > > 2ad3956e0000-2ad3956e2000 r-xp 00000000 08:02 4191  /lib64/libutil-2.4.so
> > > > 2ad3956e2000-2ad3957e1000 ---p 00002000 08:02 4191  /lib64/libutil-2.4.so
> > > > 2ad3957e1000-2ad3957e3000 rw-p 00001000 08:02 4191  /lib64/libutil-2.4.so
> > > > 2ad3957e3000-2ad39590a000 r-xp 00000000 08:02 4157  /lib64/libc-2.4.so
> > > > 2ad39590a000-2ad395a0a000 ---p 00127000 08:02 4157  /lib64/libc-2.4.so
> > > > 2ad395a0a000-2ad395a0d000 r--p 00127000 08:02 4157  /lib64/libc-2.4.so
> > > > 2ad395a0d000-2ad395a0f000 rw-p 0012a000 08:02 4157  /lib64/libc-2.4.so
> > > > 2ad395a0f000-2ad395a16000 rw-p 2ad395a0f000 00:00 0 
> > > > 2ad395a16000-2ad395a30000 r-xp 00000000 08:02 442510  /usr/local/perl-5.8.8/lib/site_perl/5.8.8/x86_64-linux/auto/DBI/DBI.so
> > > > 2ad395a30000-2ad395b30000 ---p 0001a000 08:02 442510  /usr/local/perl-5.8.8/lib/site_p
> > > > zsh: 26245 abort (core dumped)  
> > > DBI_TRACE=2 
