RE: Hunting down (possible) memory leak in DBD::Oracle
Thanks, Chris, I looked at the two KDE bugs you pointed out. Neither of them was an "invalid write". Also, the software is actually failing here - not "working but getting valgrind errors" as was the case with the KDE bugs. In my case I have "failing and getting valgrind errors". I can run the same case on an older version of perl and an older version of the DBD::Oracle module (with the exact same Oracle database) and get no errors.

The Valgrind documentation describes an "invalid write" as follows:

http://valgrind.org/docs/manual/mc-manual.html#mc-manual.badrw

For example:

    Invalid read of size 4
       at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
       by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
       by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
       by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
    Address 0xB0E0 is not stack'd, malloc'd or free'd

This happens when your program reads or writes memory at a place which Memcheck reckons it shouldn't. In this example, the program did a 4-byte read at address 0xB0E0, somewhere within the system-supplied library libpng.so.2.1.0.9, which was called from somewhere else in the same library, called from line 326 of qpngio.cpp, and so on.

Memcheck tries to establish what the illegal address might relate to, since that's often useful. So, if it points into a block of memory which has already been freed, you'll be informed of this, and also where the block was freed. Likewise, if it should turn out to be just off the end of a heap block, a common result of off-by-one errors in array subscripting, you'll be informed of this fact, and also where the block was allocated. If you use the --read-var-info option Memcheck will run more slowly but may give a more detailed description of any illegal address.

In this example, Memcheck can't identify the address. Actually the address is on the stack, but, for some reason, this is not a valid stack address -- it is below the stack pointer and that isn't allowed.
In this particular case it's probably caused by GCC generating invalid code, a known bug in some ancient versions of GCC.

Note that Memcheck only tells you that your program is about to access memory at an illegal address. It can't stop the access from happening. So, if your program makes an access which normally would result in a segmentation fault, your program will still suffer the same fate -- but you will get a message from Memcheck immediately prior to this. In this particular example, reading junk on the stack is non-fatal, and the program stays alive.

-----Original Message-----
From: Christopher Jones [mailto:christopher.jo...@oracle.com]
Sent: Monday, January 15, 2018 6:56 PM
To: dbi-users@perl.org
Subject: Re: Hunting down (possible) memory leak in DBD::Oracle

On 16/1/18 9:17 am, Fennell, Brian wrote:
> $ egrep -B1 -A20 -i 'invalid write' /copy/sandbox/feeds/data/search4_1/valgrind-log.txt | head -22
> ==19402==
> ==19402== Invalid write of size 4
> ==19402==    at 0xBD747E6: __intel_ssse3_rep_memcpy (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
> ==19402==    by 0xBD6CE95: _intel_fast_memcpy.P (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)

Valgrind doesn't do the greatest on binaries built with the Intel compiler:

https://bugs.kde.org/show_bug.cgi?id=286769
https://bugs.kde.org/show_bug.cgi?id=139776

Chris

--
http://twitter.com/ghrd
Re: Hunting down (possible) memory leak in DBD::Oracle
On 16/1/18 9:17 am, Fennell, Brian wrote:
> $ egrep -B1 -A20 -i 'invalid write' /copy/sandbox/feeds/data/search4_1/valgrind-log.txt | head -22
> ==19402==
> ==19402== Invalid write of size 4
> ==19402==    at 0xBD747E6: __intel_ssse3_rep_memcpy (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
> ==19402==    by 0xBD6CE95: _intel_fast_memcpy.P (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)

Valgrind doesn't do the greatest on binaries built with the Intel compiler:

https://bugs.kde.org/show_bug.cgi?id=286769
https://bugs.kde.org/show_bug.cgi?id=139776

Chris

--
http://twitter.com/ghrd
RE: Hunting down (possible) memory leak in DBD::Oracle
I got this case running with valgrind. Valgrind reported 8 invalid writes, 8 invalid reads, and 2239 accesses to uninitialised values. All invalid writes have a stack trace leading back to XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662) and ora_st_fetch (oci8.c:4032). Details follow:

export DATE_MANIP=DM5
export ORACLE_HOME=/db/app/oracle/product/12.1.0/client_1
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export PATH="$ORACLE_HOME/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin"
export NLS_LANG="AMERICAN_AMERICA.WE8ISO8859P1"
export NLS_NCHAR="AL16UTF16"
export NLS_NCHAR_CHARACTERSET="AL16UTF16"
export LANGUAGE=en_US:en:C
export LANG=C
export LC_ALL=C
export LC_CTYPE=C

/bin/rm -rf /feeds/data/search4_1/ProductAttributes-dbi-trace.txt \
  /feeds/data/search4_1/getTableData-debug-log.txt \
  /feeds/data/search4_1/valgrind-log.txt

/usr/local/bin/valgrind \
  --tool=memcheck \
  --leak-check=yes \
  --track-origins=yes \
  --leak-check=full \
  --show-leak-kinds=all \
  --num-callers=100 \
  --error-limit=no \
  --log-file=/feeds/data/search4_1/valgrind-log.txt \
  /usr/local/bin/perl \
  [ . . .
]

$ wc -l /copy/sandbox/feeds/data/search4_1/valgrind-log.txt
696325 /copy/sandbox/feeds/data/search4_1/valgrind-log.txt

$ du -h /copy/sandbox/feeds/data/search4_1/valgrind-log.txt
53M /copy/sandbox/feeds/data/search4_1/valgrind-log.txt

$ egrep -c -i 'invalid write' /copy/sandbox/feeds/data/search4_1/valgrind-log.txt
8

$ egrep -c -i 'invalid read' /copy/sandbox/feeds/data/search4_1/valgrind-log.txt
8

$ egrep -c -i 'uninitialised value' /copy/sandbox/feeds/data/search4_1/valgrind-log.txt
2239

$ egrep -n -A20 -i 'invalid write' /copy/sandbox/feeds/data/search4_1/valgrind-log.txt | egrep '(oci8\.c|Oracle\.xsi)'
51575-==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
51576-==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)
51596-==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
51597-==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)
51614-==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
51615-==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)
51670-==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
51671-==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)
51710-==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
51711-==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)
51731-==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
51732-==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)
51771-==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
51772-==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)
51830-==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
51831-==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)

$ egrep -B1 -A20 -i 'invalid write' /copy/sandbox/feeds/data/search4_1/valgrind-log.txt | head -22
==19402==
==19402== Invalid write of size 4
==19402==    at 0xBD747E6: __intel_ssse3_rep_memcpy (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xBD6CE95: _intel_fast_memcpy.P (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xCF70D36: kpccclr (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xCF705D7: kpccs2c (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xCF6ED9E: ttccfpg (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xCF6C799: ttcfour (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xCF3B0AE: kpufcpf (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xCF392F5: kpufch0 (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xCF37C34: kpufch (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xB14F1CA: OCIStmtFetch2 (in /db/app/oracle/product/12.1.0/client_1/lib/libclntsh.so.12.1)
==19402==    by 0xAA0F269: ora_st_fetch (oci8.c:4032)
==19402==    by 0xA9EE595: XS_DBD__Oracle__st_fetchrow_array (Oracle.xsi:662)
==19402==    by 0x6D33069: XS_DBI_dispatch (DBI.xs:3797)
==19402==    by 0x4EF240E: Perl_pp_entersub (in /usr/lib64/perl5/CORE/libperl.so)
==19402==    by 0x4EEAB85: Perl_runops_standard (in /usr/lib64/perl5/CORE/libperl.so)
==19402==    by 0x4E87984: perl_run (in /usr/lib64/perl5/CORE/libperl.so)
==19402==    by 0x400D98: ??? (in /usr/bin/perl)
==19402==    by 0x6174C04: (below main) (in /usr/lib64/libc-2.17.so)
==19402==  Address 0x14ec4275f is 4,095,739,679 bytes inside a block of size 4,095,741,856 in arena "client"
==19402==
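For slicing a 53M memcheck log like this, a single summarizing pass can replace the three separate egrep -c runs. This is a sketch, not part of the original thread: the log path is a stand-in, and the patterns only cover the three error kinds counted above.

```shell
#!/bin/sh
# Sketch: tally the distinct memcheck error kinds in a valgrind log
# in one pass. LOG is a stand-in; substitute the real valgrind-log.txt.
LOG=${1:-valgrind-log.txt}
egrep -o -i 'Invalid write of size [0-9]+|Invalid read of size [0-9]+|uninitialised value' "$LOG" \
  | sort | uniq -c | sort -rn
```

Run against the log above, this should print the same 8 / 8 / 2239 counts, grouped by message, with the most frequent kind first.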
RE: Hunting down (possible) memory leak in DBD::Oracle
John,

Thanks for your insights. I tried what you said. I read up on NVARCHAR2 vs VARCHAR2 - interesting. I also see that Oracle has a way (more than one way) to specify whether a VARCHAR2 should contain bytes or characters - further, while a VARCHAR2(11 byte) and a VARCHAR2(11 char) are different (the second can have as many as 4 times as many bytes in it as the first), VARCHAR2(4000 byte) and VARCHAR2(4000 char) are not different - 4000 bytes is the max for either, no matter the size of a character. Unicode makes everything harder. "CHAR" may mean "BYTE", "2 BYTES", or "4 BYTES". And client and server have to agree.

I tried this: I used SUBSTRB(field,1,) to truncate the actual bytes coming from Oracle back to the client. The results are a bit odd. For 3999, no change. For 2000, all errors disappear. For 3000, some cases that used to error now succeed - but some cases that used to succeed now fail.

I also played with "matching up" the inner select and the outer - which isn't really very valuable - the client never sees the data from the inner select, and I only added the outer select so I could select by row number (which I couldn't do on the inner select, since "rownum" is a pseudo column). The outer select is basically the interface that OCI sees. Specifically, cast(SUBSTRB(field,1,2000) as VARCHAR(4000)) errors out exactly the same way as no substrb(). For the 2000 byte case I used cast(SUBSTRB(field,1,2000) as VARCHAR(2000 byte)). For the 3000 byte case I used cast(SUBSTRB(field,1,3000) as VARCHAR(3000 byte)).

I don't think truncating to 2000 bytes is a solution, but I suppose I could try breaking the field into 2 2000-byte strings (or 4 1000-byte strings) and recombining them in some other part of the code. I would need to consider the best way to do that - perhaps an output column naming convention.

I ran all of my variations on the old server and the new server and nothing failed on the old server.
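Since the rows end up in a TSV dump anyway, the chunk-and-recombine idea could be a post-processing step. This is only a sketch of one possible column naming convention: the names F4_B1 and F4_B2 and the two-chunk split are invented for illustration, on the assumption that the query selects something like SUBSTRB(A.field4, 1, 2000) and SUBSTRB(A.field4, 2001, 2000) under those aliases.

```shell
#!/bin/sh
# Hypothetical post-processing: the query selects field4 as two
# byte-limited chunks (columns F4_B1 and F4_B2 -- invented names),
# and this awk pass glues them back into one F4 column in the TSV.
awk -F'\t' 'BEGIN { OFS = "\t" }
NR == 1 {
    # locate the chunk columns by name in the header row
    for (i = 1; i <= NF; i++) col[$i] = i
    b1 = col["F4_B1"]; b2 = col["F4_B2"]
    print $1, "F4"          # simplified header: key column plus F4
    next
}
{ print $1, $b1 $b2 }       # concatenate the two chunks
' "$@"
```

A real version would carry the other columns through as well; the point is just that recombining on the client side keeps each fetched value under the byte limit that seems to trigger the errors.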
Too much output to send it all to the list - if anyone wants something specific I can send it. My money is still on a wild pointer or similar. I looked through the DBI log and found the following field rc error codes (sorted):

field #3 with rc=12851(UNKNOWN RC=12851))
field #3 with rc=12854(UNKNOWN RC=12854))
field #3 with rc=20041(UNKNOWN RC=20041))
field #3 with rc=25934(UNKNOWN RC=25934))
field #3 with rc=26962(UNKNOWN RC=26962))
field #3 with rc=48(UNKNOWN RC=48))
field #3 with rc=83(UNKNOWN RC=83))
field #4 with rc=1280(UNKNOWN RC=1280))
field #4 with rc=12870(UNKNOWN RC=12870))
field #4 with rc=14128(UNKNOWN RC=14128))
field #4 with rc=17230(UNKNOWN RC=17230))
field #4 with rc=18688(UNKNOWN RC=18688))
field #4 with rc=24919(UNKNOWN RC=24919))
field #4 with rc=25196(UNKNOWN RC=25196))
field #4 with rc=25926(UNKNOWN RC=25926))
field #4 with rc=26691(UNKNOWN RC=26691))

I am not really sure which goes with what case, but I don't think the codes are real Oracle error codes - they look like junk data, a symptom of a problem somewhere else. I could modify the tests to clear out the log and capture the "UNKNOWN RC" codes for each, but I would rather do more digging on the original case I reported to this list.

Brian Fennell
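Extracting the sorted "UNKNOWN RC" list above can be done in one pipeline. A sketch, with the trace file path as a stand-in for the real DBI trace output:

```shell
#!/bin/sh
# Pull the bogus-looking rc codes out of a DBI trace and
# de-duplicate/sort them. LOG is a stand-in path.
LOG=${1:-dbi-trace.txt}
egrep -o 'field #[0-9]+ with rc=[0-9]+\(UNKNOWN RC=[0-9]+\)\)' "$LOG" \
  | sort -u
```

Clearing the trace file between test cases (as suggested above) and diffing the resulting lists per case would show whether the junk codes vary with the data or repeat.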
Re: Hunting down (possible) memory leak in DBD::Oracle
On 2017-12-19 20:55:30 +0000, Fennell, Brian wrote:
> And, also with the log level set to 15 here are the LAST 200 lines
[...]
> -> fetchrow_array for DBD::Oracle::st (AAA::DBI::Connection::st=HASH(0x3abd310)~0x39f75f0) thr#24d4010
> dbd_st_fetch 6 fields...
> dbd_st_fetched 6 fields with status of 0(SUCCESS)
> field #1 with rc=0(OK)
> 3abd340 (field=0): '1105427'
> field #2 with rc=0(OK)
> 3abd340 (field=1): '1268254'
> field #3 with rc=0(OK)
> 3abd340 (field=2): 'sampl...'
> field #4 with rc=25196(UNKNOWN RC=25196))
> OCIErrorGet(3b535c8,1,"",7ffd7b72b7cc,"ORA-01403: no data found",1024,2)=SUCCESS
> OCIErrorGet after ORA-25196 error on field 4 of 6, ora_type 2 (er1:ok): -1, 1403: ORA-01403: no data found
> OCIErrorGet(3b535c8,2,"",7ffd7b72b7cc,"ORA-01403: no data found",1024,2)=NO_DATA
> -- HandleSetErr err=1403, errstr='ORA-01403: no data found (DBD ERROR: ORA-25196 error on field 4 of 6, ora_type 2)', state=undef, undef
> 3abd340 (field=3): undef

Can you check the actual contents of any rows in tableA with field1=1268254 and field2='sampl...'? The error occurs in field3 which is numeric, but it's adjacent to field4 which is VARCHAR2(4000). I wonder if an overflow is possible if field4 is actually close to 4000 characters and it is expanded further by character encoding.

Which raises the next question: What are the database and client encodings set to?

        hp

--
Peter J. Holzer | WSR - Softwaredevelopment und Projektunterstützung | h...@wsr.ac.at
"I think we need two definitions:
 1) The problem the *users* want us to solve
 2) The problem our solution addresses."
        -- Phillip Hallam-Baker on spam
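The encoding expansion Peter raises is easy to demonstrate outside the database: the same text can need more bytes than it has characters once a multi-byte encoding is involved, so a value sized against a 4000-character limit can blow past a 4000-byte buffer. A minimal illustration (the sample string is invented, and this only shows UTF-8 expansion, not Oracle's conversion path):

```shell
#!/bin/sh
# Byte length vs character count under UTF-8: a character outside
# ASCII occupies more than one byte, so "N characters" can mean
# "more than N bytes" after character-set conversion.
s='héllo'                 # 5 characters
printf '%s' "$s" | wc -c  # 6 bytes: 'é' is a 2-byte UTF-8 sequence
```

With NLS_LANG set to WE8ISO8859P1 on the client (as in the valgrind run posted earlier) and NCHAR data in AL16UTF16, the client library has to convert, which is exactly where a characters-vs-bytes sizing mismatch could bite.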
RE: Hunting down (possible) memory leak in DBD::Oracle
And, also with the log level set to 15, here are the LAST 200 lines:

3abd340 (field=0): '1127646'
field #2 with rc=0(OK)
3abd340 (field=1): '1268251'
field #3 with rc=0(OK)
3abd340 (field=2): 'a...'
field #4 with rc=0(OK)
3abd340 (field=3): '1'
field #5 with rc=1405(NULL)
3abd340 (field=4): undef
field #6 with rc=0(OK)
3abd340 (field=5): 'en_US'
<- fetchrow_array= ( '1127646' '1268251' 'aa' '1' undef 'en_US' ) [6 items] row858 at /dirname/scriptname.pl line 196
-> fetchrow_array for DBD::Oracle::st (AAA::DBI::Connection::st=HASH(0x3abd310)~0x39f75f0) thr#24d4010
dbd_st_fetch 6 fields...
dbd_st_fetched 6 fields with status of 0(SUCCESS)
field #1 with rc=0(OK)
3abd340 (field=0): '1093644'
field #2 with rc=0(OK)
3abd340 (field=1): '1268251'
field #3 with rc=0(OK)
3abd340 (field=2): 'sampl...'
field #4 with rc=0(OK)
3abd340 (field=3): '1'
field #5 with rc=1405(NULL)
3abd340 (field=4): undef
field #6 with rc=0(OK)
3abd340 (field=5): 'en_US'
<- fetchrow_array= ( '1093644' '1268251' 'sampledata' '1' undef 'en_US' ) [6 items] row859 at /dirname/scriptname.pl line 196
-> fetchrow_array for DBD::Oracle::st (AAA::DBI::Connection::st=HASH(0x3abd310)~0x39f75f0) thr#24d4010
dbd_st_fetch 6 fields...
dbd_st_fetched 6 fields with status of 0(SUCCESS)
field #1 with rc=0(OK)
3abd340 (field=0): '1093997'
field #2 with rc=0(OK)
3abd340 (field=1): '1268252'
field #3 with rc=0(OK)
3abd340 (field=2): 'sampl...'
field #4 with rc=0(OK)
3abd340 (field=3): '1'
field #5 with rc=1405(NULL)
3abd340 (field=4): undef
field #6 with rc=0(OK)
3abd340 (field=5): 'en_US'
<- fetchrow_array= ( '1093997' '1268252' 'sampledata' '1' undef 'en_US' ) [6 items] row860 at /dirname/scriptname.pl line 196
-> fetchrow_array for DBD::Oracle::st (AAA::DBI::Connection::st=HASH(0x3abd310)~0x39f75f0) thr#24d4010
dbd_st_fetch 6 fields...
dbd_st_fetched 6 fields with status of 0(SUCCESS)
field #1 with rc=0(OK)
3abd340 (field=0): '1093904'
field #2 with rc=0(OK)
3abd340 (field=1): '1268252'
field #3 with rc=0(OK)
3abd340 (field=2): 'sampl...'
field #4 with rc=0(OK)
3abd340 (field=3): '1'
field #5 with rc=1405(NULL)
3abd340 (field=4): undef
field #6 with rc=0(OK)
3abd340 (field=5): 'en_US'
<- fetchrow_array= ( '1093904' '1268252' 'samplecc' '1' undef 'en_US' ) [6 items] row861 at /dirname/scriptname.pl line 196
-> fetchrow_array for DBD::Oracle::st (AAA::DBI::Connection::st=HASH(0x3abd310)~0x39f75f0) thr#24d4010
dbd_st_fetch 6 fields...
dbd_st_fetched 6 fields with status of 0(SUCCESS)
field #1 with rc=0(OK)
3abd340 (field=0): '1127647'
field #2 with rc=0(OK)
3abd340 (field=1): '1268253'
field #3 with rc=0(OK)
3abd340 (field=2): 'd...'
field #4 with rc=0(OK)
3abd340 (field=3): '0'
field #5 with rc=1405(NULL)
3abd340 (field=4): undef
field #6 with rc=0(OK)
3abd340 (field=5): 'en_US'
<- fetchrow_array= ( '1127647' '1268253' 'd' '0' undef 'en_US' ) [6 items] row862 at /dirname/scriptname.pl line 196
-> fetchrow_array for DBD::Oracle::st (AAA::DBI::Connection::st=HASH(0x3abd310)~0x39f75f0) thr#24d4010
dbd_st_fetch 6 fields...
dbd_st_fetched 6 fields with status of 0(SUCCESS)
field #1 with rc=0(OK)
3abd340 (field=0): '1127648'
field #2 with rc=0(OK)
3abd340 (field=1): '1268253'
field #3 with rc=0(OK)
3abd340 (field=2): 'a...'
field #4 with rc=0(OK)
3abd340 (field=3): '0'
field #5 with rc=1405(NULL)
3abd340 (field=4): undef
field #6 with rc=0(OK)
3abd340 (field=5): 'en_US'
<- fetchrow_array= ( '1127648' '1268253' 'aa' '0' undef 'en_US' ) [6 items] row863 at /dirname/scriptname.pl line 196
-> fetchrow_array for DBD::Oracle::st (AAA::DBI::Connection::st=HASH(0x3abd310)~0x39f75f0) thr#24d4010
dbd_st_fetch 6 fields...
dbd_st_fetched 6 fields with status of 0(SUCCESS)
field #1 with rc=0(OK)
3abd340 (field=0): '1105426'
field #2 with rc=0(OK)
3abd340 (field=1): '1268253'
field #3 with rc=0(OK)
3abd340 (field=2): 'samp
RE: Hunting down (possible) memory leak in DBD::Oracle
With the log level set to 15, here are the first 200 lines of log:

AAA::DBI::Connection::db=HASH(0x3abce00) trace level set to 0x0/15 (DBI @ 0x0/0) in DBI 1.637-ithread (pid 12594)
-> STORE for DBD::Oracle::db (AAA::DBI::Connection::db=HASH(0x3abce00)~INNER 'RowCacheSize' 2097152) thr#24d4010
<- STORE= ( 1 ) [1 items] at /dirname/scriptname.pl line 78
-> prepare for DBD::Oracle::db (AAA::DBI::Connection::db=HASH(0x3abcef0)~0x3abce00 'SELECT [ yada yada yada ]') thr#24d4010
New 'AAA::DBI::Connection::st' (for DBD::Oracle::st, parent=AAA::DBI::Connection::db=HASH(0x3abce00), id=undef)
dbih_setup_handle(AAA::DBI::Connection::st=HASH(0x3abd310)=>AAA::DBI::Connection::st=HASH(0x39f75f0), DBD::Oracle::st, 268ae18, Null!)
dbih_make_com(AAA::DBI::Connection::db=HASH(0x3abce00), 3abdfc0, DBD::Oracle::st, 464, 0) thr#24d4010
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), Err, AAA::DBI::Connection::db=HASH(0x3abce00)) SCALAR(0x268acc8) (already defined)
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), State, AAA::DBI::Connection::db=HASH(0x3abce00)) SCALAR(0x268ad88) (already defined)
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), Errstr, AAA::DBI::Connection::db=HASH(0x3abce00)) SCALAR(0x268ad28) (already defined)
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), TraceLevel, AAA::DBI::Connection::db=HASH(0x3abce00)) 15 (already defined)
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), FetchHashKeyName, AAA::DBI::Connection::db=HASH(0x3abce00)) 'NAME' (already defined)
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), HandleSetErr, AAA::DBI::Connection::db=HASH(0x3abce00)) undef (not defined)
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), HandleError, AAA::DBI::Connection::db=HASH(0x3abce00)) undef (not defined)
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), ReadOnly, AAA::DBI::Connection::db=HASH(0x3abce00)) undef (not defined)
dbih_setup_attrib(AAA::DBI::Connection::st=HASH(0x39f75f0), Profile, AAA::DBI::Connection::db=HASH(0x3abce00)) undef (not defined)
OCIHandleAlloc(3b2c0f0,3be0cc8,OCI_HTYPE_STMT,0,0)=SUCCESS
OCIStmtPrepare(3ba3190,3b535c8,'SELECT [ yada yada yada ]',513,1,0)=SUCCESS
OCIAttrGet(3ba3190,OCI_HTYPE_STMT,3be0cdc,0,OCI_ATTR_STMT_TYPE,3b535c8)=SUCCESS
dbd_st_prepare'd sql SELECT ( auto_lob1, check_sql1)
dbd_describe SELECT (EXPLICIT, lb 80)...
OCIStmtExecute(3b5b028,3ba3190,3b535c8,0,0,0,0,mode=DESCRIBE_ONLY,16)=SUCCESS
OCIAttrGet(3ba3190,OCI_HTYPE_STMT,7ffd7b72baa4,0,OCI_ATTR_PARAM_COUNT,3b535c8)=SUCCESS
OCIParamGet(3ba3190,4,3b535c8,3be10e0,1,OCI_HTYPE_STMT)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be1110,0,OCI_ATTR_DATA_TYPE,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be1112,0,OCI_ATTR_DATA_SIZE,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be1128,0,OCI_ATTR_CHAR_USED,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be112a,0,OCI_ATTR_CHAR_SIZE,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be112c,0,OCI_ATTR_CHARSET_ID,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be112e,0,OCI_ATTR_CHARSET_FORM,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be1114,0,OCI_ATTR_PRECISION,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be1116,0,OCI_ATTR_SCALE,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be1117,0,OCI_ATTR_IS_NULL,3b535c8)=SUCCESS
OCIAttrGet(3ba2e90,OCI_DTYPE_PARAM,3be1118,7ffd7b72baac,OCI_ATTR_NAME,3b535c8)=SUCCESS
Describe col #1 type=2(NVARCHAR2)
Described col 1: dbtype 2(NVARCHAR2), scale -127, prec 0, nullok 1, name ROW_NUMBER
         : dbsize 22, char_used 0, char_size 0, csid 0, csform 0(0), disize 171
fbh 1: 'ROW_NUMBER' NULLable, otype 2-> 5, dbsize 22/172, p0.s-127
OCIParamGet(3ba3190,4,3b535c8,3be1188,2,OCI_HTYPE_STMT)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11b8,0,OCI_ATTR_DATA_TYPE,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11ba,0,OCI_ATTR_DATA_SIZE,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11d0,0,OCI_ATTR_CHAR_USED,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11d2,0,OCI_ATTR_CHAR_SIZE,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11d4,0,OCI_ATTR_CHARSET_ID,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11d6,0,OCI_ATTR_CHARSET_FORM,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11bc,0,OCI_ATTR_PRECISION,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11be,0,OCI_ATTR_SCALE,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11bf,0,OCI_ATTR_IS_NULL,3b535c8)=SUCCESS
OCIAttrGet(3ba07a0,OCI_DTYPE_PARAM,3be11c0,7ffd7b72baac,OCI_ATTR_NAME,3b535c8)=SUCCESS
Describe col #2 type=2(NVARCHAR2)
Described col 2: dbtype 2(NV
RE: Hunting down (possible) memory leak in DBD::Oracle
John,

Thanks for the ideas to change the cache params - I will try that!

Here is the SQL and the field types:

SELECT d.ROW_NUMBER, d.f1, d.f2, d.f3, d.f4, d.f5
FROM (
    SELECT /*+ FULL(A) PARALLEL(A 6) */
           rownum ROW_NUMBER,
           A.field1 f1, A.field2 f2, A.field3 f3, A.field4 f4, B.field5 f5
    FROM tableA A, tableB B
    WHERE B.field6 IN ( 'TOK3', 'TOK4', 'TOK5' )
      AND B.field7 LIKE 'A%'
      AND B.field8 IN ( 'TOK1', 'TOK2' )
      AND B.fkfield1 = A.field1
    ORDER BY 2, 3, 4, 5, 6
) d
WHERE d.row_number < 202 AND d.row_number >= 100

Field types:

A.field1   NUMBER(12)
A.field2   VARCHAR2(20)
A.field3   NUMBER(15,3)
A.field4   VARCHAR2(4000)
B.field5   VARCHAR2(5)
B.field6   VARCHAR2(20)
B.field7   VARCHAR2(8)
B.field8   VARCHAR2(8)
B.fkfield1 NUMBER(12)
Re: Hunting down (possible) memory leak in DBD::Oracle
Well, it will be in one of two .c files, dbdimp.c or oci8.c. The XS side of things (Oracle.xs) is not used very much. The level 15 debug will get deep inside the C to see where it is happening.

The trace you gave is a little high level. You are right, ORA-01403 does not make much sense here. It could be running out of buffer. Give some of the caching params a tweak:

https://metacpan.org/pod/DBD::Oracle#RowCacheSize

If you can, give fetchrow_hashref a try and see if the error happens there as well.

Cheers
John

From: Fennell, Brian
Sent: December 18, 2017 11:25 AM
To: John Scoles; dbi-users@perl.org
Subject: RE: Hunting down (possible) memory leak in DBD::Oracle

John,

Thanks so much for your reply! I have put off this work for a few years and now the pressure is on - the original box and OS are so old that the DBA and System Engineer and the Operations manager have all ganged up on me. I suppose I could try to work around it by downgrading both the perl and the DBD::Oracle to the same versions we use in production, but it would be nice to actually fix the bug if I can. I tried just downgrading DBD::Oracle, but changes in perl 5 to support MULTIPLICITY made that look like more than just a little work - I spent two days on it and then backed off.

I am a polyglot programmer so I can program in C and Perl (and about a dozen other languages). I have done enough time with C that it doesn't scare me. Valgrind is new to me, but make and gcc and ld are not.
I have started to read the Valgrind docs and it seems to make sense - it basically emulates all the CPU instructions with injected instrumentation - I assume it works for Intel and Red Hat if it works at all (and it seems to have a long history and a good open source support community). Perhaps I am fooling myself, but I figure it is worth a try. I have negotiated support from both DBA and System Engineering (the Red Hat OS guys), so if I am going to fix this, now is the time. The only other option I can think of is to try to get the old code working with the DBD::JDBC driver (which would mean adding a JVM running in parallel and additional overhead - so I would rather not).

1) The error changes depending on the data - which is why I think it is a buffer overrun or a wild pointer - but it is always in "field N of N". Currently I can reproduce with ORA-01403.

2) I will re-try at level 15 and post the results - currently at 4 (or perhaps 5). Here is a section from the log (which suggests to me it is happening in the C code and not in the Perl):

-> fetchrow_array for DBD::Oracle::st (GSI::DBI::Connection::st=HASH(0x29353c8)~0x286f6b0) thr#134c010
dbd_st_fetch 6 fields...
dbd_st_fetched 6 fields with status of 0(SUCCESS)
field #1 with rc=0(OK)
field #2 with rc=0(OK)
field #3 with rc=0(OK)
field #4 with rc=1405(NULL)
field #5 with rc=0(OK)
field #6 with rc=0(OK)
-> fetchrow_array for DBD::Oracle::st (GSI::DBI::Connection::st=HASH(0x29353c8)~0x286f6b0) thr#134c010
dbd_st_fetch 6 fields...
dbd_st_fetched 6 fields with status of 0(SUCCESS)
field #1 with rc=0(OK)
field #2 with rc=0(OK)
field #3 with rc=0(OK)
field #4 with rc=14135(UNKNOWN RC=14135))
OCIErrorGet after ORA-14135 error on field 4 of 6, ora_type 2 (er1:ok): -1, 1403: ORA-01403: no data found
-- HandleSetErr err=1403, errstr='ORA-01403: no data found (DBD ERROR: ORA-14135 error on field 4 of 6, ora_type 2)', state=undef, undef
field #5 with rc=0(OK)
field #6 with rc=0(OK)
1 -> FETCH for DBD::Oracle::st (GSI::DBI::Connection::st=HASH(0x286f6b0)~INNER 'ParamValues') thr#134c010

3) I think the most exotic thing in these tables is a VARCHAR2, but I will check and post the results.

4) It looks like it is in the XS to me (see answer to 2) - but I suppose it could be elsewhere - like a loopback-perl-ref that should be weak but is not.

5) I think I have what I need. DBA installed the Oracle 12 OCI client and "dot.so" libraries, but currently I am concerned that I am using "ins_rdbms.mk" when I should be using "demo.mk" or similar - I am getting a warning (see details below) when I run Makefile.PL. I asked DBA to look into installing the "demo.mk" file and consider opening an Oracle METALINK support ticket to see if another customer had already solved this with Oracle's help.

Details:

# /usr/local/bin/perl Makefile.PL -g
Using DBI 1.637 (for perl 5.016003 on x86_64-linux-thread-multi) installed in /usr/local/lib64/perl5/auto/DB
RE: Hunting down (possible) memory leak in DBD::Oracle
Pluta,

Looks like it is worth a try - when I looked at the project before, it looked like it was for installing a "per user" perl. Does it work for root / all users on a box as well?

Brian
RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **
Good question - I have asked DBA and am waiting for a reply.

-----Original Message-----
From: Howard, Chris [mailto:howa...@prpa.org]
Sent: Monday, December 18, 2017 11:36 AM
To: Fennell, Brian ; dbi-users@perl.org
Subject: RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **

Fennell,

Are you seeing anything on the database side, in the alert log, etc.?

Howard

-----Original Message-----
From: Fennell, Brian [mailto:fenne...@radial.com]
Sent: Monday, December 18, 2017 8:30 AM
To: Howard, Chris ; dbi-users@perl.org
Subject: RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **

Thanks for the reply, Howard,

I am using the exact same database - a test database that has copies of the production data put into it once a day - besides that it doesn't change much (if at all) during the day.

The SQL I am using doesn't leave out a "bad guy" - I thought of that, and actually had an off-by-one gap in some of my early tests. Closed that hole by changing a ">" to a ">=". Here is the SQL (with the original table and field names changed to allow for sharing):

SELECT d.ROW_NUMBER, d.f1, d.f2, d.f3, d.f4, d.f5
FROM (
    SELECT /*+ FULL(A) PARALLEL(A 6) */
           rownum ROW_NUMBER,
           A.field1 f1, A.field2 f2, A.field3 f3, A.field4 f4, B.field5 f5
    FROM tableA A, tableB B
    WHERE B.field6 IN ( 'TOK3', 'TOK4', 'TOK5' )
      AND B.field7 LIKE 'A%'
      AND B.field8 IN ( 'TOK1', 'TOK2' )
      AND B.fkfield1 = A.field1
    ORDER BY 1, 2, 3, 4, 5
) d
WHERE d.row_number < 202 AND d.row_number >= 100

-----Original Message-----
From: Howard, Chris [mailto:howa...@prpa.org]
Sent: Monday, December 18, 2017 9:21 AM
To: Fennell, Brian ; dbi-users@perl.org
Subject: RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **

Same database... do you mean the actual very same data source? What is the Oracle error?

To eliminate problems based on data (implicit conversions, that kind of thing) can you do a run from row 500,000 to 1,500,000?
(I think you have this covered, but maybe that row right at the breaking spot is somehow a bad guy.)

-----Original Message-----
From: Fennell, Brian [mailto:fenne...@radial.com]
Sent: Saturday, December 16, 2017 3:19 PM
To: dbi-users@perl.org
Subject: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **

Dear DBI people -

I am trying to port some old perl code to a new box (see Details below). Needless to say the original box and code work fine, but the new box (and old code) does not. Specifically, what I am seeing is that when I select slightly over a million records from a specific join of two tables (to be dumped one row at a time into a TSV file) we get strange ORA-N errors that don't really make any sense in this context. The same database and same tables work fine on the original box with the large number of records. 2 million records always causes errors, but two groups of 1 million (divided up by ROWNUM - the EXACT same rows) causes no errors. I am using a test database with little activity, so I am reasonably certain that the queries deal with the same rows. So I am thinking the problem is data volume and not any specific piece of data (originally I thought it might be an odd string/data related error, but I am starting to think it is a memory leak of some kind).

The error always happens inside of fetchrow_array - and "$dbh->trace( 4 , $filename )" shows that the error originates inside the DBD::Oracle module while reading field 3 of 6. Researching the ORA-NN error gives a perfectly sane description that makes no sense at all in the context of reading a specific field.

We are going through an audit and tightening up security, so there are some things (like REAL hostnames and REAL column/table names) that I cannot share - but I will try to share as much as I can. The host I am calling "prod" below is the only one NOT exhibiting this issue.
Things I want to try -

1) recompile the DBD::Oracle module on host "sandbox" with "perl Makefile.PL -g" and then use Valgrind. I haven't used Valgrind before, but I guess it is time to learn.
2) Anything else this list suggests.

Details:

Host: prod
OS: Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
Perl: 5.8.8 built for x86_64-linux
DBI: 1.53
DBD::Oracle: 1.19
Oracle: 10.2.0.1.0

Host: dev
OS: Red Hat Enterprise Linux Server release 6.5 (Santiago)
Perl: 5.16.1 built for x86_64-linux
DBI: 1.631
DBD::Oracle: 1.74
Oracle: 11.2.0.3.0

Host: prodnew
OS: Red Hat Enterprise Linux Server release 6.5 (Santiago)
Perl: 5.16.1 built
RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **
Ok . . . I am already benefiting from the support from this list :-)

I noticed that I was actually ordering my query BY rownum (which doesn't make much sense . . . and perhaps Oracle's optimizer recognized this and ignored the pseudo-column). Just to be sure, I changed the query and then reran my tests - the results, however, did not change.

New SQL:

SELECT d.ROW_NUMBER, d.f1, d.f2, d.f3, d.f4, d.f5
FROM (
    SELECT /*+ FULL(A) PARALLEL(A 6) */
           rownum ROW_NUMBER,
           A.field1 f1, A.field2 f2, A.field3 f3, A.field4 f4, B.field5 f5
    FROM tableA A, tableB B
    WHERE B.field6 IN ( 'TOK3', 'TOK4', 'TOK5' )
      AND B.field7 LIKE 'A%'
      AND B.field8 IN ( 'TOK1', 'TOK2' )
      AND B.fkfield1 = A.field1
    ORDER BY 2, 3, 4, 5, 6
) d
WHERE d.row_number < 202 AND d.row_number >= 100

Here is a summary of the test results (I have a test harness which uses ssh to run the exact same tests on more than one host).

INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 200 0 ) )   OUTPUT: ( ORACLE_ERROR_COUNT = 2 )
INPUT: ( ( HOSTNAME = prod )    ( RANGE = 200 0 ) )   OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 100 0 ) )   OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
INPUT: ( ( HOSTNAME = prod )    ( RANGE = 100 0 ) )   OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 200 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
INPUT: ( ( HOSTNAME = prod )    ( RANGE = 200 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 202 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
INPUT: ( ( HOSTNAME = prod )    ( RANGE = 202 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 203 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 2 )
INPUT: ( ( HOSTNAME = prod )    ( RANGE = 203 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
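For anyone curious, an ssh harness of the kind described above can be as small as the sketch below (run_range_test.pl and its --hi/--lo flags are hypothetical stand-ins for the actual test script; the hostnames are the redacted ones from this thread):

```shell
# Run the same bounded query on each host over ssh and count
# ORA- errors in the combined output.
for host in sandbox prod; do
    count=$(ssh "$host" "perl run_range_test.pl --hi 203 --lo 100" 2>&1 \
            | grep -c 'ORA-')
    echo "INPUT: ( ( HOSTNAME = $host ) ( RANGE = 203 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = $count )"
done
```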
RE: Hunting down (possible) memory leak in DBD::Oracle
John,

Thanks so much for your reply! I have put off this work for a few years and now the pressure is on - the original box and OS are so old that the DBA and System Engineer and the Operations Manager have all ganged up on me. I suppose I could try to work around it by downgrading both perl and DBD::Oracle to the same versions we use in production, but it would be nice to actually fix the bug if I can. I tried just downgrading DBD::Oracle, but changes in perl 5 to support MULTIPLICITY made that look like more than just a little work - I spent two days on it and then backed off.

I am a polyglot programmer, so I can program in C and Perl (and about a dozen other languages). I have done enough time with C that it doesn't scare me. Valgrind is new to me, but make and gcc and ld are not. I have started to read the Valgrind docs and it seems to make sense - it basically emulates all the CPU instructions with injected instrumentation. I assume it works for Intel and Red Hat if it works at all (and it seems to have a long history and a good open-source support community). Perhaps I am fooling myself, but I figure it is worth a try. I have negotiated support from both the DBA and System Engineering (the Red Hat OS guys), so if I am going to fix this, now is the time. The only other option I can think of is to try to get the old code working with the DBD::JDBC driver (which would mean adding a JVM running in parallel and additional overhead - so I would rather not).

1) The error changes depending on the data - which is why I think it is a buffer overrun or a wild pointer - but it is always in "field N of N". Currently I can reproduce with ORA-01403.

2) I will re-try at level 15 and post the results - currently at 4 (or perhaps 5). Here is a section from the log (which suggests to me it is happening in the C code and not in the Perl):

    -> fetchrow_array for DBD::Oracle::st (GSI::DBI::Connection::st=HASH(0x29353c8)~0x286f6b0) thr#134c010
    dbd_st_fetch 6 fields...
    dbd_st_fetched 6 fields with status of 0(SUCCESS)
    field #1 with rc=0(OK)
    field #2 with rc=0(OK)
    field #3 with rc=0(OK)
    field #4 with rc=1405(NULL)
    field #5 with rc=0(OK)
    field #6 with rc=0(OK)
    -> fetchrow_array for DBD::Oracle::st (GSI::DBI::Connection::st=HASH(0x29353c8)~0x286f6b0) thr#134c010
    dbd_st_fetch 6 fields...
    dbd_st_fetched 6 fields with status of 0(SUCCESS)
    field #1 with rc=0(OK)
    field #2 with rc=0(OK)
    field #3 with rc=0(OK)
    field #4 with rc=14135(UNKNOWN RC=14135))
    OCIErrorGet after ORA-14135 error on field 4 of 6, ora_type 2 (er1:ok): -1, 1403: ORA-01403: no data found
    -- HandleSetErr err=1403, errstr='ORA-01403: no data found (DBD ERROR: ORA-14135 error on field 4 of 6, ora_type 2)', state=undef, undef
    field #5 with rc=0(OK)
    field #6 with rc=0(OK)
    1 -> FETCH for DBD::Oracle::st (GSI::DBI::Connection::st=HASH(0x286f6b0)~INNER 'ParamValues') thr#134c010

3) I think the most exotic thing in these tables is a VARCHAR2, but I will check and post the results.

4) It looks like it is in the XS to me (see the answer to 2) - but I suppose it could be elsewhere - like a loopback perl ref that should be weak but is not.

5) I think I have what I need; the DBA installed the Oracle 12 OCI client and ".so" libraries, but currently I am concerned that I am using "ins_rdbms.mk" when I should be using "demo.mk" or similar - I am getting a warning (see details below) when I run Makefile.PL. I asked the DBA to look into installing the "demo.mk" file and to consider opening an Oracle METALINK support ticket to see if another customer has already solved this with Oracle's help.

Details:

# /usr/local/bin/perl Makefile.PL -g
Using DBI 1.637 (for perl 5.016003 on x86_64-linux-thread-multi) installed in /usr/local/lib64/perl5/auto/DBI/
Configuring DBD::Oracle for perl 5.016003 on linux (x86_64-linux-thread-multi)

Remember to actually *READ* the README file!
Especially if you have any problems.
Installing on a linux, Ver#3.10
Using Oracle in /db/app/oracle/product/12.1.0/client_1
DEFINE _SQLPLUS_RELEASE = "1201000200" (CHAR)
Oracle version 12.1.0.2 (12.1)
Found /db/app/oracle/product/12.1.0/client_1/rdbms/lib/ins_rdbms.mk
Using /db/app/oracle/product/12.1.0/client_1/rdbms/lib/ins_rdbms.mk
Your LD_LIBRARY_PATH env var is set to '/db/app/oracle/product/12.1.0/client_1/lib:/db/app/oracle/product/12.1.0/client_1'
Reading /db/app/oracle/product/12.1.0/client_1/rdbms/lib/ins_rdbms.mk
Reading /db/app/oracle/product/12.1.0/client_1/rdbms/lib/env_rdbms.mk

WARNING: Oracle /db/app/oracle/product/12.1.0/client_1/rdbms/lib/ins_rdbms.mk doesn't define a 'build' rule.

WARNING: I will now try to guess how to build and link DBD::Oracle for you.
         This kind of guess work is very error prone and Oracle-version sensitive.
         It is possible that it won't be supported in future versions of DBD::Oracle.
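For the level-15 re-run mentioned in (2), DBI's trace level can also be raised from the shell via the documented DBI_TRACE environment variable, so the script itself doesn't need editing (the script name here is a placeholder):

```shell
# "level=filename" form of DBI_TRACE: trace at level 15 into
# dbitrace.log instead of cluttering stderr.
DBI_TRACE=15=dbitrace.log perl dump_to_tsv.pl
```

Equivalently, inside the script: $dbh->trace( 15, 'dbitrace.log' );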
RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **
Thanks for the reply, Howard.

I am using the exact same database - a test database that has copies of the production data put into it once a day - besides that it doesn't change much (if at all) during the day.

The SQL I am using doesn't leave out a "bad guy" - I thought of that, and actually had an off-by-one gap in some of my early tests. I closed that hole by changing a ">" to a ">=". Here is the SQL (with the original table and field names changed to allow for sharing):

SELECT d.ROW_NUMBER, d.f1, d.f2, d.f3, d.f4, d.f5
FROM (
    SELECT /*+ FULL(A) PARALLEL(A 6) */
           rownum ROW_NUMBER,
           A.field1 f1, A.field2 f2, A.field3 f3, A.field4 f4, B.field5 f5
    FROM tableA A, tableB B
    WHERE B.field6 IN ( 'TOK3', 'TOK4', 'TOK5' )
      AND B.field7 LIKE 'A%'
      AND B.field8 IN ( 'TOK1', 'TOK2' )
      AND B.fkfield1 = A.field1
    ORDER BY 1, 2, 3, 4, 5
) d
WHERE d.row_number < 202 AND d.row_number >= 100

-----Original Message-----
From: Howard, Chris [mailto:howa...@prpa.org]
Sent: Monday, December 18, 2017 9:21 AM
To: Fennell, Brian ; dbi-users@perl.org
Subject: RE: Hunting down (possible) memory leak in DBD::Oracle

** EXTERNAL **

Same database... do you mean the actual very same data source?

What is the Oracle error?

To eliminate problems based on data (implicit conversions, that kind of thing) can you do a run from row 500,000 to 1,500,000?

(I think you have this covered, but maybe that row right at the breaking spot is somehow a bad guy.)

-----Original Message-----
From: Fennell, Brian [mailto:fenne...@radial.com]
Sent: Saturday, December 16, 2017 3:19 PM
To: dbi-users@perl.org
Subject: Hunting down (possible) memory leak in DBD::Oracle
--
Brian Fennell, Software Engineer | Radial
O: 610 491 7308 | M: 484 354 1699
fenne...@radial.com

The information contained in this electronic mail transmission is intended only for the use of the individual or entity named in this transmission. If you are not the intended recipient of this transmission, you are hereby notified that any disclosure, copying or distribution of the contents of this transmission is strictly prohibited and that you should delete the contents of this transmission from your system immediately. Any comments or statements contained in this transmission do not necessarily reflect the views or position of Radial or its subsidiaries and/or affiliates.
Re: Hunting down (possible) memory leak in DBD::Oracle
Hmm, this type of DBD::Oracle debugging will be tricky. Could be almost anything. You are jumping versions in a big way, but that still should be ok.

A few questions:

1) What is the ORA-NNNNN in question?
2) Set trace to 15 to see if that gives you more details.
3) What are the types of the fields? LOBs and BLOBs and large VARCHARs can be tricky.
4) Does the error happen in perl or in the XS? (the level 15 trace should show this)
5) To recompile you will need the latest version of the OCI client. Not sure what that is.

Cheers,
John

From: Fennell, Brian
Sent: December 16, 2017 5:19 PM
To: dbi-users@perl.org
Subject: Hunting down (possible) memory leak in DBD::Oracle