RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **
Good question - I have asked DBA and am waiting for a reply. -Original Message- From: Howard, Chris [mailto:howa...@prpa.org] Sent: Monday, December 18, 2017 11:36 AM To: Fennell, Brian ; dbi-users@perl.org Subject: RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL ** Fennell, Are you seeing anything on the database side, in the alert log, etc.? Howard -Original Message- From: Fennell, Brian [mailto:fenne...@radial.com] Sent: Monday, December 18, 2017 8:30 AM To: Howard, Chris ; dbi-users@perl.org Subject: RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL ** Thanks for the reply, Howard, I am using the exact same database - a test database that has copies of the production data put into it once a day - besides that it doesn't change much (if at all) during the day. The SQL I am using doesn't leave out a "bad guy" - I thought of that and actually had an off-by-one gap in some of my early tests. Closed that hole by changing a ">" to a ">=". Here is the SQL (with the original table and field names changed to allow for sharing SELECT d.ROW_NUMBER, d.f1, d.f2, d.f3, d.f4, d.f5 FROM ( SELECT /*+ FULL(A) PARALLEL(A 6) */ rownum ROW_NUMBER, A.field1 f1 , A.field2 f2, A.field3 f3, A.field4 f4, B.field5 f5 FROM tableA A, tableB B WHERE B.field6 IN ( 'TOK3', 'TOK4', 'TOK5' ) AND B.field7 LIKE'A%' AND B.field8 IN ('TOK1', 'TOK2') AND B.fkfield1= A.field1 ORDER BY 1, 2, 3, 4, 5 ) d WHERE d.row_number < 202 AND d.row_number >= 100 -Original Message- From: Howard, Chris [mailto:howa...@prpa.org] Sent: Monday, December 18, 2017 9:21 AM To: Fennell, Brian ; dbi-users@perl.org Subject: RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL ** Same database... do you mean the actual very same data source? What is the Oracle error? To eliminate problems based on data (implicit conversions, that kind of thing) can you do a run from row 500,000 to 1,500,000 ? (I think you have this covered, but maybe that row right at the breaking spot is somehow a bad guy.) -Original Message- From: Fennell, Brian [mailto:fenne...@radial.com] Sent: Saturday, December 16, 2017 3:19 PM To: dbi-users@perl.org Subject: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL ** Dear DBI people - I am trying to port some old perl code to a new box. (see Details below) Needless to say the original box and code works fine, but the new box (and old code) does not. Specifically what I am seeing is that when I select slightly over a million records from a specific join of two tables (to be dumped one row at a time into a TSV file) we get strange ORA-N errors that don't really make any sense in this context. The Same database and same table works fine on the original box with the large number of records. 2 million records always causes errors but two groups of 1million (divided up by ROWNUM - the EXACT same rows) causes no errors. I am using a test database with little activity do I am reasonably certain that the queries deal with the same rows. So I am thinking the problem is data volume and not any specific piece of data (originally I thought it might be an odd string/data related error, but I am starting to think it is a memory leak of some kind). The error always happens inside of fetchrow_array - and "$dbh->trace( 4 , $filename )" shows that the error originates inside the DBD::Oracle module while reading field 3 of 6. Researching the ORA-NN error gives a perfectly sane description that makes no sense at all in the context of reading a specific field. We are going thru an Audit and tightening up security so there are some things (like REAL hostnames and REAL column/table names) that I cannot share - but I will try to share as much as I can. The Host I am calling "prod" below is the only one NOT exhibiting this issue. Things I want to try - 1) recompile the DBD::Oracle module on Host "sandbox" with "perl Makefile.PL -g" and then use Valgrind. I haven't used Valgrind before, but I guess it is time to learn. 2) Anything else this list suggests. Details: Host: prod OS: Red Hat Enterprise Linux AS release 4 (Nahant Update 4) Perl: 5.8.8 built for x86_64-linux DBI: 1.53 DBD::Oracle: 1.19 Oracle: 10.2.0.1.0 Host: dev OS: Red Hat Enterprise Linux Server release 6.5 (Santiago) Perl: 5.16.1 built for x86_64-linux DBI: 1.631 DBD::Oracle: 1.74 Oracle: 11.2.0.3.0 Host: prodnew OS: Red Hat Enterprise Linux Server release 6.5 (Santiago) Perl: 5.16.1 built
RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **
Ok . . . I am already benefiting from the support from this list : - ) I noticed that I was actually ordering my query BY rownum (which doesn't make much sense . . . and perhaps oracle's optimizer recognized this and ignored the pseudo column.) Just to be sure, I change the query and then reran my tests - the results, however, did not change New SQL SELECT d.ROW_NUMBER, d.f1, d.f2, d.f3, d.f4, d.f5 FROM ( SELECT /*+ FULL(A) PARALLEL(A 6) */ rownum ROW_NUMBER, A.field1 f1 , A.field2 f2, A.field3 f3, A.field4 f4, B.field5 f5 FROM tableA A, tableB B WHERE B.field6 IN ( 'TOK3', 'TOK4', 'TOK5' ) AND B.field7 LIKE'A%' AND B.field8 IN ('TOK1', 'TOK2') AND B.fkfield1= A.field1 ORDER BY 2, 3, 4, 5, 6 ) d WHERE d.row_number < 202 AND d.row_number >= 100 Here is a summary of the test results (I have a test harness which uses ssh to run the exact same tests on more than one host). INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 200 0 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 2 ) INPUT: ( ( HOSTNAME = prod) ( RANGE = 200 0 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 ) INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 100 0 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 ) INPUT: ( ( HOSTNAME = prod) ( RANGE = 100 0 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 ) INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 200 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 ) INPUT: ( ( HOSTNAME = prod) ( RANGE = 200 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 ) INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 202 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 ) INPUT: ( ( HOSTNAME = prod) ( RANGE = 202 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 ) INPUT: ( ( HOSTNAME = sandbox ) ( RANGE = 203 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 2 ) INPUT: ( ( HOSTNAME = prod) ( RANGE = 203 100 ) ) OUTPUT: ( ORACLE_ERROR_COUNT = 0 )
RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL **
Thanks for the reply, Howard, I am using the exact same database - a test database that has copies of the production data put into it once a day - besides that it doesn't change much (if at all) during the day. The SQL I am using doesn't leave out a "bad guy" - I thought of that and actually had an off-by-one gap in some of my early tests. Closed that hole by changing a ">" to a ">=". Here is the SQL (with the original table and field names changed to allow for sharing SELECT d.ROW_NUMBER, d.f1, d.f2, d.f3, d.f4, d.f5 FROM ( SELECT /*+ FULL(A) PARALLEL(A 6) */ rownum ROW_NUMBER, A.field1 f1 , A.field2 f2, A.field3 f3, A.field4 f4, B.field5 f5 FROM tableA A, tableB B WHERE B.field6 IN ( 'TOK3', 'TOK4', 'TOK5' ) AND B.field7 LIKE'A%' AND B.field8 IN ('TOK1', 'TOK2') AND B.fkfield1= A.field1 ORDER BY 1, 2, 3, 4, 5 ) d WHERE d.row_number < 202 AND d.row_number >= 100 -Original Message- From: Howard, Chris [mailto:howa...@prpa.org] Sent: Monday, December 18, 2017 9:21 AM To: Fennell, Brian ; dbi-users@perl.org Subject: RE: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL ** Same database... do you mean the actual very same data source? What is the Oracle error? To eliminate problems based on data (implicit conversions, that kind of thing) can you do a run from row 500,000 to 1,500,000 ? (I think you have this covered, but maybe that row right at the breaking spot is somehow a bad guy.) -Original Message- From: Fennell, Brian [mailto:fenne...@radial.com] Sent: Saturday, December 16, 2017 3:19 PM To: dbi-users@perl.org Subject: Hunting down (possible) memory leak in DBD::Oracle ** EXTERNAL ** Dear DBI people - I am trying to port some old perl code to a new box. (see Details below) Needless to say the original box and code works fine, but the new box (and old code) does not. Specifically what I am seeing is that when I select slightly over a million records from a specific join of two tables (to be dumped one row at a time into a TSV file) we get strange ORA-N errors that don't really make any sense in this context. The Same database and same table works fine on the original box with the large number of records. 2 million records always causes errors but two groups of 1million (divided up by ROWNUM - the EXACT same rows) causes no errors. I am using a test database with little activity do I am reasonably certain that the queries deal with the same rows. So I am thinking the problem is data volume and not any specific piece of data (originally I thought it might be an odd string/data related error, but I am starting to think it is a memory leak of some kind). The error always happens inside of fetchrow_array - and "$dbh->trace( 4 , $filename )" shows that the error originates inside the DBD::Oracle module while reading field 3 of 6. Researching the ORA-NN error gives a perfectly sane description that makes no sense at all in the context of reading a specific field. We are going thru an Audit and tightening up security so there are some things (like REAL hostnames and REAL column/table names) that I cannot share - but I will try to share as much as I can. The Host I am calling "prod" below is the only one NOT exhibiting this issue. Things I want to try - 1) recompile the DBD::Oracle module on Host "sandbox" with "perl Makefile.PL -g" and then use Valgrind. I haven't used Valgrind before, but I guess it is time to learn. 2) Anything else this list suggests. Details: Host: prod OS: Red Hat Enterprise Linux AS release 4 (Nahant Update 4) Perl: 5.8.8 built for x86_64-linux DBI: 1.53 DBD::Oracle: 1.19 Oracle: 10.2.0.1.0 Host: dev OS: Red Hat Enterprise Linux Server release 6.5 (Santiago) Perl: 5.16.1 built for x86_64-linux DBI: 1.631 DBD::Oracle: 1.74 Oracle: 11.2.0.3.0 Host: prodnew OS: Red Hat Enterprise Linux Server release 6.5 (Santiago) Perl: 5.16.1 built for x86_64-linux DBI: 1.631 DBD::Oracle: 1.70 Oracle: 11.2.0.1.0 Host: sandbox OS: CentOS Linux release 7.4.1708 (Core) Perl: 5.16.3 built for x86_64-linux-thread-multi DBI: 1.637 DBD::Oracle: 1.74 Oracle: 12.1.0.2.0 -- Brian Fennell, Software Engineer | Radial O: 610 491 7308 | M: 484 354 1699 fenne...@radial.com The information contained in this electronic mail transmission is intended only for the use of the individual or entity named in this transmission. If you are not the intended recipient of this transmission, you are hereby notified that any disclosure, copying or distribution of the contents of this tran