John,

The part of the story that troubles me is this: "it was working fine - nothing changed - then it broke". It is the middle part - something must have changed. If I were in your shoes I would be looking for the "what changed".

I had a piece of Perl / Oracle code break when we upgraded the OS. When the OS changed, the version of Perl 5 changed, and the version of the Oracle module had to change along with it, because there were changes in Perl that made certain "only once" internal variables part of a structure which could be instantiated once per thread. The Oracle C code for the version of the module which had worked relied on there being only one, with the old naming convention. The new version of the Oracle Perl module had a bug related to buffering n-byte character sets (which the OS, Perl, C, the Oracle Client and the Oracle Server all had to agree on in order to keep certain buffers from overflowing and causing a seg-fault core dump).

Finding the buffer overflow bug was like trying to find a needle in a haystack - I didn't have the time to re-ramp up on C buffer-overflow kung-fu and the free and very-not-free tools which are needed to hunt down that kind of bug. The version on CPAN was abandonware, and the few people who wanted to adopt it couldn't get CPAN to break the owner's control - so there were unofficial forks in Git, and rebuilding from them was non-trivial.

I tried rolling back the version of the Oracle Perl module to the one that had been working, but it was not compatible with the new version of Perl 5 which came with the OS. Finally I had to roll back the version of Perl 5, building from source and making a second Perl install on my box - then change all the code which ran Perl scripts from other Perl scripts (with system and qx) to make sure none of the legacy code accidentally ran under the new version of Perl 5.
Then it all worked again - the n-byte character bug was not exposed, the buffer overflow did not occur, the legacy code went back to working, and the OS could still be upgraded (a security concern). But I had to ask "what changed" - then hunt down the root cause. Because "nothing changed" is never the root cause.

Brian Fennell
-----Original Message-----
From: JohnD Blackburn <johnd.blackb...@au.abb.com>
Sent: Tuesday, February 18, 2020 2:57 AM
To: dbi-users@perl.org
Subject: RE: Perl script excessively executing statement

After reviewing the log4perl output from DBIx when running this script I found the following Perl module being executed:

    /usr/local/lib64/perl5/DBD/Oracle.pm

Which contains the following subroutine:

    sub execute_for_fetch {
        my ($sth, $fetch_tuple_sub, $tuple_status) = @_;
        my $row_count   = 0;
        my $err_total   = 0;
        my $tuple_count = "0E0";
        my $tuple_batch_status;
        my $dbh        = $sth->{Database};
        my $batch_size = ($dbh->{'ora_array_chunk_size'} ||= 1000);

        if (defined($tuple_status)) {
            @$tuple_status      = ();
            $tuple_batch_status = [ ];
        }

        my $finished;
        while (1) {
            my @tuple_batch;
            for (my $i = 0; $i < $batch_size; $i++) {
                $finished = $fetch_tuple_sub->();
                push @tuple_batch, [@{$finished || last}];
            }
            last unless @tuple_batch;

            my $err_count = 0;
            my $res = ora_execute_array($sth,
                                        \@tuple_batch,
                                        scalar(@tuple_batch),
                                        $tuple_batch_status,
                                        $err_count);
            if (defined($res)) { # no error
                $row_count += $res;
            }
            else {
                $row_count = undef;
            }
            $err_total   += $err_count;
            $tuple_count += @tuple_batch;
            push @$tuple_status, @$tuple_batch_status
                if defined($tuple_status);
            last if !$finished;
        }

        # error check here
        return $sth->set_err($DBI::stderr,
            "executing $tuple_count generated $err_total errors")
            if $err_total;

        return wantarray
            ? ($tuple_count, defined $row_count ? $row_count : undef)
            : $tuple_count;
    }

Is that "while(1)" loop a potential contender for causing this?

-----Original Message-----
From: JohnD Blackburn <johnd.blackb...@au.abb.com>
Sent: Tuesday, 18 February 2020 8:51 AM
To: Steven Lembark <lemb...@wrkhors.com>; dbi-users@perl.org
Subject: RE: Perl script excessively executing statement

The problem is not about how well the SQL runs. It runs in less than a second, and it should only be getting executed 12 times an hour.
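[Editor's note: the while(1) loop above exits when the tuple-fetch callback returns undef, so it is bounded by the number of bound rows. A self-contained sketch of just the batching logic (driver internals mocked with a counter, no database involved; the variable names here are invented for illustration) shows N tuples producing ceil(N / batch_size) batch executions, not unbounded re-execution:]

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pretend bound rows, as a fetch-tuple callback would supply them.
my @tuples = map { [$_] } 1 .. 2500;
my $i = 0;
my $fetch_tuple_sub = sub { $i < @tuples ? $tuples[$i++] : undef };

my $batch_size = 1000;   # the ora_array_chunk_size default
my $batches    = 0;

# Same loop shape as the driver: inner for() fills a batch, the
# "$finished || last" bails out when the callback returns undef.
while (1) {
    my @batch;
    my $finished;
    for (1 .. $batch_size) {
        $finished = $fetch_tuple_sub->();
        push @batch, [@{ $finished || last }];
    }
    last unless @batch;
    $batches++;          # a real driver would execute the batch here
    last if !$finished;
}

print "$batches batches\n";   # 2500 tuples / 1000 per batch -> 3
```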
What I'm trying to do is to figure out how the SQL statement got executed 1000 times more than it should have (which brought the DB to its knees for 50 minutes with nothing being able to connect). If there is something in DBI that is automatically re-executing for some unknown reason, I need to be able to manage how often it retries, how many times, and even how long between re-executions. The logic in the script does not account for 12000 executions in a 1 hour period, so I need to figure out how it happened so I can prevent it happening in the future.

There is locking around the perl script execution so that the perl script can't be started if the previous execution has not completed, and the logs show that in the 50 minute period in question, the script only ran once. It didn't report any errors, and appeared to eventually exit successfully. The only thing abnormal is that Oracle Statspack shows that SQL statement in the script executed 12000 times and was consuming all available DB threads.

I've reviewed the DBI debugging links provided by Brian Fennell and have now added some DBI debugging to the script in the hope that I can get more info on how DBI is behaving. I came to this mailing list because it seems to me that the problem is with DBI: either I've hit some bug, or my usage is not correct (or there is some other problem deeper down in the stack).

Cheers,
John

-----Original Message-----
From: Steven Lembark <lemb...@wrkhors.com>
Sent: Tuesday, 18 February 2020 3:47 AM
To: dbi-users@perl.org
Cc: lemb...@wrkhors.com
Subject: Re: Perl script excessively executing statement

On Thu, 13 Feb 2020 06:58:37 +0000
JohnD Blackburn <johnd.blackb...@au.abb.com> wrote:

> The DBA pulled info from some cache that showed the SQL statement from
> the script was executed 12610 times.
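[Editor's note: as far as I know, plain DBI does not re-execute a statement on its own; a bounded-retry policy has to live in caller code. A minimal hypothetical wrapper - the name retry_with_limit and its parameters are invented for illustration, not part of DBI - that caps attempts and optionally sleeps between them:]

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a code ref, retrying on exception up to max_attempts times,
# with an optional delay between attempts.
sub retry_with_limit {
    my (%arg) = @_;
    my $max   = $arg{max_attempts} || 3;
    my $delay = $arg{delay}        || 0;   # seconds between attempts
    my $try   = $arg{code};

    for my $attempt (1 .. $max) {
        my @result = eval { $try->() };
        return @result unless $@;
        die "giving up after $attempt attempts: $@" if $attempt == $max;
        sleep $delay if $delay;
    }
}

# Demonstration: a closure that fails twice, then succeeds.
my $calls = 0;
my ($value) = retry_with_limit(
    max_attempts => 5,
    code         => sub { die "boom\n" if ++$calls < 3; return 42 },
);
print "calls=$calls value=$value\n";   # calls=3 value=42
```

In real use the code ref would wrap `$sth->execute(...)` with RaiseError enabled, so every retry is explicit, counted, and logged rather than silent.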
> So if I were to add an "or die $!" statement after the
> fetchrow_array(), that might fix things? (or prevent it from trying
> to continue if there are errors?) Or catch errors after the execute
> statement?

Q: What are you really trying to accomplish?

Basic rules: If you need to run SQL more than once, prepare it. If you are not sure how well it runs then you can evaluate it beforehand -- which also spots errors, and can help track down unused indexes.

As noted, generating a single query, preparing and running it once, then extracting the rows (preferably as arrays rather than hashes) will be the most efficient way. The database is going to do a much better job filtering anything it can locally in native format before you convert it to ASCII char strings and pull the same content across a LAN.

--
Steven Lembark                                        3646 Flora Place
Workhorse Computing                                St. Louis, MO 63110
lemb...@wrkhors.com                                    +1 888 359 3508
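[Editor's note: the prepare-once / execute-many, fetch-as-arrays advice above can be sketched with DBI against an in-memory database. Assumption: DBD::SQLite is installed for the demo; against Oracle only the DSN and credentials would differ.]

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# In-memory demo database; RaiseError makes every DBI failure die
# instead of being silently ignored.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

$dbh->do('CREATE TABLE readings (id INTEGER, val INTEGER)');

# Prepare once...
my $ins = $dbh->prepare('INSERT INTO readings (id, val) VALUES (?, ?)');

# ...execute many times with different bind values.
$ins->execute($_, $_ * 10) for 1 .. 5;

# Fetch rows as arrays (cheaper than hashes), filtering in the DB.
my $sth = $dbh->prepare('SELECT id, val FROM readings WHERE val > ?');
$sth->execute(20);

my $total = 0;
while (my $row = $sth->fetchrow_arrayref) {
    $total += $row->[1];
}
print "total=$total\n";   # rows 3,4,5 -> 30 + 40 + 50 = 120
$dbh->disconnect;
```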