Re: Clarification sought on execute_array

John Scoles Tue, 18 Jan 2011 05:21:56 -0800

 On 18/01/2011 7:52 AM, H.Merijn Brand wrote:

On Tue, 18 Jan 2011 07:40:25 -0500, John Scoles<sco...@pythian.com>
wrote:

   On 17/01/2011 3:34 PM, Martin J. Evans wrote:

There appear to be differences between DBDs which do not handle
execute_array (so DBI does it for them) and DBDs which do handle
execute_array (e.g., DBD::Oracle). The main ones discussed on
#dbix-class which I investigated are whether the driver sets the err
and errstr or even raises an error. Some of the guys writing
DBIx::Class think execute_array should raise and error and fail on the
first error but I explained since execute_array may send the entire
batch to the server and it is server dependent when it stops this is
beyond definition by DBI. Never the less the following script seems to
show some large differences between DBI's execute_array and
DBD::Oracle's:

The first question is do any other DBIs utilize a native array_execute??

DBD::Unify and DBD::CSV do not


Looking more and more that only DBD::Oracle has a native exe_array

Anyway

Well lets go back to DBI and see what it says

      When called in scalar context the execute_array() method returns
   the number of tuples executed, or |undef| if an error occurred. Like
   execute(), a successful execute_array() always returns true regardless
   of the number of tuples executed, even if it's zero. If there were any
   errors the ArrayTupleStatus array can be used to discover which tuples
   failed and with what errors.

In DBD::Oracle you will never get 'undef' returned as the execute will
always be successful even though all of your tuples may fail.

So It agrees with the first para and works in scalar.

To get the extra info that comes out in a non-DBD specific array_execute
we would have to build in an extra iteration over the results to give a
count of the Failed/Pass.  As some of my customers use this with batch
loads of 5meg plus of inserts the iteration may take some time and sort
of defeat the purpose of a quick way to do bulk inserts.

I think (you will have to ask Tim to verify) that the Idea behind
array_execute is a 'Batch' processor. ie send a Batch, to the server
then figure out what to with what is returned.

so with

a) even though RaiseError was set, no error was raised although a
    warning was.
    We know there was a problem so we have to fail the batch or at
    least report on it is what the warning is telling us

b) execute_array returned undef (correct)
    Well at least that is a good thing

c) errstr is set but err is not (0)

d) the HandleError routine was not called - due to (a)?
    Which is correct as is did do exactly what was expected.  ie
    'execute a bath and report back'

                 ^^^^
The batch will take long enough to have a comfortable bath? :)


Dislexia stick again?

You see we do not believe in Dog!

As a side point one client who uses the exe_array takes 2h and 45m torun so just enough time for a good bath :). To put it in perspective ituse to take 19~25 hours (DBD::Oracle 1.17) and she now has almost twicethe number of records than at the 29 hour mark.


cheers

e) the count of rows affected is -1 for all rows which worked - I
    believe this is permissible
    I will have to check on that.

In the end I do not think this should ever error

      eval {
          $inserted = $sth->execute_array(
              { ArrayTupleStatus =>  \@tuple_status } );
      };

It is the wrong way to process a batch job. JMHO though


I guess the real sort of problem is that in the normal DBI array fetch
it is just iterating over array and doing the insert one at a time so
you get your good and error counts as you go.  As well as each iteration
is a separate execute you will get a raise_error with it which is think
is suppressed but I would have to look at the code.

I think you are right that the the chaps at DBIx have it wrong.  It
should be a batch job and they would have to handle in that way.

1) bind
2) exe
3) commit if all successful or  process if an error is returned.

Anyway lets see what Tim has to say.

We could add in the list context for DBD::Oracle and do some of this
processing with the caveat that it will take longer than the scalar context

Cheers
John

use DBI;
use strict;
use Data::Dumper;

sub fred
{
     print "Error Handler called\n";
     print Dumper(\@_);
     my ($msg, $handle, $val) = @_;

     print "handle_error: $msg\nhandle: $handle\nval=$val\n";
     0;
}

my $dbh = DBI->connect(
     'DBI:Oracle:host=xxx;sid=devel', 'xxx', 'xxx',
     { RaiseError =>  1, PrintError =>  0, HandleError =>  \&fred
});
do_it($dbh);

my $dbh = DBI->connect(
     'DBI:ODBC:DSN=xxx', 'xxx', 'xxx',
     { RaiseError =>  1, PrintError =>  0, HandleError =>  \&fred
});

do_it($dbh);

sub do_it {
     my $dbh = shift;

     eval {$dbh->do(q/drop table mytest/);};
     $dbh->do(q/create table mytest (a int primary key, b char(20))/);

     my $sth = $dbh->prepare(q/insert into mytest values (?,?)/);
     $sth->bind_param(1, 1);
     $sth->bind_param(2, 'onetwothree');
     $sth->execute;

     $sth->bind_param_array(1, [51,1,52,53]);
     $sth->bind_param_array(2, ['fiftyone', 'fiftytwo', 'fiftythree',
'one']);
     my (@tuple_status, $inserted);
     eval {
         $inserted = $sth->execute_array(
             { ArrayTupleStatus =>  \@tuple_status } );
     };
     if ($@) {
         print "Exception: $@\n";
     }
     print "Error from execute_array - " . $sth->errstr . ",",
$sth->err ."\n"
         if (!$inserted);
     for (@tuple_status) {
         print Dumper($_), "\n";
     }
}

which outputs for the DBD::Oracle part:

$ perl execute_array/execute_array.pl
DBD::Oracle::st execute_array warning: ORA-24381: error(s) in array
DML (DBD SUCCESS_WITH_INFO: OCIStmtExecute) [for Statement "insert
into mytest values (?,?)"] at execute_array/execute_array.pl line 43.
Error from execute_array - ORA-24381: error(s) in array DML (DBD
SUCCESS_WITH_INFO: OCIStmtExecute),0
$VAR1 = -1;

$VAR1 = [
           1,
           'ORA-00001: unique constraint (BET.SYS_C0096150) violated
(DBD SUCCESS_WITH_INFO)'
         ];

$VAR1 = -1;

$VAR1 = -1;

Notable from this is that:

a) even though RaiseError was set, no error was raised although a
warning was.
b) execute_array returned undef (correct)
c) errstr is set but err is not (0)
d) the HandleError routine was not called - due to (a)?
e) the count of rows affected is -1 for all rows which worked - I
believe this is permissible

For the DBD::ODBC run which does not do execute_array itself you get:

Error Handler called
$VAR1 = [
           'DBD::ODBC::st execute_array failed: executing 4 generated 1
errors',
           bless( {}, 'DBI::st' ),
           undef
         ];
handle_error: DBD::ODBC::st execute_array failed: executing 4
generated 1 errors
handle: DBI::st=HASH(0xa071d00)
val=Exception: DBD::ODBC::st execute_array failed: executing 4
generated 1 errors at
  execute_array/execute_array.pl line 43.

Error from execute_array - executing 4 generated 1 errors,2000000000
$VAR1 = 1;

$VAR1 = [
           1,
           '[unixODBC][Easysoft][SQL Server Driver][SQL
Server]Violation of PRIMARY KEY constraint \'PK__mytest__3661ABE9\'.
Cannot insert duplicate key in object \'dbo.mytest\'. (SQL-23000)
[state was 23000 now 01000]
[unixODBC][Easysoft][SQL Server Driver][SQL Server]The statement has
been terminated. (SQL-01000)',
           '01000'
         ];

$VAR1 = 1;

$VAR1 = 1;

Notice the difference:

a) an error was raised (different from DBD::Oracle) saying 1 of 4 failed
b) execute_array returned undef (the same)
c) both errstr and err are set although where 2000000000 comes from
I'm not sure
d) the HandleError routine was called (different from DBD::Oracle)
e) the count of rows affected is 1 for all the rows which worked

For anyone using execute_array this represents somewhat of a problem
unless they write substantial code per DBD. The clarification required
is:

a) if execute_array fails on any row should that raise an error?
     Obviously, if it does, then HandleError comes in to it
b) if execute_array fails should that set errstr AND err

I believe the count per row of affected is driver dependent so I'll
ignore that but there is a lot of code out there (perhaps doing things
wrong) which examines "err" (like DBIx::Class) which is not set in
DBD::Oracle's case. The strict interpretation of the pod for
execute_array suggests execute_array will return undef on any failure
(which it does in both cases) but not whether any row is an
error/warning and whether "err" and "errstr" are set.

BTW, please keep Peter (ribasushi) on the cc list as he is not
subscribed to dbi-dev but is an interested party.

Martin

Re: Clarification sought on execute_array

Reply via email to