Re: more on execute_array not complying with the specification

Martin J. Evans Wed, 02 Feb 2011 07:52:15 -0800

The argument seems to have got confused here between what DBI does, what it 
says it does and DBD::Oracle (which does neither). I don't want (and don't 
think) any change (other than in pod) in DBI is necessary. The thread started 
with my observations of differences between DBI and DBD::Oracle with the 
summary:

a) even though RaiseError was set, no error was raised although a warning was.
b) execute_array returned undef (correct)
c) errstr is set but err is not (0)
d) the HandleError routine was not called - due to (a)?
e) the count of rows affected is -1 for all rows which worked - I believe this 
is permissible

I later noticed:

f) DBD::Oracle was never returning the total rows affected in list context
g) DBI was documenting 2 fields in ArrayTupleStatus for an error put 3 in

On 02/02/11 12:15, Tim Bunce wrote:
> On Tue, Feb 01, 2011 at 09:02:26PM +0000, Martin J. Evans wrote:
>> On 01/02/2011 20:50, Tim Bunce wrote:
>>> On Tue, Feb 01, 2011 at 10:58:14AM -0500, John Scoles wrote:
>>
>>>> My only concern is when it does error (no matter what the setting of
>>>> AutoCommit) you always get unef;
>>> Umm, yes. Returning undef (or an empty list) if there was any kind of
>>> error causes useful information (the total row count) to be discarded.
>>
>> I don't see why when called in list context. Just because
>> execute_array partially succeeded (or failured) only means undef
>> needs to be returned for the first scalar and does not affect the
>> rows affected.
> 
> True for the total row count. I was thinking of the RaiseError case,
> I think. Also, by that definition, this wouldn't work:

Yes, sorry, me being an idiot - of course, when RaiseError is set whether 
scalar/list context nothing it returned as $tuples/$rows.

>     unless (($tuples, $rows) = $sth->execute_array(...)) {
>         ...handle error...
>     }
> That's not a big deal though.  (Returning undef for the tuple count
> isn't a big problem because you can determine that from the length of
> ArrayTupleStatus.)

agreed.

I only added list context to DBI ages ago because I noticed the affected rows 
was available for the batch but not individually and at the time I wanted it. 
Some time since then this has stopped working and I probably ignored it since I 
have a broken Oracle DB which fails the 26exe_array test anyway and I don't use 
execute_array in this project now.

DBD::Oracle no longer (in 1.27 at least) works to the spec as it is now because 
even against a working Oracle 10 database (working in that it commits working 
tuples) rows_affected is always undef even when RaiseError is NOT set. See the 
script I posted which output:

DB Version: 10.2.0.1.0
AutoCommit = 1 RaiseError = 0 # raise error NOT set
# warning output by PrintWarn:
DBD::Oracle::st execute_array warning: ORA-24381: error(s) in array DML (DBD 
SUCCESS_WITH_INFO: OCIStmtExecute) [for Statement "insert into mytest values 
(?,?)"] at 
/home/martin/svn/dbd-odbc/trunk/rt_data/execute_array/execute_array_oracle.pl 
line 62.
execute_array did not raise error: '' # which you'd expect
execute_array = undef # which is to the spec as it is now

total affected rows = undef # OOPS - not set even though some rows were affected

Error from execute_array - errstr=ORA-24381: error(s) in array DML (DBD 
SUCCESS_WITH_INFO: OCIStmtExecute), err=0, state=''
$tuple_status = [
                  -1, # because Oracle cannot give you these per tuple
                  [
                    1,
                    'ORA-00001: unique constraint (SYSTEM.SYS_C005676) violated 
(DBD SUCCESS_WITH_INFO)'
                  ],
                  -1,
                  -1
                ];

Error captured in handler: undef # as you'd expect as RaiseError not set
# warning captured in SIGWARN handler:
Warning captured in SIGWARN handler: 'DBD::Oracle::st execute_array warning: 
ORA-24381: error(s) in array DML (DBD SUCCESS_WITH_INFO: OCIStmtExecute) [for 
Statement "insert into mytest values (?,?)"] at 
/home/martin/svn/dbd-odbc/trunk/rt_data/execute_array/execute_array_oracle.pl 
line 62.
'
# good rows successfully committed:
$select * from mytest = [
                          [
                            '1',
                            'onetwothree         '
                          ],
                          [
                            '51',
                            'fiftyone            '
                          ],
                          [
                            '52',
                            'fiftythree          '
                          ],
                          [
                            '53',
                            'one                 '
                          ]
                        ];

>>> This brings us back to the question of whether a failure of a single
>>> tuple should cause execute_array() itself to return an error.
>>>
>>> ODBC doesn't do that. It treats that situation as SUCCESS_WITH_INFO
>>> and that does seem very reasonable.
>>
>> The funny thing is being an big user of ODBC, I don't (and others in
>> this thread expressed the same belief).
> 
> I'm (wobbling) on the fence on that one. I can see it both ways,
> though I'm more sympathetic to your viewpoint :)

I wasn't wobbling and am only wobbling a little now since the point about 
$rows_affected being lost when RaiseError is set was made.

I looked again at ODBC as to why it might do that and it appears it is because 
it states when an error occurs the various bound buffers are in an 
indeterminate state and yet an app may have bound buffers for row status array, 
rows processed array etc (like ArrayTupleStatus). So they return 
SQL_SUCCESS_WITH_INFO which a) avoid the problem of not allowing you to look at 
bound buffers b) allows you to find out what went wrong and where. DBI already 
has this covered with ArrayTupleStatus.

Years of experience with writing, maintaining and debugging database code tells 
me that if something does not work completely then the programmer should have 
to go out of his way to accept that situation and continue i.e., I still think 
the default action for RaiseError in execute_array should be to raise an error.

>> For me, if I use execute_array (with AutoCommit on or not) my
>> principle issue is I want to know if anything failed so I do:
>>
>> $dbh->{RaiseError} = 1;
>> my $ret = eval {
>>    execute_array(...)
>> };
>> do_something_on_error() if ($@ || !$ret); # amended for DBD::Oracle behavior
> 
> Ok. How would you get the total row count when using RaiseError?

If you mean rows_affected, then you cannot as it stands but bear in mind we 
only have 2 usage cases at present - DBI and DBD::Oracle so read on, and in any 
case if I want it I just disable RaiseError temporarily. Clearly we cannot 
guarantee to be able to return the affected rows when RaiseError is on because 
Oracle cannot do it on a tuple basis and when an error is raised the return 
from execute_array is undef. The main point is if I want it I can get it. If 
you mean rows executed it is scalar(@status).

>> Why put the onus on people who cannot accept some tuples to fail to
>> check the ArrayTupleStatus instead of put it on those who can accept
>> some to fail to examine ArrayTupleStatus? My feeling is that it is
>> very serious to ignore a partial failure and so the default should
>> be to ensure you know about it with RaiseError. For those people who
>> can accept a tuple to fail, they can look at ArrayTupleStatus and
>> still commit the changes.
> 
> I agree.

good, some progress ;-)

>>>> but I can live with that.
>>> Perhaps we could default to the SUCCESS_WITH_INFO model and add an
>>> attribute for people to use if they want tutple errors to cause
>>> execute_array() itself to return an error.

but that is a serious change in the default behaviour which could cause a lot 
of people a lot of work.

>>> We can debate the name later, but for now let's call it:
>>>
>>>     ArrayPromoteTupleError =>  1
>>
>> which defaults to 1, I hope.
> 
> Probably ;-)
> 
>> Incidentally, I just blogged about this a few hours ago as I think
>> it is a serious problem to ignore potential failures. Add that to
>> the fact that some versions of Oracle database (broken) return
>> success with info but don't then commit the ok tuples.
> 
> Yes, that's clearly a serious bug.
> 
> Tim.

Let me see if I can sum this all up so we can move on:

Ok, revision of some facts:

1 there are currently only 2 implementations of execute_array, DBI and 
DBD::Oracle

2 DBD::Oracle at least in 1.27 does not conform to the written spec today in 
that it does not raise an error if RaiseError is set but DBI does.

fixed by John raising an error instead of a warning and I believe he has 
already done this.

3 DBI and the written spec disagree in that one says an error field in 
ArrayTupleStatus contains 2 fields and the other puts in 3 fields.

fixed in DBI trunk already

4 DBD::Oracle at least in 1.27 does not return the rows_affected in list 
context even if RaiseError is disabled so no one could be using it right now.

John has already shown me a branch which fixes this

5 DBD::Oracle puts 2 fields in an ArrayTupleStatus error field but DBI puts 3 
in.

can easily be fixed in DBD::Oracle and I believe John has addressed this now

Now some differences we would like to cope with:

6 Some drivers may be able to return rows affected per tuple and some like 
Oracle cannot

We already do this because drivers which can return the rows affected per tuple 
already do so in the ArrayTupleStatus

7 Some drivers may be able to return the total number of rows affected (in the 
batch, like DBD::Oracle) and some may not.

We already the handle the tuples executed and rows_affected with array context 
on execute_array if RaiseError is off. Now if RaiseError is on you'll not get 
these so if you need them you just disable it locally. Even if you don't 
disable RaiseError you can still get tuples executed but not rows affected.

{
  local $sth->{RaiseError} = 0;
  my ($executed, $affected) = $sth->execute_array(ArrayTupleStatus = \my 
@status);
  if (!$executed) { # some/all failed
    # $affected in Oracle's case is the right number once fixed
    # if you want to find the failured tuples look at @status
    # it you want to know the affected per tuple look at @status
    # to find the number of executed tuples scalar(@status)
  }
}
8 Some drivers may abort on the first error and some may continue and do all 
tuples

Now some coding situations we would like to allow:

9 for those people (me most of the time) who expect all the tuples to succeed 
(and I suspect that covers most people) we need an easy way to indicate an 
error. The convention in DBI if you don't want to check every return status is 
to RaiseError => 1. People expect RaiseError => 1 will raise an error if 
something failed and would be surprised to find it did in all cases except one. 
Just search for RaiseError on perl monks - it is probably the most mentioned 
attribute wrt DBI. I'm willing to bet if you asked most people who use DBI 
whether execute_array raised an error when some of the rows failed most would 
say yes.

Now, I'm assuming these people have not been looking at the rows_affected in 
list context since it is not available via DBI when RaiseError is set and it is 
not available via DBD::Oracle (in 1.27) at all whether RaiseError is set or not.

# this is what most people do and in addition DBIx::Class does the stuff in
# comments too
{
  my @status;
  $sth->{RaiseError} = 1;
  eval {
    $sth->execute_array({ArrayTupleStatus = \@status);
  };
  if ($@) {
    # to find tuple in error look in @status
    # to find rows executed scalar(@status) and hence know if batch was aborted
    # you don't know affected rows unless the driver does it per tuple
    #   if you need then disable RaiseError
    die $sth->errstr if ($@);
  }
}

10 some more "sophisticated" people (we have not found yet) might expect some 
tuples to fail and instead of setting RaiseError (or by clearing it) they check 
execute_array status and may be even look at rows_affected in list context.

case already covered in 7 above

So the benefits:

o only a small doc change to DBI
o no ones code using DBI and not DBD::Oracle is broken
o DBD::Oracle is fixed to work to the spec

the disadvantages:

o you cannot know the total rows affected when RaiseError is on but a) you 
never could in DBI and even when it was off, you only could in DBD::Oracle for 
a few versions a long time ago. You can work around this easily.

So, I think DBI works fine right now (given slight pod change) and don't want 
to complicate it or worse, break peoples existing code. DBD:: Oracle on the 
other hand does not but I believe John has already demonstrated it can be made 
to work.

Apologies for length but I could not work out how the simple clarification I 
was seeking so we could fix DBD::Oracle ended up in changing DBI. I might be at 
fault for that - sorry.

Martin
-- 
Martin J. Evans
Easysoft Limited
http://www.easysoft.com

Re: more on execute_array not complying with the specification

Reply via email to