Re: DBD::mysql 2.9007 and DBD::mysql 2.9015_3 (beta) released

2005-04-29 Thread Kristian Nielsen
Patrick Galbraith <[EMAIL PROTECTED]> writes:

> email too soon ;) I was actually going to answer. Yes, we want to
> support UTF-8 - that's one of the feature major bullet-points for
> 3.0. I'm not sure what all is involved, and have to deal with these
> issues in my work on the federated storage engine, but the answer is a
> definite "yes" ;)

Having spent much effort understanding all the issues involved in Perl
utf8, I want to give a few hints on this since it is an issue that is
very often misunderstood.

The issue is more complex than just "setting the utf8 flag", in fact
when people say that it usually indicates that they are doing the wrong
thing. Perl programmers should _never_ have to see or even know about
the utf8 flag, it should only ever be visible to XS code. All Perl
strings are sequences of Unicode characters.

The real issue is that Perl utf8 introduces two different _internal_
representations of strings (latin1 and utf8). This means that _all_ XS
code that accesses string data must first check which internal
representation is used for the string, and convert the string data if
necessary. It is _always_ wrong to directly use Perl string data from XS
code without appropriate conversion. The utf8 flag is the bit that
defines which internal format is used, and that is all it should be used
for.

So for DBD::mysql, whenever a string is passed to the MySQL API, it must
be converted from the Perl internal representation to the MySQL client
character set. Even if the client character set is latin1 it is
necessary to convert, since it is perfectly possible and normal for
latin1 characters to be stored in utf8 internal format in Perl strings!

When data is pulled from the MySQL API, if the data contains characters
with Unicode value > 255 it must be stored in utf8 internal format. If
all characters are <= 255 it can be stored in either utf8 or latin1
internal format in principle, though in practise most modules that deal
with non-latin1 characters tend to use the utf8 internal format
exclusively.

Anyway, just my opinions on this issue which reflect the fact that we
have experienced much pain over XS code that didn't follow the above
guidelines!

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Re: ora_can_unicode discussion

2004-03-24 Thread Kristian Nielsen
"Andy Hassall" <[EMAIL PROTECTED]> writes:

>   1b. If a Perl string with the utf8 flag is bound to a statement, it
> is bound as UTF8 rather than the client character set. Otherwise it is bound
> as normal (in the client character set).

Please do not do this. I will try to explain why.

In Perl, the utf8 flag shouldn't carry any semantics, it should be
purely a matter of internal representation of the string. Thus it is
perfectly possible for two strings to be equal, even though one has the
utf8 flag set and one has the utf8 flag cleared:

use strict;
use warnings;

use Encode();
use Devel::Peek;

my $bytes = "\xe6\xf8\xe5";
my $utf8s = Encode::decode_utf8(Encode::encode_utf8($bytes));

Dump($bytes);
Dump($utf8s);
print $bytes eq $utf8s ? "Equal\n" : "Not equal\n";

The output is

SV = PV(0x811beb0) at 0x81268c4
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK)
  PV = 0x8121a50 "\346\370\345"\0
  CUR = 3
  LEN = 4
SV = PV(0x81a233c) at 0x81268e8
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
  PV = 0x81584c8 "\303\246\303\270\303\245"\0 [UTF8 "\x{e6}\x{f8}\x{e5}"]
  CUR = 6
  LEN = 7
Equal

So $bytes and $utf8s is the _same_ string, consisting of the same three
characters, though Perl internally stores it in different ways.
Therefore, both strings should work the same way (by default at least)
when bound in DBI. To quote from perldoc Encode: "This utf8 flag is not
visible in perl scripts". Therefore it should not become visible through
the use of DBI.

I haven't followed the discussion closely, but I believe the core of the
problem is that some (old?) code may bind strings as sequences of bytes
in the database character set. Whereas other (new?) code binds strings
as sequences of unicode characters. As far as I can see, there is no way
for DBI to reliably distinguish between these two situations, the user
will have to tell one way or the other (whether by handle attribyte,
bind option, or defaults based on environment/database config).

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Re: DBD::Oracle bulk bind support

2004-03-11 Thread Kristian Nielsen
Tim Bunce <[EMAIL PROTECTED]> writes:

My head is spinning with all the sudden interest in native Oracle
execute_array() ... I have Cc'ed dbi-dev, hope that's ok.

> Oh, hang on. I've just realised that I had been confusing these.
> Are you saying you have developed your own patch separately to
> (and effectively competing with) Kristian's? [CC'd]
> 
> Why not work together?

We did work together last summer. From what I understand Chris used the
XS part of my patch and modified Oracle.pm to work with the (at that
time) new execute_for_fetch() method. My Oracle.pm was based on old DBI
which did not have execute_for_fetch().

I haven't seen Chris' patch, but I would expect the XS/C part of it to
be identical to mine (feel free to correct me on this). Does Chris'
patch work with named placeholders?

I will now rewrite the XS part of my patch to be better integrated with
the existing scalar bind/execute code. Hopefully this will simplify it
somewhat; note though that there will still be a fair amount of code
left, for example to get ArrayTupleStatus from Oracle. I understand that
this is what Tim needs to integrate the code into the official DBD::Oracle.

As to the Oracle.pm modifications I am happy to implement whatever Tim
decides on or to use another implementation. If nobody come up with
anything I will settle for direct implementation of both
bind_param_array(), execute_array(), and execute_for_fetch() in
Oracle.pm, mimicking the DBI implementations. I need this anyway for
merging the newest DBI/DBD::Oracle into our own internal DBI tree.

And now for me it's back to doing some coding.

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Re: execute_array()/execute_for_fetch() (was: Theory/Algorithm question)

2004-03-03 Thread Kristian Nielsen
Tim Bunce <[EMAIL PROTECTED]> writes:

> > +my $res = ora_execute_array($sth,
> > +[EMAIL PROTECTED],
> > +scalar(@tuple_batch),
> > +$tuple_batch_status);
> > +if(defined($res) && defined($row_count)) {
> > +$row_count += $res;
> > +} else {
> > +$row_count = undef;
> > +}
> > +push @$tuple_status, @$tuple_batch_status
> > +if defined($tuple_status);
> > +}
> > +return $row_count;
> 
> I think the return value isn't right. I'd there's an error then do ++$err_count,
> then on return do: return ($err_count) ? undef : scalar @$tuple_status;

If there is an error ora_execute_array() returns undef, and $row_count
is set to undef. If there is no error $row_count will be sum of
individual ora_execute_array() rowcounts. But I will see if your
suggestion can work and if so change to that for clarity.

> Or, looking at it a different way, does this work:
> 
>   $sth = $dbh->prepare("INSERT INTO mytable VALUES( :foo )");
>   $sth->bind_param_array(":foo", [0..]);
>   $s->execute_array(undef);
> 
> It ought to.

It does not:

DBD::Oracle::st bind_param_array failed: Can't use named placeholders for 
non-driver supported bind_param_array [for Statement "INSERT INTO mytable 
VALUES(:foo)" with ParamValues: :foo=undef] at -e line 1.

And in fact it cannot work using the execute_for_fetch() approach. It
has been too long so I had forgotten, but now I remember: this is
actually the main reason I originally decided to implement execute_array
directly rather than just execute_for_fetch(). Named placeholders for
execute_array requires column_wise bind values. Rowwise as in
execute_for_fetch() doesn't have the bind names.

This could be fixed, of course: execute_for_fetch could let the
$fetch_tuple_sub return hash references, or bind_param_array() could set
up a mapping from names to placeholder indexes.

Hm, I'll have to think about that. Any suggesions?

> > +static int
> > +do_bind_array_exec(sth, imp_sth, phs)
> 
> Beyond here it looks like you've done more work than needed. I was
> expecting the dbd_phs_in() callback function to be extended (slightly)
> so it knew how to use the oracle supplied index to index into the array.

As I said it has been a long time, but I'll try to remember back then ...

There are two parts to binding. One is to setup the proper OCI callbacks
(dbd_phs_id ...). The other is to convert the bind values from Perl
scalars to char *'s.

For non-array bind, DBD::Oracle mixes these two together, as far as I
remember. But for array bind the OCI callbacks must be set up once, but
the conversion to char * must be done once for each bind value. So I had
to implement new array bind code to set up the callback, and extend
dbd_phs_in() with the char * conversion.

If you want, maybe I could look into instead special-casing the
non-array bind code to do only the callback-setup in the array case?
Maybe originally I just felt that the bind code was complicated enough
as it is. But the char * conversion must stay in the callback (unless we
want to allocate a whole new array with the converted values ...). Maybe
/all/ char * conversion should move into the callback?

Anyway, I am all for trying to make the code simpler, so I'll look
deeper into your suggestions and welcome any further input.

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Re: execute_array()/execute_for_fetch() (was: Theory/Algorithm question)

2004-03-03 Thread Kristian Nielsen
[When I first sent this it bounced with the message:

<[EMAIL PROTECTED]>:
ezmlm-send: fatal: Sorry, after removing unacceptable MIME parts from your message I 
was left with nothing (#5.7.0)

Not sure what that means, but I'll try again with the patch inline
instead of as an attachment.]

Tim Bunce <[EMAIL PROTECTED]> writes:

> How about this... Don't implement execute_array, just use the DBI default.

Ok, I have merged my DBD::Oracle execute_array() patch into a subversion
checkout and simplified it to implement execute_for_fetch() instead of
execute_array() (included below).

I get segfaults in make test t/30long.t, but I get that both with and
without the patch.

I will

 - Add docs for the problem with Oracle < 8.1.5.
 - Add some tests in t/20select.t.
 - Check up on the ArrayTupleStatus elements; it seems a third 'state'
   element has been added to every error tuple.

Anything else you would like before including it?

And finally for some meaningless timings. With the patch I ran the
following in less than one second:

$s = $dbh->prepare("INSERT INTO mytable VALUES(?)");
$s->execute_array(undef, [0..]);

while this ran in 7 seconds:

$s->execute($_) for (0..);

While real-life speedups may be more modest, this is still encouraging!

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S

---
Index: oci8.c
===
--- oci8.c  (revision 174)
+++ oci8.c  (working copy)
@@ -137,7 +137,7 @@
 ) {
if (debug >= 4 || recno>1/*XXX temp*/)
PerlIO_printf(DBILOGFP, "OCIErrorGet after %s (er%ld:%s): %d, %ld: 
%s\n",
-   what, (long)recno,
+   what ? what : "", (long)recno,
(eg_status==OCI_SUCCESS) ? "ok" : oci_status_name(eg_status),
status, (long)eg_errcode, errbuf);
errcode = eg_errcode;
@@ -360,6 +360,30 @@
 {
 phs_t *phs = octxp;
 STRLEN phs_len;
+AV *tuples_av;
+SV *sv;
+AV *av;
+SV **sv_p;
+
+/* Check for bind values supplied by tuple array. */
+tuples_av = phs->imp_sth->bind_tuples;
+if(tuples_av) {
+  /* NOTE: we already checked the validity in ora_st_bind_for_array_exec(). */
+  sv_p = av_fetch(tuples_av, phs->imp_sth->rowwise ? iter : phs->idx, 0);
+  av = (AV*)SvRV(*sv_p);
+  sv_p = av_fetch(av, phs->imp_sth->rowwise ? phs->idx : iter, 0);
+  sv = *sv_p;
+  if(SvOK(sv)) {
+*bufpp = SvPV(sv, phs_len);
+phs->alen = (phs->alen_incnull) ? phs_len+1 : phs_len;
+phs->indp = 0;
+  } else {
+*bufpp = SvPVX(sv);
+phs->alen = 0;
+phs->indp = -1;
+  }
+}
+else
 if (phs->desc_h) {
*bufpp  = phs->desc_h;
phs->alen = 0;
@@ -383,7 +407,7 @@
PerlIO_printf(DBILOGFP, "   in  '%s' [%lu,%lu]: len %2lu, ind %d%s\n",
phs->name, ul_t(iter), ul_t(index), ul_t(phs->alen), phs->indp,
(phs->desc_h) ? " via descriptor" : "");
-if (index > 0 || iter > 0)
+if (!tuples_av && (index > 0 || iter > 0))
croak("Arrays and multiple iterations not currently supported by DBD::Oracle 
(in %d/%d)", index,iter);
 return OCI_CONTINUE;
 }
Index: Oracle.pm
===
--- Oracle.pm   (revision 174)
+++ Oracle.pm   (working copy)
@@ -841,7 +841,36 @@
 
 {   package DBD::Oracle::st; # == STATEMENT ==
 
-# all done in XS
+sub execute_for_fetch {
+my ($sth, $fetch_tuple_sub, $tuple_status) = @_;
+my $row_count = 0;
+my $batch_size = ($sth->{ora_array_chunk_size} ||= 1000);
+my $tuple_batch_status;
+
+if(defined($tuple_status)) {
+@$tuple_status = ();
+   $tuple_batch_status = [ ];
+   }
+while (1) {
+my @tuple_batch;
+for (my $i = 0; $i < $batch_size; $i++) {
+push @tuple_batch, [ @{$fetch_tuple_sub->() || last} ];
+}
+last unless @tuple_batch;
+my $res = ora_execute_array($sth,
+[EMAIL PROTECTED],
+scalar(@tuple_batch),
+$tuple_batch_status);
+if(defined($res) && defined($row_count)) {
+$row_count += $res;
+} else {
+$row_count = undef;
+}
+push @$tuple_status, @$tuple_batch_status
+if defined($tuple_status);
+}
+return $row_count;
+}
 }
 
 1;
@@ -1221,6 +1250,26 @@
 
 =back
 

Re: Accessing DBI from XS/C?

2004-02-03 Thread Kristian Nielsen
Stephen Clouse <[EMAIL PROTECTED]> writes:

> Long story, since our software is approaching 7 years old.  Basically we have an 
> abstraction class that implements a database row (very analogous to Class::DBI, 
> but Class::DBI wasn't mature enough at the time I started writing this layer, 
> which was sometime in 1999).  My original pure Perl implementation was abysmally 

I am not familiar with Class::DBI, but it sounds like you construct an
object for each row fetched and access it with method calls? Yes, that
could probably get slow when done in pure Perl.

Maybe you should look into calling DBI from your C layer. Perhaps if
you can maintain the $dbh in Perl code and mostly call prepare(),
execute() and fetchrow*() from the C layer it would be doable.

perldoc perlcall might be a good start for what is possible.

 - Kristian.



Re: Accessing DBI from XS/C?

2004-02-03 Thread Kristian Nielsen
Stephen Clouse <[EMAIL PROTECTED]> writes:

> Is there any sane way to get at the DBI API from C?
> 
> I have a fair amount of database-layer code in XS/C for speed.  We still use DBI 
> because we do quite a bit of SQL from Perl, but then we also have this direct 
> OCI layer that does nasty things such as grabbing the Oracle handle pointers 
> from DBI, doing our own handle creation/prepares/executes and what not.

A large part of DBI is written in Perl, not C. Some DBDs are even
pure-Perl, as far as I know. So calling DBI would, as far as I can see,
mean pushing SV's (C representation of Perl values) on the Perl stack,
running the appropriate Perl expression, then popping the results off
the Perl stack. Whether that is considered sane would be a matter of
personal opinion.

It would be possible to call into DBD::Oracle XS/C code directly, but I
am not sure that could be considered "using DBI", and would also make
you vulnerable to internal changes between DBD::Oracle versions. And you
would still need heavy SV manipulations.

Could you give an example of where you need to use C instead of Perl for
speed in DB access? I would think that with proper use of DBI, any time
spent in the client Perl code would be dwarfed by time spent in OCI and
in the server.

 - Kristian.



Re: DBD::Oracle and bulk binds via execute_array

2003-10-09 Thread Kristian Nielsen
"Dean Arnold" <[EMAIL PROTECTED]> writes:

> Does this need to be modified to provide the column type info ? I don't see
> anything in the DBI 1.38 POD indicating type info can be specified...tho
> I spose the bind_param() i/f (wo binding data) can be used as a cheat ?

Where non-default types are desired, dummy bind_param() calls are
used. This is actually not a cheat, but the correct supported way. The
DBI docs may no be very explicit about this, but it says under
bind_param():

97 }".  The data type for a placeholder cannot be changed after the
first "bind_param" call. However, it can be left unspecified, in
which case it defaults to the previous value.

So you just say $sth=$dbh->prepare("INSERT INTO X VALUES(?)") and
immediately $sth->bind_param(1, undef, SQL_INTEGER); then any subsequent
array executes will use the specified type.

> Again, whence column type info ? Also, why less memory efficient 
> (in the wholistic sense) ? If you bind N column arrays of M elements each
> column wise, vs. binding M tuple arrays of N columns for rowwise,
> isn't the total memory use nearly equal ?

Well, Perl does not have two-dimensional arrays like C does. So for
row-wise you need memory for a Perl array for _every_ tuple, as well as
memory for each element. For column-wise there are only three arrays.

Here is a quick test of the memory usage of a million rows in columns
and in tuples:

perl -e '$y = [[1..1e6], [1..1e6], [1..1e6]]; scalar()'
 7962 kn23   0  132M 132M   912 S 0.0 13.1   0:01 perl

perl -e '$x = [map([$_,$_,$_], 1..1e6)]; scalar()'
14174 kn25   0  184M 184M   924 S36.3 18.3   0:01 perl

So for row-wise there is an overhead of around 50 bytes/tuple in this
example compared to column-wise. Significant, but not very so.

> > 1. The execute_for_fetch() method. The implementation of this in the
> >driver is mandatory (in the sense that without it no native
> >execute_array() will be available; DBI will emulate it with an
> >execute() for each row).
> 
> I assume you mean mandatory to support rowwise ?

No, actually I meant mandatory also for column-wise; execute_for_fetch()
should be the standard low-level driver interface, since it is the most
general. The other suggested methods are refinements that most drivers
need not bother about.

> > 2. A new $sth->execute_array_rowwise([EMAIL PROTECTED], [EMAIL PROTECTED])
> >for direct implementation of row-wise array execution. Optional; if
> >implemented it will be used by DBI::execute_array() when an array of
> >tuples is already available (ie. execute_array() method 4 above).
> 
> - needs type attribute capability; maybe also a named PH order list to indicate
> which tuple elements map to which named PHs ?

Types are handled with dummy bind_param() calls as above. But I tend to
agree about the need to support named placeholders; relying on
placeholder order is fragile when there are many placeholders and code
is modified. The execute_for_fetch() also needs to take an extra
optional PH order list to support named placeholders.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Re: DBD::Oracle and bulk binds via execute_array

2003-10-08 Thread Kristian Nielsen
"Dean Arnold" <[EMAIL PROTECTED]> writes:

> > Does DBI 1.38 have an API that takes an array of rows (which should be
> > the usual end-user interface rather than the more abstract callback
> > method of execute_for_fetch())? If not, it should probably be added, I
> > think I put in a row-wise option in my execute_array patch.
> 
> Was this ever resolved ? Was a new API i/f deemed neccesary,
> or was some hack on bind_param_array/execute_array considered ?
> Or is ArrayTupleFetch the solution to all the above ?

It has not been resolved yet (that I know of).

So here is my suggestion, based on my work on the DBD::Oracle
execute_array patch:

DBI interface:

The main user-level interface is execute_array(). Four different ways to
use:

1. Column-wise, using $sth->bind_param_array() for each column, then
   $sth->execute_array(\%attr). This is the only method that supports
   named placeholders (:foo style).
2. Column-wise, using $sth->execute_array(\%attr, [EMAIL PROTECTED], [EMAIL 
PROTECTED]).
3. Row-wise, using execute_array({ArrayTupleFetch => sub { ... } }).
   This is the only row-wise method supported in DBI 1.37. This permits
   calling execute_array() with a very large number of rows without
   having to keep them all in memory at once (subject to driver support;
   Oracle will buffer rows and send them off to Oracle in batches
   because of OCI restrictions).
4. Row-wise, using execute_array({ArrayTuple => [ [1,'A'], [2,'B'] ]}).
   Useful in the common case where the user already has a list
   of tuples. Less memory-efficient than column-wise, though.

Method 1 and 2 are the ones supported by older DBI versions. Method 3
was added in DBI 1.36. Method 4 is not in DBI now, but I suggest adding
it for the next version (and it is implemented in my DBD::Oracle
execute_array patch).


DBD::xxx interface:

Low-level methods implemented by drivers and usually only called by the
DBI implementation of execute_array():

1. The execute_for_fetch() method. The implementation of this in the
   driver is mandatory (in the sense that without it no native
   execute_array() will be available; DBI will emulate it with an
   execute() for each row).
2. A new $sth->execute_array_rowwise([EMAIL PROTECTED], [EMAIL PROTECTED])
   for direct implementation of row-wise array execution. Optional; if
   implemented it will be used by DBI::execute_array() when an array of
   tuples is already available (ie. execute_array() method 4 above).
3. A new $sth->execute_array_colwise([EMAIL PROTECTED], [EMAIL PROTECTED], ..., [EMAIL 
PROTECTED])
   for direct implementation of column-wise array execution. Optional;
   if implemented it will be used by DBI for execute_array() method 1
   and 2 above. If we want to implement named placeholders this method
   needs to be extended to take a mapping { 'foo' => [EMAIL PROTECTED],
   'bar' => [EMAIL PROTECTED] }, but maybe that is not so important.

The execute_array_rowwise() and execute_array_colwise() would only be
implemented in drivers like DBD::Oracle where they would provide better
performance than execute_for_fetch() (since they avoid building extra
temporary arrays). It could be that the extra performance obtained in
this way is not worth pursuing so that we would be better off with just
execute_for_fetch() in the drivers. Opinions, anyone? In any case, my
DBD::Oracle patch more or less implements them, so it is just a matter
of deciding style, not of implementing it.

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Re: DBD::Oracle and bulk binds via execute_array

2003-10-06 Thread Kristian Nielsen
"Chris R. Donnelly" <[EMAIL PROTECTED]> writes:

[I thought I would Cc: dbi-dev, since this seems to be of general
interest; I hope that is ok.]

> I believe I actually may have resolved that in the version I had written
> up... I started from DBI 1.38 (which has the execute_for_fetch()) method and
> had basically hit a wall with how to do the bind variables, which
> fortunately was the part you had done :).  I merged your changes in with the
> ones I had done.  I am still cleaning up some of the minor details (like
> status handling).
> 
> BTW, is there a need for doing data in a column major format?  It seems to
> perform just fine with row major format, and I believe the latest
> execute_array builds its tuples from the column-style binds... 

Well, this question is essentially the "outstanding issue" I mentioned.
If writing new code I think I would find the row-format most natural,
however old code would probably use execute_array(), which is column
major format. Column format is probably also more memory efficient,
though that may be of less inportance.

Does DBI 1.38 have an API that takes an array of rows (which should be
the usual end-user interface rather than the more abstract callback
method of execute_for_fetch())? If not, it should probably be added, I
think I put in a row-wise option in my execute_array patch.

Oracle OCI needs to know the total row-count up-front, so the only way
to implement execute_for_fetch() is to buffer rows in an array before
calling Oracle. If I start with an array of rows, I find it somewhat
unsatisfactory if the array is first taken apart in the
execute_for_fetch() callback, then reassembled in smaller pieces by
execute_for_fetch() before finally making it to Oracle.

For that reason I ended up implementing execute_array() directly as well
as execute_for_fetch(), and even made the driver support both
column-wise and row-wise operations natively. So the functionality is
there, it is just a matter of deciding which parts to pick to get a
clean implementation and interface with DBI. I think Tim will have some
opinions when he is done with the next DBD::Oracle.

Hm, I think my own suggestion will be to let DBD::Oracle implement
exeute_for_fetch() by buffering callback rows (there is an old post by
Tim in dbi-dev that outlines this). Then there could be lower-level
DBD::Oracle specific methods (or possibly new lower-level DBI methods)
for directly accessing rowwise and columnwise array opeations for those
who need the efficiency and control.

 - Kristian.

> 
> Thank you for your assistance!  I definitely couldn't do it without your
> help :)
> 
> # Chris
> 
> Chris Donnelly [EMAIL PROTECTED]
> Software Systems Architect
> Digital Motorworks
> Austin, TX (512) 692-1101 
>  
> 
> 
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
> Sent: Monday, October 06, 2003 2:07 AM
> To: [EMAIL PROTECTED]
> Subject: Re: DBD::Oracle and bulk binds via execute_array
> 
> "Chris R. Donnelly" <[EMAIL PROTECTED]> writes:
> 
> > Hi!  Sorry to bother you, but I found an entry on dbi-dev where you had
> > mentioned you were working on making DBD::Oracle support execute_array
> > properly.  Did you manage to get it working?  I was working on making it
> > implement it myself when I happened to find your article... if not, any
> > particular issues that stopped you?
> 
> Well, I did manage to get something working; in fact I posted a patch to
> dbi-dev (against DBD::Oracle 1.14) with an implementation:
> 
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg02296.html
> http://aspn.activestate.com/ASPN/Mail/Message/perl-DBI-dev/1686549
> 
> No real issues in the implementation; mainly just a lot of time spent
> learning the deeper regions of both DBD::Oracle and OCI.
> 
> I have briefly tested that the patch actually works and gives a nice
> speed improvement. I have postponed extensive testing and memory leak
> check until we actually need this for production code, but if you were
> to start using it first and found any specific problems I would most
> likely be willing to fix it quickly.
> 
> I have an outstanding issue with Tim Bunce (DBI/DBD::Oracle author)
> about how to finalize the internal interface between DBI and
> DBD::Oracle::execute_array(), however that should not affect code that
> merely uses execute_array() and this native DBD::Oracle implementation.
> 
>  - Kristian.
> 
> -- 
> Kristian Nielsen   [EMAIL PROTECTED]
> Development Manager, Sifira A/S
> 
> 
> 



Re: DBD::Oracle beta available for testing

2003-09-25 Thread Kristian Nielsen
Tim Bunce <[EMAIL PROTECTED]> writes:

> But since *very* few people would need the ability to
> share database connections with ProC/SQLLIB code my preference

Yes, I agree, I implemented this because I wanted to migrate a large
ProC based codebase gradually to using Perl/DBI, so I am embedding Perl
in a ProC-based application. Few people would ever find themselves in
this unfortunate situation, and if they did, having to slightly tweak
the DBD::Oracle build would be the least of their problems.

> right now is to just disable it entirely on Windows and
> only enable it on other platforms for Oracle >= 9.0.

Hm, I *think* that I had this running on 8.1.7 before we upgraded to
Oracle 9. Not absolutely sure though.

Sorry that I cannot help with the Windows part, but I do not have
access to a Windows Oracle development environment.

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Re: Hints for implementing execute_array() support in DBD::Oracle?

2003-06-26 Thread Kristian Nielsen
Tim Bunce <[EMAIL PROTECTED]> writes:

> On Fri, Jun 06, 2003 at 08:29:12AM +0200, Kristian Nielsen wrote:

> > We could use proper execute_array() support in DBD::Oracle, and having
> > already fiddled a bit with the code I am considering having a go at
> > implementing it (no promises though).

Well, I managed to get something working, and I thought I would post
what I have so far to perhaps get some feedback and spur the discussion
by providing something concrete.

> Seeing your message has prompted me to add something I'd been meaning
> to do for a while that will simplify this somewhat:
> 
> =item C
>  
>   $rv = $sth->execute_for_fetch($fetch_tuple_sub);
>   $rv = $sth->execute_for_fetch($fetch_tuple_sub, [EMAIL PROTECTED]);
>  
> The execute_for_fetch() method is used to perform bulk operations
> and is most often used via the execute_array() method, not directly.

This is where I started. But as we discussed, OCI needs the tuple count
(and also the maximum bind lengths) up-front, so execute_for_fetch needs
to buffer tuples in batches. I then started to implement ora_execute_array
to take an array of tuples. But then I realised that this does not
support named placeholders (:foo style) well, and so I ended up with a
hybrid ora_execute_array that accepts both row-wise and column-wise
data. This isn't so bad, but the code is perhaps a bit convoluted as a
result. But it's a start.

The batch given below implements native DBD::Oracle bind_param_array()
and execute_array() with (I believe) full functionality. I also added an
'ArrayTuple' attribute to execute_array() that allows to pass an array
of tuples directly instead of a fetch_tuple sub or individual column
arrays (essentially exposing ora_execute_array() functionality).

Some other points:

To report errors for individual tuple executions, I use the
OCI_BATCH_ERRORS mode. As far as I have found, this mode is only fully
supported from Oracle 8.1.5 (not sure if that means client or server or
both). I'm not willing to do an alternative implementation for older
Oracle, but if there is a simple way to detect Oracle version in
Oracle.pm, I could fall-back to the DBI default implementation. BTW,
this is all OCI8 only, but I gather that OCI7 support is going away?

There is no support for in_out binding, though I would think that it
could be added later. No SELECT's either. There is also no support for
LOBs, and I am not sure that makes sense anyway.

I could not find any way to get the row counts for individual tuple
executions. So the status array returned by execute_array() will have
all elements as -1, (except for errors which are [errcode, errmsg]).
The total rowcount is returned, though.

Once the overall implementation and interface into DBI has been settled
a bit more I will of course add test cases and documentation as needed.

 - Kristian.

[Patch from 'cvs rdiff -u' appended.]

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S

---
Index: source/perlmod/DBD-Oracle/Oracle.pm
diff -u source/perlmod/DBD-Oracle/Oracle.pm:1.2 
source/perlmod/DBD-Oracle/Oracle.pm:1.2.2.4
--- source/perlmod/DBD-Oracle/Oracle.pm:1.2 Wed May 14 12:53:04 2003
+++ source/perlmod/DBD-Oracle/Oracle.pm Thu Jun 26 15:24:23 2003
@@ -825,7 +825,142 @@
 
 {   package DBD::Oracle::st; # == STATEMENT ==
 
-# all done in XS
+sub bind_param_array {
+my ($sth, $ph, $vals, $attr) = @_;
+
+   return $sth->DBI::set_err(1, "Value for parameter $p_id must be a ".
+  "scalar or an arrayref, not a ".
+  ref($vals_array))
+   if defined $vals and ref $vals and ref $vals ne 'ARRAY';
+
+   # get/create arrayref to hold params
+   my $hash_of_arrays = $sth->{ParamArrays} ||= { };
+
+# Promote scalar to singleton arrayref (later promoted to full-length
+# arrayref if needed).
+$vals = [ $vals ] unless ref($vals) eq 'ARRAY';
+
+# Check that input has same length as existing binds.
+# Promote singletons to full length.
+foreach (keys %$hash_of_arrays) {
+my $v = $hash_of_arrays->{$_};
+if(@$vals != @$v) {
+if(@$v == 1) {
+$hash_of_arrays->{$_} = [ map $v->[0], ([EMAIL PROTECTED]) ];
+} elsif(@$vals == 1) {
+$vals = [ map $vals->[0], ([EMAIL PROTECTED]) ];
+} else {
+return $sth->DBI::set_err(1,
+"Arrayref for parameter $ph has "[EMAIL PROTECTED]" elements"
+." but parameter $_ has "[EMAIL PROTECTED]);
+}
+}
+}
+$hash_of_arrays->{$ph} = $vals

Re: row-wise vs column-wise execute_array (was: Hints for implementing execute_array() support in DBD::Oracle?)

2003-06-16 Thread Kristian Nielsen
Tim Bunce <[EMAIL PROTECTED]> writes:

> sub execute_for_fetch {
> my ($fetch_tuple_sub, $tuple_status) = @_;
> while (1) {
>   my @tuple_batch;
>   for (my $batch_size = 1; $batch_size-- > 0; ) {
>   push @tuple_batch, $fetch_tuple_sub->() || last;
>   }
>   last unless @tuple_batch;
>   $sth->ora_execute_array([EMAIL PROTECTED], \my @tuple_batch_status);
>   push @$tuple_status, @tuple_batch_status;
> }
> }

Maybe that should be

push @tuple_batch, [ @{$fetch_tuple_sub->() || last} ];

in case $fetch_tuple_sub wants to return the same reference every time
(or document that it should not do that).

> Taking a step back for a moment... The switch from column-wise
> arrays in execute_array() to row-wise arrays in ora_execute_array()
> is interesting.
> 
> Perhaps there's scope for the DBI to add a new execute_* method to
> do what ora_execute_array is doing here. Take a simple set of tuples
> (as array of tuple arrays) plus a status array.
> 
> Then I could change the default execute_for_fetch method in the DBI
> to call that one. Driver authors would then have more options in
> what they override.

I was wondering... how will ora_execute_array() work with bind
attributes (ie. TYPE => xxx, ora_type => xxx, ...)?

Will there be calls to bind_param() with dummy values, like

$sth->bind_param(1, undef, { ora_type => 97 });
$sth->ora_execute_array([ ['a'], ['b'], ['c'] ], \ my @statuses);

with the OCIBindDynamic callback somehow accessing the type information?

I haven't yet the overview of the whole bind/execute implementation in
DBD::Oracle to have a firm opinion on this issues yet, really. It seems
that for ora_execute_array(), the logic in bind_param() must be split,
with the type-related part (OCIBindByName(), ...) going in
ora_execute_array(), and the value-related part (SV conversions ...) in
the OCIBindDynamic callback.

In terms of bind semantics, column-wise (using bind_param_array() and
execute_array()) seems kind of simpler than row-wise, because of the
explicit bind.

Any thoughts?

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Re: Hints for implementing execute_array() support in DBD::Oracle?

2003-06-10 Thread Kristian Nielsen
Tim Bunce <[EMAIL PROTECTED]> writes:

> Seeing your message has prompted me to add something I'd been meaning
> to do for a while that will simplify this somewhat:
> 
> =item C
>  
>   $rv = $sth->execute_for_fetch($fetch_tuple_sub);
>   $rv = $sth->execute_for_fetch($fetch_tuple_sub, [EMAIL PROTECTED]);

> So now drivers wanting to support bulk operations do not need to
> implement the full execute_array API, they just need to implement
> execute_for_fetch().
> 
> For DBD::Oracle you need to setup a callback so *Oracle* will call
> the $fetch_tuple_sub code ref (via a C wrapper) to get the next row :)

Sounds great! However, after reading OCI docs a bit, I am not sure this
will work. The problem is that, as far as I can tell, OCIStmtExecute()
expects to be told the number of bind values up front; whereas if using
execute_for_fetch() as proposed, the value count is not determined until
the last callback call returns undef:

sword OCIStmtExecute (..., ub4 iters, ...)
...
iters (IN)

For non-SELECT statements, the number of times this statement is
executed is equal to iters - rowoff.
...

Did you have some particular OCI feature in mind (that I overlooked)
which would allow array execute where the count is not known in advance?

If not, two immediate options spring to mind:

1. Change the execute_for_fetch() interface, passing the count:

  $rv = $sth->execute_for_fetch($count,
$fetch_tuple_sub,
[EMAIL PROTECTED]);

2. Implement execute_array() (and bind_param_array()) directly
   in DBD::Oracle, to the extend possible.

What do you think?

> I don't have time to help much but if you ask me clue-full questions
> as you go along then I'll help where I can.

Thanks Tim, what you wrote was just the kind of hint I was looking
for.

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S



Hints for implementing execute_array() support in DBD::Oracle?

2003-06-05 Thread Kristian Nielsen
Hi,

We could use proper execute_array() support in DBD::Oracle, and having
already fiddled a bit with the code I am considering having a go at
implementing it (no promises though).

Not being sure where to start, I was hoping to get a hint or two? Just
something simple like "You need to implement such-and-such function in
foo.xs" and "look at functions so-and-so in DBD::XXX for an example"
would be perfect to get me going.

Any pointers?

 - Kristian.

-- 
Kristian Nielsen   [EMAIL PROTECTED]
Development Manager, Sifira A/S