On Wed, Feb 26, 2003 at 04:36:04PM +1100, Stas Bekman wrote:
> The problem:
>
> ithreads-enabled Perl cannot share nested (Perl/C) datastructures. If it
> could the issue of dbi-pool would be a no-brainer since all we needed to do
> is $dbh ||= connect();

(There's also tied magic to consider. A DBI handle is more than a nested
datastructure. But you know that already.)

> Final (semi-working) solution:

I'm really glad you're doing this Stas, and your timing is good as I'm
actively working on the DBI again now.

Before we dig into the fine implementation details I'd like to review
the high-level concepts first to make sure we're "on the same page".

> as discussed with Tim and Hugo back at TPC (Jul 2002), the only
> possible solution is to share only the driver's private part of dbh.

A DBI handle is made up of four parts:

  The outer handle (ref to tied hash).
  The inner handle (ref to plain hash holding attribute cache).
  The 'implementors data' (attached by 'magic' to the inner hash).
  The 'DBI common data' at the start of the 'implementors data'.

The 'DBI common data' holds various flags (like RaiseError, AutoCommit
etc), pointers (like the stash of the driver class), and other info
like the current recursive call_depth and how many kids the handle has.
And there's also a bunch of pointers to certain attribute values that
are just there for performance.

[Currently the DBI common data is embedded directly into the
implementors data structure rather than being pointed to by a pointer
in the implementors data structure. I've had that penciled in as
something to change for a v2.x DBI. It would probably make your life
simpler.]

The 'implementors data' (after the DBI common data at the start) holds
data that's private to the driver, such as pointers to database API
objects like connection handles.

Now, back to the threads... as I recall, the idea we worked out was
that when a new handle is being created it could be told to copy/share
the implementors data from some other handle instead of initializing
its own.
Something like:

  my $dbh2 = DBI->connect($dsn, $user, $pass, { CloneHandle => $dbh1 });

The effect would be that $dbh2 would be a completely new handle in all
respects except that it would not actually have issued a database API
connect call, it would have just copied the implementors data from $dbh1.

[In practice we may pass some other attribute name with some other
value that isn't an actual $dbh but something extracted from the $dbh,
possibly in another thread. Something like:

  my $internal_id = $dbh1->internal_id;                           # in thread 1
  my $dbh2 = DBI->connect(..., { CloneInternal => $internal_id }); # in thread 2

But I wanted to keep the initial example simple.]

As it happens, a very similar concept has already been implemented
by Gerald Richter in DBD::Oracle 1.13 (not yet released, sadly).
It works like this:

  our $orashr : shared = '' ;
  $dbh = DBI->connect($dsn, $user, $passwd, { ora_dbh_share => \$orashr }) ;

The first connect sees $orashr as false and so does a proper connection
and then sets $orashr to a copy of the implementors data structure.
Subsequent connects see $orashr set and initialise their own
implementors data structure from $orashr.

That seems to work for active concurrent sharing across threads but may
not fit well into a pool model (and may not be thread safe for some
drivers).

I think of a pool as something that stores things while they're not
being used and can 'loan them out' to be used for a while before they
are returned to the pool. While an item is 'out' it can't be given out
to any other requestor.

I think this is the model we spoke about at TPC, where the pool holds
database connections that are loaned out for use by a request before
being returned. If two threads both request connections at the same
time then the pool will grow to have two connections.

In this model a connection is only ever used by a single thread at any
one time.
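To make the loan-out model concrete, here is a minimal single-threaded sketch in plain Perl. Everything in it is made up for illustration: LoanPool, borrow, give_back, idle_count and out_count are hypothetical names, not DBI API, and the 'connections' are just hashes with an id. A threaded version would also need locking around the pool, which is not shown.

```perl
use strict;
use warnings;

# Hypothetical pool: holds opaque items and loans each one out to at
# most one borrower at a time. Not part of the DBI.
package LoanPool;

sub new {
    my ($class, %args) = @_;
    my $self = {
        create => $args{create},   # callback that makes a new item
        idle   => [],              # items currently sitting in the pool
        out    => 0,               # number of items currently on loan
    };
    return bless $self, $class;
}

# Hand out an idle item, or create a new one if none is idle.
sub borrow {
    my ($self) = @_;
    my $item = pop @{ $self->{idle} };
    $item = $self->{create}->() unless defined $item;
    $self->{out}++;
    return $item;
}

# Return a previously borrowed item so it can be loaned out again.
sub give_back {
    my ($self, $item) = @_;
    $self->{out}--;
    push @{ $self->{idle} }, $item;
    return;
}

sub idle_count { scalar @{ $_[0]{idle} } }
sub out_count  { $_[0]{out} }

package main;

my $next_id = 0;
my $pool = LoanPool->new(create => sub { return { id => ++$next_id } });

my $c1 = $pool->borrow;    # pool is empty, so a new item is created
my $c2 = $pool->borrow;    # first item is still out, so the pool grows to two
$pool->give_back($c1);
my $c3 = $pool->borrow;    # reuses the returned item, no third item is created
```

The point of the sketch is only the invariant: an item is either idle in the pool or held by exactly one borrower, never both.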
That restriction makes it much safer and more widely useable across
drivers because the underlying database API does not need to be thread
safe *in its handling of multiple threads using a single connection
concurrently*. Oracle is, recent mysql might be, but I doubt many
others are.

So, in this scenario we want to allow a handle to 'loan out' use of its
implementors data and for another handle to be created and initialised
to use that 'borrowed' implementors data before finally returning it.

Let's look at the 'loan out' first. We need a method that returns some
value that represents (points to) the implementors data, let's call it
the ID, and that also puts the handle into a 'brain dead' state.
For example:

  my $id = $h->borrow_id;

And another method to say the implementors data is no longer being used
elsewhere and clear the 'brain dead' flag. (The flag will be used to
prevent the handle being used to do anything while it's brain dead.)
For example:

  $h->restore_id;

On the borrower's side we need to pass the borrowed ID in to the
connect() call so the driver can use it. For example:

  my $dbh = DBI->connect($dsn, $user, $pass, { UseID => $id });

For safety and simplicity it would seem best to copy the implementors
data structure and overwrite anything that needs overwriting.

I don't think anything else is essential to the design but I could
easily have missed lots of issues.

(It would be very nice if we could find some way to automatically
restore the id if the thread that borrowed it dies or exits without
returning it.)

Nothing in the design requires the use of threads. The borrowing,
using, and restoring of implementors data can be done between handles
in the same thread or in an unthreaded perl.

> These are the open issues:
>
> 2. I need a support from DBI to help me access the *really* private
>    data in struct imp_dbh_st, because the following is a hack:
>
>      D_imp_dbh(dbh);
>      imp_dbh->mysql = ((imp_dbh_t *)imp_dbh_new)->mysql;

Perhaps, but why exactly are you calling it a hack?

>    When I re-install the stored dbh,

What do you mean by 're-install' here?

>    I must not break the ->com
>    structure, but overwrite the rest. So I guess the right approach is
>    to copy away the original ->com, overwrite the whole imp_dbh and
>    then copy back the original ->com. Also I'd prefer to store in the
>    pool only the really private data. I guess all I need is to know
>    the size of ->com struct with its sub-structs, preferrably at
>    compile time.

If the \%attr containing { UseID => $id } is passed into
DBI::_new_dbh(...) by the driver then the DBI can look after copying
the given implementors data into the new handle's implementors data
structure that it's setting up. If it also sets a 'HAS_COPIED_ID' flag
then all the driver has to do in its _login sub is check for the flag
and, if set, skip almost all of the normal connection setup.

> 3. $dbh->DESTROY. Currently I had to:
>      SvREFCNT_inc(dbh);
>    so imp_dbh won't lose it's data when $dbh goes out of scope, I have
>    tried copying it but wasn't very successful. Neither playing with
>    DBIc_FLAGS(imp_dbh) helped, but that's probably because I'm not
>    very familiar with DBI guts. You help is needed here.

I'm not sure what you're doing here and it may not be relevant with
the model I've outlined above.

> 5. Finally, the most important issue is that if a thread logged in for
>    real and created imp_dbh, it must not exit while other threads use
>    the same data.

Must not exit because the other threads are *pointing* to its
implementors data rather than using a copy? (And if it exits the memory
will be freed.)

Tim.