On Thu, Feb 27, 2003 at 09:47:31AM +1100, Stas Bekman wrote: > Tim Bunce wrote: > > >Before we dig into the fine implementation details I'd like to > >review the high-level concepts first to make sure we're "on the > >same page". > > We definitely are.
Great. > >The 'implementors data' (after the DBI common data at the start) holds > >data that's private to the driver, like pointers to the database > >API objects like connection handles etc. > > That's exactly what I'm sharing. Though we have to agree that if the DBD > chooses to include anything that has to do with Perl (SVs, references to > my_perl, etc) this DBD can't participate in the world domination scheme. ;) Since a driver only shares implementors data with itself it'll know what it can and can't use. Since I'm proposing the the driver copy the implementors data and not just point to it then, at the moment the copying is happening then even data stored in non-magical SV's ought to be safe to access because, given the pool model, no other thread would be touching it at the same time. > >The first connect sees $orashr as false and so does a proper connection > >and then sets $orashr to a copy of the implementors data structure. > >Subsequent connects see $orashr set and initialise their own > >implementors data structure from $orashr. > > > >That seems to work for active concurrent sharing across threads but > >may not fit well into a pool model (and may not be thread safe for > >some drivers). > > That's pretty much what my code does. We use locking when borrowing from > the pool and putting items back, so I don't see where this could be thread > unsafe. Reread it. The current DBD::Oracle code lets multiple threads use the same database connection at the same time. It's thread hot. There's no pool checkin/checkout. The locking on the shared var is only to make the initial setup safe. > >In this model a connection is only ever used by a single thread at > >any one time. This makes it much safer and more widely useable across > >drivers because the underlying database API does not need to be > >thread safe *in it's handling of multiple threads using a single > >connection concurrently*. Oracle is, recent mysql might be, but I > >doubt many others are. > > That's absolutely transparent to a driver. If $dbh ||= connect works with a > DBD in question, the pool scheme should work too. Yes, unlike the current DBD::Oracle scheme. (Which is more efficient but less safe in general.) > >Lets look at the 'loan out' first. We need a method to get some value > >that represents (points to) the implementors data, lets call it the ID, > >and also puts the handle into a 'brain dead' state. For example: > > my $id = $h->borrow_id; > >And another method to say the implementors data is no longer being > >used elsewhere and clear the 'brain dead' flag. (The flag will be > >used to prevent the handle being used to do anything while it's > >brain dead.) For example: > > $h->restore_id; > > Well, currently I simply identify the objects by their memory address. So > when the object is borrowed it's moved from the free list to the busy list. > When it's returned the busy list is searched, the matching item is removed > and moved to the free list. > > >On the borrowers side we need to pass in the borrowed ID to the > >connect() call so the driver can use it. For example: > > my $dbh => DBI->connect($dsn, $user, $pass, { UseID => $id }); > > Nuh, it can be transparent for the user API. Just lock the pool, check > whether there are free items, borrow if any or do a normal login and store > otherwise. I don't think the top level connect() API should change at all, > other than finding a way to turn the pooling on/off. But how do you tell the driver that it should be reusing the implementors data from an exisiting connection? > >For safety and simplisity it would seem best to copy the implementors > >data structure and overwrite anything that needs overwriting. > > Yes, but the problem is when $dbh is destroyed the private implementation > is destroyed as well. So no matter if you copy it or not, if we have no way > to prevent the destruction/cleanup of the implementators data, we are in > trouble. That's one of the problems that I'm stumbled with. Hence my proposal to have a flag on the handle to say it's been borrowed. If it's destroyed in that case then it ought not to cleanup the implementators data (just warn). Ah, actually the InactiveDestroy flag could serve that purpose for the time being (but without the warning). Cute. > >(It would be very nice if we could find some way to automatically > >restore the id if the thread that borrowed it died or exits > >without returning it.) > > We could have a garbage collector. Simply store in the pool the thread id > which has borrowed the item (inside the item). Then once in a while scan > the busy list and free any items that we can't find their threads alive. Ah, yes. That'll work. And trigger the scan whenever a new request find the free list empty. > >Nothing in the design requires the use of threads. The borrowing, > >using, and restoring of implementors data can be done between handles > >in the same thread or an unthreaded perl. > > That's correct. The only issue here is to have NOOPs for locking and > sharing, but this is trivial. Sure, but (I think) you're talking about a layer above the DBI. > >>These are the open issues: > >> > >>2. I need a support from DBI to help me access the *really* private > >> data in struct imp_dbh_st, because the following is a hack: > >> > >> D_imp_dbh(dbh); > >> imp_dbh->mysql = ((imp_dbh_t *)imp_dbh_new)->mysql; > > > >Perhaps, but why exactly are you calling it a hack? > > because it calls ->mysql. That's why the driver itself needs to be the one to do the copying of the 'template' implementors data. Only it can know what's needed. (Although I propose to let the DBI do the copying when it alloc's the implementors data structure and set a flag for the driver which could then skip most of it's own setup knowing that the struct elements have been copied already.) > >If the \%attr containing { UseID => $id } is passed into DBI::_new_dbh(...) > >by the driver then the DBI can look after copying the given implementors > >data into the new handle's implementors data structure that it's setting > >up. > > That's the method I'm after. > >If it also sets a 'HAS_COPIED_ID' flag then all the driver has to > >do in its _login sub is check for the flag and, if set, skip almost all > >of the normal connection setup. > > it's more than that. We need a way to tell the driver not to destroy its > private data. That'll be a 'HAS_LOANED_ID' flag :) [but InactiveDestroy would work for now.] > >>5. Finally, the most important issue is that if a thread logged in for > >> real and created imp_dbh, it must not exit while other threads use > >> the same data. > > > >Must not exit because the other threads are *pointing* to it's > >implementors data rather than using a copy? (And if it exits the > >memory will be freed.) > > something like that. It's not about pointing. It's about freeing. Freeing is only a problem if something is pointing at what's been freed :) > thread A allocates imp_dbh, finishes with it and puts it into the pool. Um, the allocates/finishes/puts seems wrong. The pool thread is the only one to make real connections (in response to requests). Having made the connection the imp data is then 'borrowed' and passed back to the thread that requested the connection. That requesting thread then does a $dbh = DBI->connect(..., { UseID => $id }) to get a $dbh in it's own thread but using the imp data from the pool. > now that it exits, chances are that the data in pool may go invalid. "data in pool may go invalid" Eh? > I haven't debugged this issue yet, but it seems like that's what's happening. > What we need to do is to ensure that the data in the pool can survive the > exit of threads that created that data in first place. Only the pool thread should be creating connections for the pool. > First I want to figure out how to tell the driver not to destroy its > private data when $dbh is destroyed. Then I'll be able to pursue this > threads-exit issue. Use InactiveDestroy for now. > Take a look at the code. It's pretty much trivial. No time today. Tim.