Re: rfc: DBI::Pool prototype (i.e. hack)

Tim Bunce Thu, 27 Feb 2003 02:30:24 -0800

On Thu, Feb 27, 2003 at 11:33:54AM +1100, Stas Bekman wrote:
> Tim Bunce wrote:
> 
> >>>The 'implementors data' (after the DBI common data at the start) holds
> >>>data that's private to the driver, like pointers to the database
> >>>API objects like connection handles etc.
> >>
> >>That's exactly what I'm sharing. Though we have to agree that if the DBD 
> >>chooses to include anything that has to do with Perl (SVs, references to 
> >>my_perl, etc) this DBD can't participate in the world domination scheme. 
> >>;)
> >
> >
> >Since a driver only shares implementors data with itself it'll know
> >what it can and can't use. Since I'm proposing the the driver copy
> >the implementors data and not just point to it then, at the moment
> >the copying is happening then even data stored in non-magical SV's
> >ought to be safe to access because, given the pool model, no other
> >thread would be touching it at the same time.
> 
> I'd first try to get away without copying if possible. And pursue the 
> copying next if that doesn't work.


Er, why? Isn't the only alternative to copying, pointing?
And isn't pointing worse?


> >>>On the borrowers side we need to pass in the borrowed ID to the
> >>>connect() call so the driver can use it. For example:
> >>>my $dbh => DBI->connect($dsn, $user, $pass, { UseID => $id });
> >>
> >>Nuh, it can be transparent for the user API. Just lock the pool, check 
> >>whether there are free items, borrow if any or do a normal login and 
> >>store otherwise. I don't think the top level connect() API should change 
> >>at all, other than finding a way to turn the pooling on/off.
> >
> >
> >But how do you tell the driver that it should be reusing the
> >implementors data from an exisiting connection?
> 
> I don't. My design is to make this transparent to the drivers. As you can 
> see from my code, I overload the driver's _login function and accomplish 
> what I need.

That presumes that all drivers use a _login function, which may not
be true.  And I think that trying to be transparent to the drivers
will never be safe enough. There's always a strong possibility that
the implementors data contains SV* etc's the are owned by a different
interpreter. On the other hand, the driver changes required are
minor and I doubt there would be a problem getting them implemented.


> >>>For safety and simplisity it would seem best to copy the implementors
> >>>data structure and overwrite anything that needs overwriting.
> >>
> >>Yes, but the problem is when $dbh is destroyed the private implementation 
> >>is destroyed as well. So no matter if you copy it or not, if we have no 
> >>way to prevent the destruction/cleanup of the implementators data, we are 
> >>in trouble. That's one of the problems that I'm stumbled with.
> >
> >
> >Hence my proposal to have a flag on the handle to say it's been
> >borrowed.  If it's destroyed in that case then it ought not to
> >cleanup the implementators data (just warn).
> >
> >Ah, actually the InactiveDestroy flag could serve that purpose for
> >the time being (but without the warning). Cute.
> 
> So, you say that we absolutely must have drivers to co-operate with DBI to 
> accomplish that. I thought that it'd be possible for DBI simply to skip the 
> destruction calls to the drivers data.

Setting the existing InactiveDestroy attribute will achieve that much.
Using a new flag would just allow us to fine-tune the semantics, like
issuing a warning.

> >>>>These are the open issues:
> >>>>
> >>>>2. I need a support from DBI to help me access the *really* private
> >>>>data in struct imp_dbh_st, because the following is a hack:
> >>>>
> >>>> D_imp_dbh(dbh);
> >>>> imp_dbh->mysql = ((imp_dbh_t *)imp_dbh_new)->mysql;
> >>>
> >>>Perhaps, but why exactly are you calling it a hack?
> >>
> >>because it calls ->mysql.
> >
> >That's why the driver itself needs to be the one to do the copying
> >of the 'template' implementors data. Only it can know what's needed.
> 
> I'm not 100% sure. Here is what I've figured from messing with debugger:
> 
> The driver knows the size of imd_dbh_st. DBI knows the size of the first 
> member of imd_dbh_st (or am I wrong?).
> So we could have a driver function 
> which gives us an access to the point where the "really" private data 
> starts.

Yes, and yes.

But only the driver knows what's in that data and thus what needs to be
re-initialised for use in a different thread - like SVs.

> >(Although I propose to let the DBI do the copying when it alloc's
> >the implementors data structure and set a flag for the driver which
> >could then skip most of it's own setup knowing that the struct
> >elements have been copied already.)
> 
> well, that's exactly what I was after ;)

:)

> >Um, the allocates/finishes/puts seems wrong. The pool thread is the only
> >one to make real connections (in response to requests). Having made the
> >connection the imp data is then 'borrowed' and passed back to the thread
> >that requested the connection. That requesting thread then does a
> >  $dbh = DBI->connect(..., { UseID => $id })
> >to get a $dbh in it's own thread but using the imp data from the pool.
> 
> So you are proposing to use the normal grow() concept. Which introduces two 
> problems. The first just inconvenience as you have to pass the arguments 
> and handle failures, etc... The second is the serialization of connect(). 
> What you suggest is the following:
> 
> the pool is empty, let's grow it:
> 
> lock # here all other threads block
> switch the perl context to the parent (pool) perl thread;
> connect ();
> pool the object and lend it to the requesting thread
> switch the perl context back to thread's perl
> unlock
> 
> this could be quite a bottleneck.

The key point being that the connect could be slow while holding the lock.
Okay, I see that.

> in my design, each thread does its own connect and then deposit the object 
> to the pool. No bottleneck whatsoever. Other than the pool management 
> blocking.

Umm [thinks] okay, there's no reason why the connections in the pool
can't themselves be created via
        $dbh = DBI->connect(..., { UseID => $id })
where the $id has come from a connection created outside the pool.

We just need a way to 'take' imp data instead of 'borrow' it, so
the handle that's left behind is effectively dead and won't warn
when it's destroyed. That may be as simple as also setting
InactiveDestroy.


> >>now that it exits, chances are that the data in pool may go invalid.
> >
> >"data in pool may go invalid" Eh?
> 
> remember, I'm talking about threads allocating memory, not the parent 
> thread. When the thread exists depending on how the memory was allocated, 
> the memory could get freed.

So copy and reinit as needed, don't point.


Let's see if I can summarize where we're at (from the DBI prespective)...

  my $id = $dbh->borrow_imp_data;

        Sets a flag to inidcate that the imp data is in use by
        another thread.  Returns a token for the imp data, possibly
        simply the address, but could also be the whole imp data
        structure. Using the whole struct is appealing because it
        avoids a race hazard - if borrow_id sets a flag before
        returning then whichever thread uses the id will see that
        flag already set.

  my $id = $dbh->take_imp_data;

        As above but indicates that the imp data won't be returned
        and the handle is effectively dead.

  $dbh->restore_imp_data;

        Indicates that the thread that borrowed the imp data has
        finished using it.

  $dbh = DBI->connect(..., { UseImpData => $id })

        Creates a new dbh and initialises it with a copy of the given imp data.
        Also sets a flag to indicate that the imp data is borrowed.

Does that give you all the mechanisms you need to build on?

Tim.

Re: rfc: DBI::Pool prototype (i.e. hack)

Reply via email to