On Sun, Jan 29, 2012 at 06:12:50PM +0000, Dave Mitchell wrote:
>
> I've now written some working caching code, that reduces CPU usage on
> while ($sth->fetch()) {$c++} from 15.23s to 13.10s, which is a 14%
> saving!! The test code just fetches millions of rows from a 2-int table
> and then doesn't do anything real with the returned data, so a real-world
> example will see a smaller saving, but I'm still very pleased with it :-)
Nice.
> It's not ready for use yet; it passes the DBI test suite, but:
> * I haven't tested it under ithreads or early perls yet;
> * I haven't tried it against the DBD::mysql test suite yet;
> * I haven't written any tests yet for the method cache invalidation,
> e.g. if code does *DBD::mysql::st::foo = sub {...},
> will subsequent calls to $sth->foo() call the new function?
>
> As regards that last point, can you recommend where I should place the new
> tests? Is there an existing test script that they can be added to, or if a
> new script needs adding, what should it be called, and is there a good
> existing script to use as a template?
The DBI test suite is, sadly, rather crufty. Lots of test files have far
outgrown their original scope.
I'd reccomend starting a new one, say 31methcache.t. You could use
DBD::Sponge (as many tests do, e.g. 30subclass.t) and create two
statement handles each preloaded with different data.
Then fetchrow_arrayref from one and fetchrow_hashref from the other,
for example, checking you get back the right data with the right
kind of reference. Just a thought.
> The code itself (see diff below) just attaches some magic to the outer CV
> that caches the GV of the corresponding inner method.
Any reason for not using the existing 'internal method attribute'
mechanism? _install_method() already allocates a structure and
attaches it the the CV using CvXSUBANY(cv).any_ptr. You could add the
three elements of method_cache_t directly into dbi_ima_t. Then, since
the dispatch code already has the structure pointer (ima) handy, your
cache code could avoid the mg_findext call and use ima->method_cache
directly.
> The caching code *does* assume that the name of the outer sub never
> changes, i.e. that GvNAME(CvGV(cv)) in XS_DBI_dispatch is constant WRT cv;
> is this a reasonable assumption?
Yes, I think so.
Though some kind of 'kill switch' to disable the cache might be handy,
in case odd things are happening and the cache is suspected.
> The code makes use of non-API knowledge about PL_generation and
> HvMROMETA(stash)->cache_gen, which could conceivably change in a later
> perl release.
Which seems like a good argument for extending the perl API to cover
that kind of functionality.
> > PS - I'm specifically being paid only to fix a performance issue on
> > non-threaded builds, so I won't be looking at threaded issues. But
> > dPERINTERP looks like bad news on threaded builds.
>
> I've since been told that a) I'm allowed to say who's funding me:
> it's booking.com (thanks, guys!); and that b) depending on how much time I
> burn on the cache issue, I may have time to look at threaded issues too,
> specifically the dPERINTERP rewrite using MY_CXT - assuming that no one
> beats me to it!
That would be great. Thanks again Dave.
> Dave.
>
>
> Index: DBI.xs
Looks good.
Tim.