Just as an FYI -- I was thinking about the same thing recently. That is:
moving DBD::ODBC to use SQLExtendedFetch and allowing multiple rows to be
cached locally, hopefully with fewer network round trips ;)
I'd like to work towards implementing direct support for this in DBI (unless
I'm missing something), as it could be useful for other drivers, if it's at
all possible... just along the lines of handling the cache itself, I assume.
At a high level, DBI would call fetch(..., nrows), with the driver indicating
how many rows were actually read. The "default" implementation could be to
loop around dbd_st_fetch() unless over-ridden? Then it's up to the driver to
provide:
a) an optimization for the multi-row fetch -- if applicable
b) a function to retrieve a row from the cached rows (presuming that each
driver will organize the cached rows differently??)
I'm thinking that the driver will still have to manage the cached rows to
some degree, since the data organization could be vastly different for each
driver. However, DBI should manage the process, if possible.
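Roughly, I'm picturing a fallback along these lines (a pure-Perl sketch only,
to show the shape of it -- the fetch_multi name and the cache layout are made
up for illustration, nothing that exists in DBI today):

    # Hypothetical default: fill the row cache by looping the driver's
    # existing single-row fetch.  A driver that can do a real multi-row
    # fetch (e.g. via SQLExtendedFetch) would override this.
    sub fetch_multi {
        my ($sth, $nrows) = @_;
        my @cache;
        while (@cache < $nrows) {
            my $row = $sth->fetch or last;   # dbd_st_fetch() underneath
            push @cache, [ @$row ];          # copy; fetch() reuses its buffer
        }
        return \@cache;    # caller can see how many rows actually came back
    }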
Some drivers could retrieve data into an array/buffer for each column, while
others could build them on a per-row basis. DBD::ODBC, for example, currently
builds a buffer for the row, but that would have to change, I think... I
believe the buffers for the columns must be consecutive in memory, at the
very least simulating an array for each column (regardless of how the memory
is allocated).
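Just to illustrate the two organizations I mean, with Perl data structures
standing in for whatever the driver would actually allocate in C (purely
illustrative, not how any driver stores things today):

    # Row-wise cache: one entry per row (roughly what DBD::ODBC builds now)
    my @row_cache = (
        [ 1, 'alpha' ],
        [ 2, 'beta'  ],
    );

    # Column-wise cache: one array per column, indexed by row number
    my %col_cache = (
        id   => [ 1, 2 ],
        name => [ 'alpha', 'beta' ],
    );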
Thoughts/comments?
Jeff
>
>
> Dear Tim,
>
> I spotted the discussion about using these attributes to improve the
> performance of DBD::Oracle, and woke up since I don't remember them
> being discussed before.
>
> Running 'perldoc DBI' for DBI 1.21 gives:
>
> [...]
> "RowCacheSize" (integer)
> A hint to the driver indicating the size of the local
> row cache that the application would like the driver to
> use for future "SELECT" statements. If a row cache is
> not implemented, then setting "RowCacheSize" is ignored
> and getting the value returns "undef".
>
> Some "RowCacheSize" values have special meaning, as follows:
>
> 0 - Automatically determine a reasonable cache size
> for each C<SELECT>
> 1 - Disable the local row cache
> >1 - Cache this many rows
> <0 - Cache as many rows that will fit into this much
> memory for each C<SELECT>.
>
> Note that large cache sizes may require a very large
> amount of memory (cached rows * maximum size of row).
> Also, a large cache will cause a longer delay not only
> for the first fetch, but also whenever the cache needs
> refilling.
>
> See also the "RowsInCache" statement handle attribute.
>
> [...]
> "RowsInCache" (integer, read-only)
> If the driver supports a local row cache for "SELECT"
> statements, then this attribute holds the number of un-
> fetched rows in the cache. If the driver doesn't, then
> it returns "undef". Note that some drivers pre-fetch
> rows on execute, whereas others wait till the first
> fetch.
>
> See also the "RowCacheSize" database handle attribute.
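>
> As I read it, the intended use from application code is something like
> this (untested sketch, just my reading of the documentation above):
>
>     $dbh->{RowCacheSize} = 100;   # hint: cache about 100 rows per SELECT
>     my $sth = $dbh->prepare('SELECT id, name FROM customer');
>     $sth->execute;
>     while (my @row = $sth->fetchrow_array) {
>         # undef if the driver has no row cache, else rows still buffered
>         my $in_cache = $sth->{RowsInCache};
>     }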
>
> I assume that we aren't expected to deal in negative memory, and
> the comments for RowCacheSize <0 should say something about
> 'absolute value'. Is the memory size specified in bytes or some
> multiple of bytes?
>
> I've got some questions for you - how could I implement any support for
> that in DBD::Informix given the constraints imposed by Informix? I'd
> like to do it for the same reasons other drivers do it, but I can't
> easily match the semantics of the RowCacheSize and RowsInCache
> attributes.
>
> What do I get to play with? There's a fetch buffer size variable that
> is used when the (ESQL/C, equivalent to ODBC) FETCH statement is
> executed; it defaults to a value such as 4 KB. Within broad limits, as
> many rows as can fit will be fetched into that buffer. One of the
> exceptions is when the rows are locked because the cursor is qualified
> FOR UPDATE; then you get one row at a time. Even if I set the value
> (which I'd cheerfully do), I won't really be able to say how many rows
> are in the cache. I can guess, but the precise information is not
> revealed to the client code.
>
> There is also some array fetching technology which could perhaps be
> used, but it requires me to do a rewrite of the code to use 'the other'
> dynamic SQL mechanism for describing data. (DBD::Informix currently
> uses the standardized SQL DESCRIPTORS; the other mechanism is known as
> 'sqlda'). However, the array fetching mechanism does allow the user to
> specify the number of rows to be fetched at a time, and the rewrite has
> been (still is) on my list of desirable things to do. There will still
> be an upper limit on the number of rows that can be prefetched (based on
> the maximum fetch buffer size of 32 KB and the row size). But the
> information about how many rows there are in the cache will be
> available explicitly.
>
> Judging from the specification, if the first row has not been fetched,
> then the RowsInCache attribute should return undef; otherwise, its best
> estimate of the number of rows available in the cache. And things like
> fetchall_arrayref() will be able to use the database handle value of
> RowCacheSize.
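>
> Presumably something along these lines, assuming fetchall_arrayref()
> grows (or has) a batch-size argument (untested sketch):
>
>     my $batch = $dbh->{RowCacheSize} || 100;   # fall back if no row cache
>     while (my $rows = $sth->fetchall_arrayref(undef, $batch)) {
>         last unless @$rows;                    # empty batch => done
>         # process up to $batch rows per round trip
>     }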
>
> There are some interesting twists and turns. If the user modifies
> $dbh->{RowCacheSize} after the first fetch op, the array size can't
> change until the next FETCH operation. Or should the size in effect when
> the first fetch is performed remain constant? That is presumably what is
> intended; juggling the memory while the statement is active is possible
> but unduly hairy. Should the statement record the RowCacheSize at the
> time when the statement is prepared and ignore subsequent changes? That
> is mostly consistent. Is the RowCacheSize inherited by the statement?
> I presume so. What happens if the statement's RowCacheSize is changed
> between the prepare and the execute? Are these attributes stored
> automatically by DBI? If so, what mechanism should be documented in the
> DBI::DBD manual so that driver writers can access the fields? Or do the
> drivers have to add the support themselves?
>
> ....hmmm...maybe the real question is "When am I going to find time to
> actually rewrite the code to use sqlda?"...
>