I did implement what I described below, which I believed to work well
unless it is the first row that needs the biggest buffer.  However, I
ran into a problem.  I keep getting incorrect results when I use the
coalesce() function.  Usually coalesce() will simply return the first
argument, and if this is the case before I realloc my buffer so it
only contains the columns seen so far, then I run into a problem when
coalesce() later needs to look past its first argument if that column
no longer exists in my buffer.

I believed that I would know which columns to care about after the
first row had been completed, but that's not a safe assumptation.  On
the other hand, it would be a waste to compute values that coalesce()
will discard, so this behaviour makes sense.  I could run through
xColumn twice for the whole result table: once to determine which
columns to care about and once again to do the actual computations.
That will give me a tradeoff between speed and memory usage.

-Steinar

Steinar Midtskogen <stei...@latinitas.org> writes:

> Thanks to Dan and Roger for the information and suggestions.  I have
> about 100 columns (and millions of rows), but usually queries will
> only ask for a few columns.
>
> I think I know a way to work around this for my case.  Basically, I
> don't know the exact size required for my lookahead buffer (currently
> allocated in xFilter), so I probably need to make that size dynamic
> anyway rather than to allocate the maximum possible size.  I don't
> know the size because I need to look a certain time ahead (in the
> timestamp column) and I don't know how many rows that will be.  So if
> xColumn finds out that it hasn't enough lookahead data to do its
> computation, I should resize and read more.  But at that time (unless
> it's the first row), I can know what the columns are that I'm going to
> need.
>
> So I can start out with a buffer with all the columns but few rows,
> and then increase the number of rows when needed and hopefully reduce
> it to only the interesting columns at the same time.
>
> It will require a bit of bookkeeping, but seems doable.
>
> -Steinar
>
> Roger Binns <rog...@rogerbinns.com> writes:
>
>> On 23/02/12 23:02, Steinar Midtskogen wrote:
>>> I know that xColumn will only get called for these columns.
>>
>> As Dan said there isn't a direct way of knowing.  There is a reasonably
>> convenient workaround of having multiple virtual tables with different
>> subsets of the columns but all returning the same underlying data.
>>
>> Something else to look at are hidden columns.  See section 2.1.1 in
>> http://www.sqlite.org/vtab.html
>>
>> With that you can make the "cheap" columns regular and the expensive ones
>> hidden so that the SQL query explicitly has to ask for them.  In your vtab
>> implementation you can always start out by only getting the cheap columns
>> until xColumn tells you that one of the hidden/expensive ones is needed
>> and then go off and redo the underlying query.
>>
>> Roger
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to