On 03/03/14 03:01, Alek Paunov wrote:
On 02.03.2014 21:38, Elefterios Stamatogiannakis wrote:
Under this view, the efficiency of the virtual table api is very
important. Above query only uses 2 VTs in it, but we have other queries
that use a lot more VTs than that.

Max's tests in C show 2x CPU work, but he explains that the test is not
very sound, so let's say somewhere between 1x and 2x. Your tests show 3x the time.

As you have already identified, the real reason is probably the
million-scale number of callbacks across the VM barrier. I do not follow
PyPy, but see these notes [1] by Mike Pall, the LuaJIT author (LuaJIT is
the leading project in the trace compiler field):

[1] http://luajit.org/ext_ffi_semantics.html#callback_performance

Also from one of the dozens of threads touching the subject:

[2] http://www.freelists.org/post/luajit/Yielding-across-C-boundaries,3

```
Entering the VM needs a lot of state setup and leaving it isn't
free either. Constantly entering and leaving the VM via a callback
from C *to* Lua has a high overhead. For short callbacks, the
switching overhead between C and Lua may completely dominate the
total CPU time.

Calling an iterator written in C via the FFI *from* a Lua program
is much cheaper -- this compiles down to a simple call instruction.
```

I remember that I had seen the above quote from Mike Pall, but I couldn't find it.

Thank you for unearthing it.
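
To make the per-callback cost in the quote concrete, here is a minimal sketch in Python rather than Lua (my choice for illustration only; the standard sqlite3 module stands in for the embedded VM). The names add_one and t are made up. It only counts how often SQLite's C code has to enter the interpreter; it does not measure time:

```python
import sqlite3

calls = 0  # number of C -> Python transitions made by SQLite


def add_one(x):
    """Hypothetical scalar function; every call crosses the C/VM boundary."""
    global calls
    calls += 1
    return x + 1


con = sqlite3.connect(":memory:")
con.create_function("add_one", 1, add_one)
con.execute("create table t(x)")
con.executemany("insert into t values (?)", ((i,) for i in range(1000)))

# One callback into Python per row ...
via_callback = con.execute("select sum(add_one(x)) from t").fetchone()[0]
# ... versus the same arithmetic done entirely inside SQLite's own VM.
native = con.execute("select sum(x + 1) from t").fetchone()[0]

print(calls, via_callback, native)
```

Both queries return the same sum, but the first one enters the interpreter once per row, which is the pattern the quote warns about.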

Unfortunately, for your "insert into t select * from vt" case the
callback/iterator transformation is not possible (we do not have a
repeated _step call with which to invert the control somehow). What to do?

It seems that the easiest optimization for this (very frequent) VT use case
(bulk streaming) would be an SQLite add-on written in _C_ that implements
a specialization of the vtable interface with an xNextPage method
"buffering", let's say, 4K rows or, even better, 16KB of data (in addition
to your initial proposal of xNextRow).
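
A rough sketch of the paging idea, in Python for brevity. xNextPage is only the proposal above and does not exist in the SQLite API; pages and vt_rows are hypothetical helpers, and the page size of 4096 rows is just the "4K rows" figure mentioned. The point is that the consumer crosses the boundary once per page rather than once per row:

```python
from itertools import islice


def pages(rows, page_size=4096):
    """Group an iterable of rows into lists of at most page_size rows."""
    it = iter(rows)
    while True:
        page = list(islice(it, page_size))
        if not page:
            return
        yield page


def vt_rows(n):
    """Stand-in for a virtual table cursor producing n rows."""
    for i in range(n):
        yield (i, "row %d" % i)


crossings = 0  # how many times the consumer asks the producer for data
total = 0      # how many rows were delivered overall
for page in pages(vt_rows(10000)):
    crossings += 1  # one hypothetical xNextPage call
    total += len(page)

print(crossings, total)
```

With 10000 rows and 4096 rows per page, the consumer asks for data three times instead of ten thousand times.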

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension
to gain more traction in the future, it would probably be better to
choose a more standard format.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to the SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats
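
For option a), a minimal sketch of what encoding one row in the SQLite3 record format [3] could look like, written in Python for readability. The function names are made up; the sketch assumes UTF-8 text encoding and only handles varints below 2**56, so treat it as an illustration of the layout (header of serial types followed by the value bodies) rather than a complete encoder:

```python
import struct


def varint(n):
    """SQLite varint for 0 <= n < 2**56: big-endian, 7 bits per byte."""
    if not 0 <= n < 1 << 56:
        raise ValueError("outside the range handled by this sketch")
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))


# (byte width, serial type) pairs for integers, smallest first.
INT_TYPES = [(1, 1), (2, 2), (3, 3), (4, 4), (6, 5), (8, 6)]


def encode_value(v):
    """Return (serial_type, body_bytes) for one Python value."""
    if v is None:
        return 0, b""
    if isinstance(v, int):
        if v in (0, 1):
            return 8 + v, b""  # serial types 8 and 9 carry no body
        for width, serial in INT_TYPES:
            bound = 1 << (8 * width - 1)
            if -bound <= v < bound:
                return serial, v.to_bytes(width, "big", signed=True)
        raise OverflowError("integer does not fit in 64 bits")
    if isinstance(v, float):
        return 7, struct.pack(">d", v)
    if isinstance(v, str):
        data = v.encode("utf-8")
        return 13 + 2 * len(data), data
    if isinstance(v, bytes):
        return 12 + 2 * len(v), v
    raise TypeError("unsupported type: %r" % type(v))


def encode_record(row):
    """Header (size varint + serial types) followed by the value bodies."""
    encoded = [encode_value(v) for v in row]
    types = b"".join(varint(t) for t, _ in encoded)
    size = len(types) + 1  # the header size counts its own varint
    while len(varint(size)) + len(types) != size:
        size += 1
    return varint(size) + types + b"".join(body for _, body in encoded)


print(encode_record([1, "hi", None]).hex())
```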

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted among the developers of SQLite
bindings and add-ons. The possible applications are not limited to
vtables ;-).

SQLite doesn't need any special format for the records to be passed over to it. It already has the "bind" API which would be very suitable for the xNextRow function too.

For a paging API (which IMHO is too complex ATM), the bind API could be extended with a row number parameter.
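
As a small illustration of why no record format is needed on this path: Python's standard sqlite3 module passes each "?" parameter to SQLite through the existing bind API, so values arrive with their native types. The sketch below uses a made-up generator, source_rows, as a stand-in for a virtual table scan; xNextRow itself is only the proposal discussed here and is not used:

```python
import sqlite3


def source_rows():
    """Stand-in for a virtual table scan; values keep their native types."""
    yield (1, 2.5, "text", None, b"\x00\x01")
    yield (2, 3.5, "more", None, b"")


con = sqlite3.connect(":memory:")
con.execute("create table t(a, b, c, d, e)")
# Each row is bound parameter by parameter and then stepped.
con.executemany("insert into t values (?, ?, ?, ?, ?)", source_rows())

types = con.execute(
    "select typeof(a), typeof(b), typeof(c), typeof(d), typeof(e) "
    "from t where a = 1"
).fetchone()
count = con.execute("select count(*) from t").fetchone()[0]

print(types, count)
```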

Regards,

estama


Kind regards,
Alek


