[sqlite] Indexing virtual tables

Aladdin Lampé Sat, 17 May 2008 07:42:02 -0700

Hi! Here is what I'm still trying to achieve:
- I have a custom file format having "records" and file offsets.
- Each record in that custom file format has the same number of fields, but the 
records itself are variable length, that's why I need a file offset to quickly 
locate a record. One other way (if you can wait for a very long time...) is to 
walk sequentially the records list to get the desired record.
- I've implemented a working SQLite "virtual table" in order to be able to read 
and query my custom file format through SQLite.
- Now, basically what I'd like would be to "CREATE INDEX" on a field of my 
virtual table to take advantage of it in the xBestIndex callback. But the 
documentation says that we cannot use "CREATE INDEX" on virtual tables.


Let's say the data in the field "F1" of my virtual table "VFILE", and the file 
offsets are the following:
F1              fileoffset
----------------------
a               10
b               21
z               34
x               45
a               51
x               69
z               73
a               88
x               94

I want to index the column F1, to be able to have a quick response to queries 
like:
select * from VFILE where F1='x'

At this point, I think I have only 3 possible strategies:
1. Use SQLite tables to "fake" a standard index using SQLite tables
2. Use internal SQLite B-Tree routines to implement my index 
(sqlite3BtreeCreateTable and stuff)
3. Implement my own B-Tree and sort algorithms to achieve this, externally to 
SQLite

Strategy 3 is precisely what I'm trying to avoid (too much work and testing :-) 
).
Strategy 2 is strongly discouraged by DRH.

Then strategy 1 seems to be (like you've just said) the only way to go:

a) Duplicate the data to be indexed (and the file offsets to use)

create table INDEX1_SORT as select F1, fileoffset from VFILE order by F1, 
fileoffset

INDEX1_SORT: physical table
F1              fileoffset
----------------------
a               10
a               51
a               88
b               21
x               45
x               69
x               94
z               34
z               73

b) Create an index on that data

OPTION 1: Use SQLite CREATE INDEX at this point.
b.1.1) create index on INDEX1_SORT(F1)

OPTION 2: Fake index with custom tables
b.2.1) create table INDEX2_SUM as select F1, min(INDEX1_SORT.rowid) as 
'minrow', max(INDEX1_SORT.rowid) as 'maxrow' from INDEX1_SORT group by F1

INDEX2_SUM: physical table
F1         minrow          maxrow
---------------------------------------
a           1                  3
b           4                  4
x           5                  7
z           8                  9

b.2.2) create index on INDEX_2_SUM(F1)


*********************************************************************

Usage for option 2:
- Use INDEX2_SUM to fetch the requested value in the initial query (select * 
from VFILE where F1='x')
- Get data from table INDEX1_SORT between rowid "minrow" (5) and "maxrow" (7)
- For each line, use the given file offset to locate the real data in the 
custom file format file.
- Read 3 records at fileoffet = 45,69,94 and return them to SQLite.

I really feel like all this is not very optimal.
What is the best strategy to achieve optimal speed and needed storage?
Am I missing a trivial point?

Thank you for any help on that!
Aladdin

> Date: Mon, 12 May 2008 15:37:22 -0700
> From: [EMAIL PROTECTED]
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Indexing virtual tables
>
> I'm not quite clear on your question - why wouldn't you just create
> any indices you need within the virtual-table implementation itself?
> Sort of like how fts uses SQLite tables to implement data-storage for
> the full-text index.
>
> -scott
>
>
> On Mon, May 5, 2008 at 10:13 AM, Aladdin Lampé  wrote:
>>
>> Just thinking again about indexing strategies on virtual tables, I'm 
>> wondering why virtual tables could not be indexed using the "normal" SQLite 
>> command "INDEX". Indeed, I just expected that the data inside the column of 
>> the virtual table could be sequentially scanned (using the "xColumn" 
>> callback), producing the same result as if it were a real table. Is that way 
>> of seeing things flawed?
>>
>> Any hook allowing to use SQLite internal indexing techniques for virtual 
>> tables? Maybe using direct b-tree manipulation (even if I know it's not 
>> recommended)? I'm not very keen on developing my own from stratch. Dealing 
>> with "big" tables that don't fit into memory does not seem so easy because 
>> I'll have to use a temporary disk file...
>>
>> Some help would be greatly appreciated!
>> Aladdin
>>
>> _________________________________________________________________
>> Faites vous de nouveaux amis grâce à l'annuaire des profils Messenger !
>> http://home.services.spaces.live.com/search/?page=searchresults&ss=true&FormId=AdvPeopleSearch&form=SPXFRM&tp=3&sc=2&pg=0&Search.DisplayName=Nom+public&search.gender=&search.age=&Search.FirstName=Pr%C3%A9nom&Search.LastName=Nom&search.location=Lieu&search.occupation=Profession&search.interests=amis&submit=Rechercher
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_________________________________________________________________
Faites vous de nouveaux amis grâce à l'annuaire des profils Messenger !
http://home.services.spaces.live.com/search/?page=searchresults&ss=true&FormId=AdvPeopleSearch&form=SPXFRM&tp=3&sc=2&pg=0&Search.DisplayName=Nom+public&search.gender=&search.age=&Search.FirstName=Pr%C3%A9nom&Search.LastName=Nom&search.location=Lieu&search.occupation=Profession&search.interests=amis&submit=Rechercher
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Indexing virtual tables

Reply via email to