[sqlite] xBestIndex() implementation question

Hick Gunter Fri, 15 May 2015 06:45:06 +0000

You are not supposed to know the value on the RHS because that would lure you 
into performing lots of work that you should not be doing in the query planning 
step. Remember that xBestIndex may be called multiple times with different 
combinations of constraints. Running the equivalent of "select count() from 
<table> where <contraints>" for each call may turn out to be more costly than 
the query itself.


The edge cases are simple (assuming equality constraints):

No (usable) constraints -> full table scan of n rows -> rows = n
Complete unique key -> key lookup in n rows -> rows = 1

Absent cardinality info on the fields (i.e. select count() from (select unique 
<field> from <table>); type info), I think it would be best to assume that each 
field contributes the same factor in reducing the number of rows.

For a key prefix of c out of k fields in a table containing n rows -> partial 
key scan -> rows = n ^^ (c/k)

If you do have cardinality information, rows = n * (product of constraint 
cardinalities) / (product of key field cardinalities)

For inequality constraints (like your id > 50), the best you can assume is that 
each such constraint will, on average, exactly bisect the result set.

If your virtual table implements indexed access of some sort, the cost will be 
the sum of locating the first record, typically O(log n), plus the number of 
rows estimated to be retrieved.
Without an indexed access, the cost will be constant at that of  a full table 
scan, typically O(n).

Again, you are expected to return *estimates* based on information whose 
retrieval cost are negligible relative to the total cost of the query (which 
usually translates to "stored in the virtual table's structure information").

Gunter
-----Urspr?ngliche Nachricht-----
Von: Jilong Kuang [mailto:jilong.kuang at samsung.com]
Gesendet: Donnerstag, 14. Mai 2015 04:05
An: sqlite-users at mailinglists.sqlite.org
Betreff: [sqlite] xBestIndex() implementation question

Hello,

I'm working on a project using SQLite virtual table. Now I have a problem about 
the implementation of xBestIndex() function, in particular, the estimatedRow 
and estimatedCost variables.

As the aConstraint array does not contain the RHS expression value for each 
constraint (only iColumn and op), how am I supposed to deduce the appropriate 
value for both estimatedRow and estimatedCost?

For example, let's say one constraint "...Where id > 50...". The value of 50 is 
not passed into the sqlite3_index_info struct. Can you explain how to handle 
this situation?

I just want to have a more accurate cardinality estimation to improve query 
performance. But I do not know how to do it without the value field.

Thank you very much for your time.

Best regards,
Jilong
_______________________________________________
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: hick at scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.

[sqlite] xBestIndex() implementation question

Reply via email to