Revisiting a thread from about 3 weeks back, I have another xBestIndex puzzler.
The example query from that thread was:
SELECT t1.rental_date, t1.inventory_id, t8.film_id, t5.title AS Title,
t3."name" AS Category, t4.customer_id, t4.store_id, t4.email,
t5.length
FROM rental t1
LEFT OUTER JOIN inventory t8
ON ( t8.inventory_id = t1.inventory_id )
LEFT OUTER JOIN customer t4
ON ( t4.customer_id = t1.customer_id )
LEFT OUTER JOIN film_category t7
ON ( t7.film_id = t8.film_id )
LEFT OUTER JOIN film t5
ON ( t5.film_id = t8.film_id )
LEFT OUTER JOIN category t3
ON ( t3.category_id = t7.category_id )
LEFT OUTER JOIN film_actor t6
ON ( t6.film_id = t5.film_id )
LEFT OUTER JOIN actor t2
ON ( t2.actor_id = t6.actor_id ) ;
The "money" call to xBestIndex in that case looked like this:
jmpvtab BEST INDEX: Table: inventory nConstraints: 3
CONST[0]: 0 (inventory_id) = Usable
CONST[1]: 1 (film_id) = Usable
CONST[2]: 1 (film_id) = Usable
In response to this, I originally promised to create an index using all 3
constraints (two if which are the same), but SQLite rejected that and did a
table scan instead, because it really only needed an index based on
inventory_id. So the heuristic we ended up with, which did fix this specific
query, was (from Gunter):
> If you know nothing about a table except for the names of the fields
> and the number of rows, then you are best off choosing
> the first constraint only. (rows = cost = log n)
Well, now I have a counter-example. This time, the SQL looks like this:
SELECT t2.rental_id, t2.rental_date, t1.film_id, t3.title,
t4.category_id, t5."name"
FROM rental t2
LEFT OUTER JOIN inventory t1
ON ( t2.inventory_id = t1.inventory_id )
LEFT OUTER JOIN film t3
ON ( t3.film_id = t1.film_id )
LEFT OUTER JOIN film_category t4
ON ( t4.film_id = t1.film_id )
LEFT OUTER JOIN category t5
ON ( t5.category_id = t4.category_id ) ;
For whatever reason, the "money" call to xBestIndex in this case looks like
this:
jmpvtab BEST INDEX: Table: inventory nConstraints: 3
CONST[0]: 1 (film_id) = Usable
CONST[1]: 1 (film_id) = Usable
CONST[2]: 0 (inventory_id) = Usable
The order of the constraints is different! So, using the "first constraint"
heuristic, I commit to indexing based on film_id, but indexing inventory on
film_id not helpful for this query. SQLite sees that inventory is indexed on
film_id and decides to use table scan for inventory, and it's game over.
If SQLite calls to xBestIndex do not in some way convey which constraints
matter and which ones don't, I don't see how I can use virtual tables.
Here is the WhereTrace, in case it might help:
*** Optimizer Start ***
add: * 0.01.00 sqlite_master f 00100 N 0 cost 0,216,200
0 0.01.00 sqlite_master f 00100 N 0 cost 0,216,200
---- begin solver. (nRowEst=0)
New 0 cost=216,200 order=0
---- after round 0 ----
0 cost=216 nrow=200 order=0
---- begin solver. (nRowEst=201)
New 0 cost=216,200 order=1
---- after round 0 ----
0 cost=216 nrow=200 order=1 rev=0x0
---- Solution nRow=200 ORDERBY=1,0x0
0 0.01.00 sqlite_master f 00100 N 0 cost 0,216,200
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
add: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
skip: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
add: * 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
0 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
---- begin solver. (nRowEst=0)
New 0 cost=271, 43 order=0
Update 1 cost=216,180 order=0 was 0 cost=271, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=0
---- begin solver. (nRowEst=181)
---- sort cost=239 (1/1) increases cost 271 to 272
New 0 cost=272, 43 order=0
Update 1 cost=216,180 order=1 was 0 cost=272, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=1 rev=0x0
---- Solution nRow=180 ORDERBY=1,0x0
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
add: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
skip: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
add: * 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
0 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
---- begin solver. (nRowEst=0)
New 0 cost=271, 43 order=0
Update 1 cost=216,180 order=0 was 0 cost=271, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=0
---- begin solver. (nRowEst=181)
---- sort cost=239 (1/1) increases cost 271 to 272
New 0 cost=272, 43 order=0
Update 1 cost=216,180 order=1 was 0 cost=272, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=1 rev=0x0
---- Solution nRow=180 ORDERBY=1,0x0
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
add: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
skip: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
add: * 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
0 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
---- begin solver. (nRowEst=0)
New 0 cost=271, 43 order=0
Update 1 cost=216,180 order=0 was 0 cost=271, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=0
---- begin solver. (nRowEst=181)
---- sort cost=239 (1/1) increases cost 271 to 272
New 0 cost=272, 43 order=0
Update 1 cost=216,180 order=1 was 0 cost=272, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=1 rev=0x0
---- Solution nRow=180 ORDERBY=1,0x0
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
add: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
skip: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
add: * 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
0 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
---- begin solver. (nRowEst=0)
New 0 cost=271, 43 order=0
Update 1 cost=216,180 order=0 was 0 cost=271, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=0
---- begin solver. (nRowEst=181)
---- sort cost=239 (1/1) increases cost 271 to 272
New 0 cost=272, 43 order=0
Update 1 cost=216,180 order=1 was 0 cost=272, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=1 rev=0x0
---- Solution nRow=180 ORDERBY=1,0x0
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
add: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
skip: * 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
add: * 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
0 0.01.00 sqlite_master f 04000 N 1 cost 271,53,43
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
---- begin solver. (nRowEst=0)
New 0 cost=271, 43 order=0
Update 1 cost=216,180 order=0 was 0 cost=271, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=0
---- begin solver. (nRowEst=181)
---- sort cost=239 (1/1) increases cost 271 to 272
New 0 cost=272, 43 order=0
Update 1 cost=216,180 order=1 was 0 cost=272, 43 order=0
---- after round 0 ----
1 cost=216 nrow=180 order=1 rev=0x0
---- Solution nRow=180 ORDERBY=1,0x0
1 0.01.00 sqlite_master f 00100 N 0 cost 0,216,180
*** Optimizer Finished ***
*** Optimizer Start ***
---- Solution nRow=1
0 0.01.00 sqlite_master f 01101 N 1 cost 0,33,1
*** Optimizer Finished ***
*** Optimizer Start ***
constraint[0]: col=2 termid=0 op=2 usabled=0
usage[0]: argvIdx=0 omit=0
idxNum=-999
idxStr=
orderByConsumed=0
estimatedCost=2.5741e+08
estimatedRows=16044
add: * 0.01.00 t2 (-999,0) f 00400 N 0 cost 0,279,139
constraint[0]: col=2 termid=0 op=2 usabled=1
usage[0]: argvIdx=1 omit=1
idxNum=0
idxStr=
orderByConsumed=0
estimatedCost=67470
estimatedRows=4
add: * 0.01.03 t2 (0,1) f 00400 N 1 cost 0,160,20
constraint[0]: col=1 termid=5 op=2 usabled=0
constraint[1]: col=1 termid=6 op=2 usabled=0
constraint[2]: col=0 termid=7 op=2 usabled=0
usage[0]: argvIdx=0 omit=0
usage[1]: argvIdx=0 omit=0
usage[2]: argvIdx=0 omit=0
idxNum=-999
idxStr=
orderByConsumed=0
estimatedCost=2.09856e+07
estimatedRows=4581
add: * 1.02.01 t1 (-999,0) f 00400 N 0 cost 0,243,120
constraint[0]: col=1 termid=5 op=2 usabled=1
constraint[1]: col=1 termid=6 op=2 usabled=1
constraint[2]: col=0 termid=7 op=2 usabled=1
usage[0]: argvIdx=1 omit=1
usage[1]: argvIdx=0 omit=0
usage[2]: argvIdx=0 omit=0
idxNum=0
idxStr=
orderByConsumed=0
estimatedCost=16770.9
estimatedRows=3
add: * 1.02.0f t1 (0,1) f 00400 N 1 cost 0,140,16
constraint[0]: col=0 termid=1 op=2 usabled=0
usage[0]: argvIdx=0 omit=0
idxNum=-999
idxStr=
orderByConsumed=0
estimatedCost=1e+06
estimatedRows=1000
add: * 2.04.03 t3 (-999,0) f 00400 N 0 cost 0,199,99
constraint[0]: col=0 termid=1 op=2 usabled=1
usage[0]: argvIdx=1 omit=1
idxNum=0
idxStr=
orderByConsumed=0
estimatedCost=3000
estimatedRows=3
replace: * 2.04.03 t3 (-999,0) f 00400 N 0 cost 0,199,99
add: * 2.04.03 t3 (0,1) f 00400 N 1 cost 0,115,16
constraint[0]: col=0 termid=2 op=2 usabled=0
constraint[1]: col=1 termid=4 op=2 usabled=0
usage[0]: argvIdx=0 omit=0
usage[1]: argvIdx=0 omit=0
idxNum=-999
idxStr=
orderByConsumed=0
estimatedCost=1e+06
estimatedRows=1000
add: * 3.08.07 t4 (-999,0) f 00400 N 0 cost 0,199,99
constraint[0]: col=0 termid=2 op=2 usabled=1
constraint[1]: col=1 termid=4 op=2 usabled=1
usage[0]: argvIdx=1 omit=1
usage[1]: argvIdx=0 omit=0
idxNum=0
idxStr=
orderByConsumed=0
estimatedCost=3000
estimatedRows=3
replace: * 3.08.07 t4 (-999,0) f 00400 N 0 cost 0,199,99
add: * 3.08.07 t4 (0,1) f 00400 N 1 cost 0,115,16
constraint[0]: col=0 termid=3 op=2 usabled=0
usage[0]: argvIdx=0 omit=0
idxNum=-999
idxStr=
orderByConsumed=0
estimatedCost=256
estimatedRows=16
add: * 4.10.0f t5 (-999,0) f 00400 N 0 cost 0,80,40
constraint[0]: col=0 termid=3 op=2 usabled=1
usage[0]: argvIdx=1 omit=1
idxNum=0
idxStr=
orderByConsumed=0
estimatedCost=19.2659
estimatedRows=1
replace: * 4.10.0f t5 (-999,0) f 00400 N 0 cost 0,80,40
add: * 4.10.0f t5 (0,1) f 00400 N 1 cost 0,42,0
0 0.01.00 t2 (-999,0) f 00400 N 0 cost 0,279,139
1 0.01.03 t2 (0,1) f 00400 N 1 cost 0,160,20
2 1.02.01 t1 (-999,0) f 00400 N 0 cost 0,243,120
3 1.02.0f t1 (0,1) f 00400 N 1 cost 0,140,16
4 2.04.03 t3 (0,1) f 00400 N 1 cost 0,115,16
5 3.08.07 t4 (0,1) f 00400 N 1 cost 0,115,16
6 4.10.0f t5 (0,1) f 00400 N 1 cost 0,42,0
---- begin solver. (nRowEst=0)
New 0 cost=279,139 order=0
---- after round 0 ----
0 cost=279 nrow=139 order=0
New 02 cost=382,259 order=0
---- after round 1 ----
02 cost=382 nrow=259 order=0
New 024 cost=389,275 order=0
---- after round 2 ----
024 cost=389 nrow=275 order=0
New 0245 cost=400,291 order=0
---- after round 3 ----
0245 cost=400 nrow=291 order=0
New 02456 cost=400,291 order=0
---- after round 4 ----
02456 cost=400 nrow=291 order=0
---- Solution nRow=291
0 0.01.00 t2 (-999,0) f 00400 N 0 cost 0,279,139
2 1.02.01 t1 (-999,0) f 00400 N 0 cost 0,243,120
4 2.04.03 t3 (0,1) f 00400 N 1 cost 0,115,16
5 3.08.07 t4 (0,1) f 00400 N 1 cost 0,115,16
6 4.10.0f t5 (0,1) f 00400 N 1 cost 0,42,0
*** Optimizer Finished ***
Thanks,
Eric