On Tue, Apr 8, 2008 at 2:03 PM, Jay A. Kreibich <[EMAIL PROTECTED]> wrote:
>   The stat table actually hurts in this case, since the stats for both
>   indexes are the same.  This appears to be because two NULL values
>   appear to be considered distinct values, so the analysis system sees
>   column a as being full of unique values.  That makes the "usefulness"
>   weight of idx_a highly inflated.

This may indeed be a bug -- thanks for filing it.  However, even after
manually updating the sqlite_stat1 table with the proper values (as if
the bug were fixed), the planner still did exactly the same thing.

This problem also affects the table ordering of my joins.  Taking the
same schema and adding a "detail" table (see
http://skrul.com/index_detail.sh) , this query chooses idx_a and puts
the foo table first:

sqlite> explain query plan select * from foo join foo_detail on foo.id
= foo_detail.id where (a is null) and data = 1;
0|0|TABLE foo WITH INDEX idx_a
1|1|TABLE foo_detail WITH INDEX idx_foo_detail_data

And it runs very very _very_ slowly.  Forcing sqlite not to use idx_a, you get:

sqlite> explain query plan select * from foo join foo_detail on foo.id
= foo_detail.id where (a is null or a is not null) and data = 1;
0|1|TABLE foo_detail WITH INDEX idx_foo_detail_data
1|0|TABLE foo USING PRIMARY KEY

And it runs brilliantly fast as you'd expect.

For now I've added the "or" hack into my application (thanks, Ken) and
this has sped things up considerably.  Is there any better way to fix
this?

cheers,
-steve
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to