> Anyway, it seems the OP has a point in saying that it would be nice > --and I would say 'natural'-- to have the optimizer enhanced to handle > "NOT <condition>" as efficiently as it handles "<condition>, provided > such enhancement can be done with only little changes.
I totally disagree with you. Let's say you have 1,000,000 rows and 100 of them contain NULL. In this situation selecting NOT NULL will select almost all rows which means that using index in this case doesn't give any performance boost. So here using full scan for NOT NULL condition is better and for NULL condition using index is better. Everything is completely legitimate. The only bad thing here is that functions like max(), min() and count() are not special functions in SQLite like it is in some other RDBMS. Because of this fact SQLite core doesn't know that for count() it doesn't need the full row and if the table contains a lot of other fields along with "text" then it will be indeed faster to use index on "text" even for full scan as in example above. It's a historical feature in SQLite and I doubt it can be easily changed anywhere in near future. BTW, when SQLite uses index on "text" field it needs to load full values of "text" field anyway. So maybe OP meant something else? Pavel On Mon, Mar 1, 2010 at 11:24 AM, Jean-Christophe Deschamps <[email protected]> wrote: > >> > "SELECT count(*) WHERE NOT text IS NULL" >> > >> > requires that the complete text column is loaded. With a stored LOB >> > this results in crazy performance. >> >>How did you find that? What do you mean by "requires loading of the >>whole text column"? It pretty much can require even loading of text >>columns that shouldn't be counted at all just because one database >>page is the minimum storage entity loaded from disk. > > Hi Pavel, > > I believe the OP was intrigued/upset by the fact that > > SELECT count(*) WHERE NOT text IS NULL; > or (equivalent) > SELECT count(*) WHERE text IS NOT NULL > > does not use an index in the text column, while > > SELECT count(*) WHERE text IS NULL > > does use the index. > > I've shown a (trivial) way to achieve the same 'not null' count using > the index. > > Anyway, it seems the OP has a point in saying that it would be nice > --and I would say 'natural'-- to have the optimizer enhanced to handle > "NOT <condition>" as efficiently as it handles "<condition>, provided > such enhancement can be done with only little changes. > > The optimizer is smart enough to handle multiple conditions connected > by AND and OR and use index for every condition (when they are > available, of course), but it reverts to full scan for any NOT > <condition>, whatever condition is (simple or complex). > > I'm certainly not in a position to dictate how should the optimizer > should evolve but, as a mere user, I feel that situation a little less > than satisfactory. > > > > _______________________________________________ > sqlite-users mailing list > [email protected] > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list [email protected] http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

