Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
On Mon, Mar 01, 2010 at 11:44:51PM +0100, Jean-Christophe Deschamps scratched on the wall: > >> So indexes are not used for NOT conditions, as NOT conditions >> generally require a full scan, regardless. Yes, it is a simple >> reverse of a binary test, but the reverse of a specific indexed >> lookup of a known value is a table scan to gather all the unknown >> values. > > I understand how your argument goes, but I still have difficulty buying > it. Why not negate the condition itself and then only proceed? That seems to be mixing concepts a bit. Yes, you can try to use algebra to negate the base of a "NOT..." condition and transform it into an equivalent condition that lacks a secondary negation, like transforming "NOT A > B" into "A <= B". That new condition can them be applied to the query, and the query engine may then choose to use an index, because the condition now defines the set of rows it is looking for, rather than the set of rows it is not looking for. But, I'd argue that's not really the point, for several reasons. First and foremost, the original conversation was about why indexes aren't used with "NOT..." conditions. If you go through this process, by the time you actually pass off the expression to the database engine, you no longer have a "NOT..." condition. You may have an expression that can be transformed (back) into a "NOT..." condition, but that's not the same at the database level. Fine, you say, but shouldn't the query optimizer be able to produce that transformation? So the transform is made between the input level and the database level? Well, I suppose it is possible for simple expressions, but I'd argue that really isn't the business of the query optimizer. At least not the one in SQLite. For one thing, as we've already seen, 3LV and those pesky NULLs make this kind of transformation a lot more complex than it first looks. But more to the point, it is only likely to work for fairly simple expressions. For a large number of expressions a transformed expression simply does not exist. Take, for example, one of the previous expressions you gave: "NOT col1 LIKE 'abc%'". The LIKE operator is defined by the SQL function like(), which offers no inverse function. The optimizations that use (or don't) an index are deep inside the code that implements the like() function. You can't simply flip them, you need to start over. And yes, you might be able to devise a new function based off the default implementation of like(), maybe even one that pays attention to the case_sensitive_like pragma. You might even be able to implement it so that it uses all the same index aware optimizations the default like() does. And what you end up with is an unlike() function that you feed into the database engine as a non-"NOT..." expression, putting us back to the argument that by the time you go through all this, you're not using an index for a "NOT..." expression, you're using it for an equivalent, but different expression. Perhaps more importantly, the transform isn't systematic, but predefined by swapping one function for another that has been pre-defined to have an inverse meaning. And, of course, it can't be a simple inverse at a lower layer either. For an unlike() function to have similar efficiency and use an index when possible, the whole function needs to be re-written at a much deeper level so that similar index aware optimizations can be used and still answer the question of like() or unlike(). That's far from simple, even for one specific and isolated case. Then there's the nasty bit about user-define over-rides for like(), or any expression that happens to use a function. You can't exactly expect everyone that writes a function to also write an inverse function. (Of course, I'll admit that the majority of such functions can't use an index anyways...) So the only question left is why the optimizer doesn't do it for the simple cases. And I'd argue that, "Why don't *you* do it?" It hardly seems worth the very complex code (and code size, and runtime cost to check and test all this, not to mention development time, plus the huge amount of testing to verify it works correctly) just to optimize the extremely simple cases on the order of "NOT col1 > 3." If performance is a big concern the developer should just write different queries (and, very likely, come to appreciate that when NULL handling is important, this isn't as easy as it sounds). It might be nice if the optimizer could do it for us, but I'd guess that the optimizer would only be able to bite off very simple cases that most of us could do in our heads in just a few seconds. Then the only issue is the knowledge and understanding that a "NOT..." condition may be significantly more expensive than an equivalent "base" condition. And that's a point worth rai
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
>I haven't been able to think of how it would preclude using the index, >but I suspect it's more a matter of needing a similar-but-different >codepath to optimize for the NOT case, rather than a simple "invert >this" codepath relying on the existing case. Which is really just >another way of stating Jay's point, perhaps. I understand. Perhaps I'm biaised by using functional languages where such operators as NOT in our case are applied as deep as possible in the process of any evaluation. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
On Mon, Mar 1, 2010 at 3:12 PM, Jean-Christophe Deschamps wrote: >>NULL = 12345 is NULL, NOT NULL is NULL, so subset N is not part of NOT >>(col = 12345). > > You're right of course! (and I was even saying about nulls treated apart) > > But, in your view, that the set can be non-contiguous for > negative/negated conditions would it explain that current code can't > make use of index? I haven't been able to think of how it would preclude using the index, but I suspect it's more a matter of needing a similar-but-different codepath to optimize for the NOT case, rather than a simple "invert this" codepath relying on the existing case. Which is really just another way of stating Jay's point, perhaps. -scott ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
>NULL = 12345 is NULL, NOT NULL is NULL, so subset N is not part of NOT >(col = 12345). You're right of course! (and I was even saying about nulls treated apart) But, in your view, that the set can be non-contiguous for negative/negated conditions would it explain that current code can't make use of index? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
On Mon, Mar 1, 2010 at 2:44 PM, Jean-Christophe Deschamps wrote: > The actual reason for the way NOT works as for now may be due to the > fact that negating a condition may cause the resulting set to be in > fact itself the union of two subsets. > Say the "where" condition K is "col = 12345". We can see the index > split into not less than four subsets: > o) N = {rows with col is null} > o) X = {rows with col < 12345} > o) Y = {rows with col = 12345} > o) Z = {rows with col > 12345} > > For the "positive" condition K, the resulset is Y. > For the negated condition K' (col <> 12345) the resulset is N u X u Z, > made of two (or three ?) distinct blocks of indexed rowids. A possible > explanation is that the current code may be unable to cope with > situations where the resultset is not index-wise contiguous. NULL = 12345 is NULL, NOT NULL is NULL, so subset N is not part of NOT (col = 12345). -scott ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
> So indexes are not used for NOT conditions, as NOT conditions > generally require a full scan, regardless. Yes, it is a simple > reverse of a binary test, but the reverse of a specific indexed > lookup of a known value is a table scan to gather all the unknown > values. Jay, I understand how your argument goes, but I still have difficulty buying it. Why not negate the condition itself and then only proceed? Under any condition K, a table T is the union of the subset X of elements that match the condition K and the subset Y of elements that don't. In other words, Y is the subset whose elements match [not K]. In SQL there is of course the special case for nulls. You're saying that selecting all rows in T that match K is OK for indexing, while selecting rows in T which match the negated condition K' = not K can only be done with a full scan. That is true iff the complementation is done afterwards or on the fly (full scan). My point is that NOT seems to be acting as a complementation operator: select X then full scan and check every row if it belongs to X or not. I'd rather see the negated condition K' (= not K) as the condition used for indexed selection, again when using the 'complementary' condition makes sense. Pavel modified simple example works well here: condition " ... > 3" OK for index search condition " ... <= 3" OK for index search condition "NOT ... > 3" only full scan: why? I still believe that it's possible that many simple conditions be negated and used negated for index search. Now, that it would easy to implement in the current SQLite code is another matter entirely and I never pretended it could be done within minutes. The actual reason for the way NOT works as for now may be due to the fact that negating a condition may cause the resulting set to be in fact itself the union of two subsets. Say the "where" condition K is "col = 12345". We can see the index split into not less than four subsets: o) N = {rows with col is null} o) X = {rows with col < 12345} o) Y = {rows with col = 12345} o) Z = {rows with col > 12345} For the "positive" condition K, the resulset is Y. For the negated condition K' (col <> 12345) the resulset is N u X u Z, made of two (or three ?) distinct blocks of indexed rowids. A possible explanation is that the current code may be unable to cope with situations where the resultset is not index-wise contiguous. The OP request can be written in a number of ways to make use of the index, even: select count(*) from mybigtable where mytextcolumn < '' or mytextcolumn >= ''; so the question is not if SQLite can process equivalent count/querries efficiently or not. I nonetheless think "is not null" is much clearer than the above workaround. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
On Mon, Mar 01, 2010 at 06:03:49PM +0100, Jean-Christophe Deschamps scratched on the wall: > > >I totally disagree with you. Let's say you have 1,000,000 rows and 100 > >of them contain NULL. In this situation selecting NOT NULL will select > >almost all rows which means that using index in this case doesn't give > >any performance boost. So here using full scan for NOT NULL condition > >is better and for NULL condition using index is better. Everything is > >completely legitimate. > The number of rows concerned either way is not the point. Unless you have stats to prove otherwise, the optimizer will generally assume you're trying to select a smaller subset. And SQLite's stats get weird when a lot of NULLs are involved, as it sees NULLs as unique values. That's correct for 3VL, but not so much for calculating the uniqueness of a column. > Also the > NULL condition could be almost anything else. Situations where the 1M > and 100 figures are reversed would invalidate your point, so? No, you'd still need/want a table scan (see below), but for different reasons. > select * from T where col1 like 'abc%' and col2 between 3 and 18; > uses the index on T > > select * from T where col1 NOT like 'abc%' and col2 between 1 and 24; > doesn't, due to the use of NOT. > > That in some particular situation using an index could be better or > worse is again not the point. The point was that NOT is > simply reversing the issue of a binary test, in fine, and that seems > essentially independant of the use of an index for determining . You can't use an index to efficiently find something that "isn't." Indexes are optimized to find specific values. You have to know what you're looking for before you start. If you need to do a full scan, you should just use the regular table structure, not the index. Most indexes have a very high overhead in returning rows. That's why they're only useful if they target a very small percentage of rows (typically less than 15%). In the case of "not" operators, you don't know what you're looking for. In the case of NOTs, you can't use an index to find what you want, only to find what you don't want, so that's not useful. For example, if I hand you a phone book and tell you to find all the phone numbers for everyone with a lastname that *doesn't* start with 'X', you can use the phone book "index" to quickly find those numbers you don't want. But that isn't useful unless you do a full scan to generate the pool of numbers you might want, and then subtract out the numbers you don't want. In other words, you need to do a full data scan. Once that's the case, it is faster to generate the list of return values with a full scan, testing and rejecting as you do the scan. So indexes are not used for NOT conditions, as NOT conditions generally require a full scan, regardless. Yes, it is a simple reverse of a binary test, but the reverse of a specific indexed lookup of a known value is a table scan to gather all the unknown values. > >BTW, when SQLite uses index on "text" field it needs to load full > >values of "text" field anyway. > > Do you mean that > select * from T where rowid = 1; > needs loading the _entire_ index when T has rowids in 1..1000 ? No. But a row is a row and a value is a value. It does not matter if it is a NULL or a TEXT data type. If you want to test a column value, you need to load it. So the question of "Why load a TEXT value to test for NULL?" is an odd one, since you clearly can't test something you don't know. Along with that, SQLite indexes use full data copies. So an index of a column full of TEXT values has a full copy of the TEXT values. So loading is still loading. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
>maybe NOT is implemented the same way as any other >function and so it cannot be optimized using index. That's possible, but other logical operators don't exhibit the same bahavior and will not prevent the use of indexes. That NOT is not being handled at the same _logical_ level than AND and OR is a bit disappointing. >Better example will be "NOT int_val < 3" versus "int_val >= 3". You're right, that just got "out of the keyboard" by itself ;-) >No, I didn't mean that. Apparently we have different understanding of >words "full values". :) OK then I follow you, but that only needs loading / searching some O(log N) pages from the b-tree, which hold actual column entries (full values). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
> The point was that NOT is > simply reversing the issue of a binary test, in fine, and that seems > essentially independant of the use of an index for determining . I agree with the point that NOT prevents using an index for some reason. I'm not sure but probably that's because of the same issue with functions: maybe NOT is implemented the same way as any other function and so it cannot be optimized using index. But I find your example with like inappropriate because it's much harder to implement NOT LIKE 'abc%' using index than LIKE 'abc%' - it needs double entrance to the b-tree structure. Better example will be "NOT int_val < 3" versus "int_val >= 3". >>BTW, when SQLite uses index on "text" field it needs to load full >>values of "text" field anyway. > > Do you mean that > select * from T where rowid = 1; > needs loading the _entire_ index when T has rowids in 1..1000 ? No, I didn't mean that. Apparently we have different understanding of words "full values". :) Pavel On Mon, Mar 1, 2010 at 12:03 PM, Jean-Christophe Deschamps wrote: > >>I totally disagree with you. Let's say you have 1,000,000 rows and 100 >>of them contain NULL. In this situation selecting NOT NULL will select >>almost all rows which means that using index in this case doesn't give >>any performance boost. So here using full scan for NOT NULL condition >>is better and for NULL condition using index is better. Everything is >>completely legitimate. > > The number of rows concerned either way is not the point. Also the > NULL condition could be almost anything else. Situations where the 1M > and 100 figures are reversed would invalidate your point, so? > > select * from T where col1 like 'abc%' and col2 between 3 and 18; > uses the index on T > > select * from T where col1 NOT like 'abc%' and col2 between 1 and 24; > doesn't, due to the use of NOT. > > That in some particular situation using an index could be better or > worse is again not the point. The point was that NOT is > simply reversing the issue of a binary test, in fine, and that seems > essentially independant of the use of an index for determining . > > >>BTW, when SQLite uses index on "text" field it needs to load full >>values of "text" field anyway. > > Do you mean that > select * from T where rowid = 1; > needs loading the _entire_ index when T has rowids in 1..1000 ? > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
>I totally disagree with you. Let's say you have 1,000,000 rows and 100 >of them contain NULL. In this situation selecting NOT NULL will select >almost all rows which means that using index in this case doesn't give >any performance boost. So here using full scan for NOT NULL condition >is better and for NULL condition using index is better. Everything is >completely legitimate. The number of rows concerned either way is not the point. Also the NULL condition could be almost anything else. Situations where the 1M and 100 figures are reversed would invalidate your point, so? select * from T where col1 like 'abc%' and col2 between 3 and 18; uses the index on T select * from T where col1 NOT like 'abc%' and col2 between 1 and 24; doesn't, due to the use of NOT. That in some particular situation using an index could be better or worse is again not the point. The point was that NOT is simply reversing the issue of a binary test, in fine, and that seems essentially independant of the use of an index for determining . >BTW, when SQLite uses index on "text" field it needs to load full >values of "text" field anyway. Do you mean that select * from T where rowid = 1; needs loading the _entire_ index when T has rowids in 1..1000 ? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
> Anyway, it seems the OP has a point in saying that it would be nice > --and I would say 'natural'-- to have the optimizer enhanced to handle > "NOT " as efficiently as it handles ", provided > such enhancement can be done with only little changes. I totally disagree with you. Let's say you have 1,000,000 rows and 100 of them contain NULL. In this situation selecting NOT NULL will select almost all rows which means that using index in this case doesn't give any performance boost. So here using full scan for NOT NULL condition is better and for NULL condition using index is better. Everything is completely legitimate. The only bad thing here is that functions like max(), min() and count() are not special functions in SQLite like it is in some other RDBMS. Because of this fact SQLite core doesn't know that for count() it doesn't need the full row and if the table contains a lot of other fields along with "text" then it will be indeed faster to use index on "text" even for full scan as in example above. It's a historical feature in SQLite and I doubt it can be easily changed anywhere in near future. BTW, when SQLite uses index on "text" field it needs to load full values of "text" field anyway. So maybe OP meant something else? Pavel On Mon, Mar 1, 2010 at 11:24 AM, Jean-Christophe Deschamps wrote: > >> > "SELECT count(*) WHERE NOT text IS NULL" >> > >> > requires that the complete text column is loaded. With a stored LOB >> > this results in crazy performance. >> >>How did you find that? What do you mean by "requires loading of the >>whole text column"? It pretty much can require even loading of text >>columns that shouldn't be counted at all just because one database >>page is the minimum storage entity loaded from disk. > > Hi Pavel, > > I believe the OP was intrigued/upset by the fact that > > SELECT count(*) WHERE NOT text IS NULL; > or (equivalent) > SELECT count(*) WHERE text IS NOT NULL > > does not use an index in the text column, while > > SELECT count(*) WHERE text IS NULL > > does use the index. > > I've shown a (trivial) way to achieve the same 'not null' count using > the index. > > Anyway, it seems the OP has a point in saying that it would be nice > --and I would say 'natural'-- to have the optimizer enhanced to handle > "NOT " as efficiently as it handles ", provided > such enhancement can be done with only little changes. > > The optimizer is smart enough to handle multiple conditions connected > by AND and OR and use index for every condition (when they are > available, of course), but it reverts to full scan for any NOT > , whatever condition is (simple or complex). > > I'm certainly not in a position to dictate how should the optimizer > should evolve but, as a mere user, I feel that situation a little less > than satisfactory. > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
> > "SELECT count(*) WHERE NOT text IS NULL" > > > > requires that the complete text column is loaded. With a stored LOB > > this results in crazy performance. > >How did you find that? What do you mean by "requires loading of the >whole text column"? It pretty much can require even loading of text >columns that shouldn't be counted at all just because one database >page is the minimum storage entity loaded from disk. Hi Pavel, I believe the OP was intrigued/upset by the fact that SELECT count(*) WHERE NOT text IS NULL; or (equivalent) SELECT count(*) WHERE text IS NOT NULL does not use an index in the text column, while SELECT count(*) WHERE text IS NULL does use the index. I've shown a (trivial) way to achieve the same 'not null' count using the index. Anyway, it seems the OP has a point in saying that it would be nice --and I would say 'natural'-- to have the optimizer enhanced to handle "NOT " as efficiently as it handles ", provided such enhancement can be done with only little changes. The optimizer is smart enough to handle multiple conditions connected by AND and OR and use index for every condition (when they are available, of course), but it reverts to full scan for any NOT , whatever condition is (simple or complex). I'm certainly not in a position to dictate how should the optimizer should evolve but, as a mere user, I feel that situation a little less than satisfactory. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
> "SELECT count(*) WHERE NOT text IS NULL" > > requires that the complete text column is loaded. With a stored LOB > this results in crazy performance. How did you find that? What do you mean by "requires loading of the whole text column"? It pretty much can require even loading of text columns that shouldn't be counted at all just because one database page is the minimum storage entity loaded from disk. Pavel On Sat, Feb 27, 2010 at 6:25 PM, Lothar Scholz wrote: > It is driving me crazy. I'm working on a web spider where a table holds the > downloaded > webpage. It seems that a select > > "SELECT count(*) WHERE NOT text IS NULL" > > requires that the complete text column is loaded. With a stored LOB > this results in crazy performance. > > Is this optimized in later versions of SQLite (i'm using a 3.4.x which is > about 2 years old). > > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
On Sun, Feb 28, 2010 at 01:26:20AM +0100, Kees Nuyt scratched on the wall: > On Sun, 28 Feb 2010 00:44:00 +0100, "Artur Reilin" > wrote: > > > "Select count(id) from table where text!='' or > > text!=NULL" is the same, right? > > text!=NULL is not a valid expression. Sure it is. 8-) It just isn't a very useful expression: it will *always* return NULL. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
>It is driving me crazy. I'm working on a web spider where a table >holds the downloaded >webpage. It seems that a select > >"SELECT count(*) WHERE NOT text IS NULL" > >requires that the complete text column is loaded. With a stored LOB >this results in crazy performance. > >Is this optimized in later versions of SQLite (i'm using a 3.4.x which is >about 2 years old). select (select count(*) from mybigtable) - (select count(*) from mybigtable where mytextcolumn is null) as "Non-NULL record count"; This uses only indexes, but explain query plan gives no answer on the first select. Anyway, workaround like this to check that the second part actually uses your index (Adapt to minimum rowid you're using if ever you force negative rowids in). explain query plan select (select count(*) from mybigtable where rowid > 0) - (select count(*) from mybigtable where mytextcolumn is null) as "Non-NULL record count"; TABLE mybigtable USING PRIMARY KEY TABLE mybigtable USING idxMyTextColumn BTW, what is "irreal" in SQLite? Its cost or its support? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
On Sun, 28 Feb 2010 00:44:00 +0100, "Artur Reilin" wrote: > "Select count(id) from table where text!='' or > text!=NULL" is the same, right? text!=NULL is not a valid expression. -- ( Kees Nuyt ) c[_] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
Hello Artur, Sunday, February 28, 2010, 12:44:00 AM, you wrote: AR> "Select count(id) from table where text!='' or text!=NULL" is the same, AR> right? Using text!='' obviously needs to load the text column. I just tried it nevertheless and aborted after 5 min (brute force reread of the whole file would be by the way much much faster then the 5min for a 10 GB file). AR> As i see you want to count all rows, right? I don't know if it is faster AR> to only count one column instead of *, but I using it the one-column-count AR> way. Perhaps this helps too? I never thought about this and always considered it syntax-sugar for the count function. And it seems i'm right, no change either. So it's maybe time to say byebye to SQLite and move on to a real database system. -- Best regards, Lothar Scholzmailto:sch...@scriptolutions.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
"Select count(id) from table where text!='' or text!=NULL" is the same, right? As i see you want to count all rows, right? I don't know if it is faster to only count one column instead of *, but I using it the one-column-count way. Perhaps this helps too? Artur --- Am 28.02.2010, 00:25 Uhr, schrieb Lothar Scholz : > It is driving me crazy. I'm working on a web spider where a table holds > the downloaded > webpage. It seems that a select > > "SELECT count(*) WHERE NOT text IS NULL" > > requires that the complete text column is loaded. With a stored LOB > this results in crazy performance. > > Is this optimized in later versions of SQLite (i'm using a 3.4.x which is > about 2 years old). > > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Is there an optimization for "IS NULL" operator in a SELECT query ?
It is driving me crazy. I'm working on a web spider where a table holds the downloaded webpage. It seems that a select "SELECT count(*) WHERE NOT text IS NULL" requires that the complete text column is loaded. With a stored LOB this results in crazy performance. Is this optimized in later versions of SQLite (i'm using a 3.4.x which is about 2 years old). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users