Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-18 Thread Woody Wu
On Friday, 9 May 2014, RSmith  wrote:

>
> On 2014/05/08 18:16, Woody Wu wrote:
>
>> On 2014年5月8日 GMT+08:00PM11:59:33, RSmith  wrote:
>>
>>> On 2014/05/08 14:13, Woody Wu wrote:
>>>
 Do you mean that my index for this case is already the best choice?

 And, as a general question, for a query in the form of
 select max(a) from table where a < InF and b=B and c=C.
 what's the best index in the case of (1) B is an existed one, or (2)

>>> B is not exists ?
>>>
>>> Imagine you have a table like this:
>>>
>>> A, B, C
>>> 1, 1, 1
>>> 2, 1, 2
>>> 3, 2, 1
>>> 4, 1, 3
>>> 5, 1, 1
>>> 6, 2, 1
>>> 7, 1, 2
>>>
>>> etc.
>>>
>>>
>> Thanks Smith, you explained quite clear in your way.  I understood that
>> things has to be that if sqlite could only work like that.
>>
>> What I was wondering is that why sqlite couldn't find out earlier that
>> b=8 c=1 denotes an empty set? If I run something like "select * from table
>> where b=8 and c=1", it could return very quickly. If so, the max(a) on an
>> empty set should alway be Null.  Am I thinking right?
>>
>>
>> -Woody Wu
>>
>
> It makes no sense if there isn't some index defined on b or c or both. If
> there is, then it is the same happy accident accident looked at in a
> different way. The reason that the query returns fast if you "Just" do
> WHERE b=8 AND c=1 is that there is either an automatic or explicit index on
> column b and/or c - but - as soon as you change the query to include
> grouping (i.e: max(a)) then that specific index is no longer the best one
> to use - and you cannot expect SQLite to first run some other arbitrary
> "test" query to see if it has no results, and only THEN try your whole
> query after it "knows" whether the other model query has actual results or
> not. Put another way, you CAN construct an Index that will be VERY fast for
> queries where there definitely is NO value in the list, consider again the
> table from earlier:
>
> A, B, C
> 1, 1, 1
> 2, 1, 2
> 3, 2, 1
> 4, 1, 3
> 5, 1, 1
> 6, 2, 1
> 7, 1, 2
>
> Now add Index (B, C, A) - a covering Index which first lists B and C, then
> A... now if you were to ask of it:
>
> SELECT max(a) from table where a
> It will start by scanning the new Index for values where b=8... and Not
> find any, so very quickly return with no results.
>
> but now let's say you change the query to:
>
> SELECT max(a) from table where a
> It starts (as before) scanning the table (assuming it uses the new index -
> this is not a given) and finds that the very first entry indeed b=1 and
> c=1.. so it checks a first check, max(a) requires we scan all other values to find the max of a,
> at least all values where b=1 and c=1.. of which there might be very very
> many meaning this query is a LOT slower now than NOT using that index (as
> you had it in your original post when you get very fast results).
>
> So the new Index is optimized to be quick when the b value DOESN'T exist,
> but slow when it DOES... and the original index is better to use if the b
> value DOES exist, but slow when it DOESN'T.  Again, there is no best index
> that will satisfy both those requirements. Simon's advice is still best re
> making Indices and then running Analyze to see which queries get used, and
> whether they have good times, then removing the bad ones.
>
> If there is a way you could tell before-hand whether a value exists or
> not, that would be better, and nothing stoppping you from actually running
> two queries every time, first check if it exists, then, if it does, run the
> second query that is faster (as the original). - Actually this might make
> your unique problem into the fastest possible results, but testing is
> needed to verify and it isn't a solution you can short-hand off to the
> Query planner... you need to do two queries.
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>

Simon,

After the explanation, I understand the problem so much better than before.
 I think I would to use the two queries solution.  Thanks a lot!



-- 
Life is the only flaw in an otherwise perfect nonexistence
-- Schopenhauer

narke
public key at http://subkeys.pgp.net:11371 (narkewo...@gmail.com)
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Clemens Ladisch
Woody Wu wrote:
> On 2014年5月8日 GMT+08:00PM9:14:52, Clemens Ladisch  wrote:
>> Woody Wu wrote:
>>> what's the best index in the case of (1) B is an existed one, or (2)
>>> B is not exists ?
>>
>> That depends.  Create two indexes on (b,c,a) and (c,b,a), run ANALYZE,
>> and check which one gets actually used.
>
>  Thanks Clemen, could you please tell me how to use this analyze command for 
> the case?

Just execute "ANAYLZE".

To check which index is used by the query, you have to execute this query with
EXPLAIN QUERY PLAN.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Graham Holden
Out of curiosity, why is the "a < INF" clause (or something like "a < " 
in the original question) there? It doesn't seem to limit the results, and even 
if the overhead of having it there is minimal, does the fact that no one's 
suggested removing it mean it's needed/useful?

SELECT max(a) from table where ahttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread RSmith


On 2014/05/08 18:16, Woody Wu wrote:

On 2014年5月8日 GMT+08:00PM11:59:33, RSmith  wrote:

On 2014/05/08 14:13, Woody Wu wrote:

Do you mean that my index for this case is already the best choice?

And, as a general question, for a query in the form of
select max(a) from table where a < InF and b=B and c=C.
what's the best index in the case of (1) B is an existed one, or (2)

B is not exists ?

Imagine you have a table like this:

A, B, C
1, 1, 1
2, 1, 2
3, 2, 1
4, 1, 3
5, 1, 1
6, 2, 1
7, 1, 2

etc.



Thanks Smith, you explained quite clear in your way.  I understood that things 
has to be that if sqlite could only work like that.

What I was wondering is that why sqlite couldn't find out earlier that b=8 c=1 denotes an 
empty set? If I run something like "select * from table where b=8 and c=1", it 
could return very quickly. If so, the max(a) on an empty set should alway be Null.  Am I 
thinking right?


-Woody Wu


It makes no sense if there isn't some index defined on b or c or both. If there is, then it is the same happy accident accident 
looked at in a different way. The reason that the query returns fast if you "Just" do WHERE b=8 AND c=1 is that there is either an 
automatic or explicit index on column b and/or c - but - as soon as you change the query to include grouping (i.e: max(a)) then that 
specific index is no longer the best one to use - and you cannot expect SQLite to first run some other arbitrary "test" query to see 
if it has no results, and only THEN try your whole query after it "knows" whether the other model query has actual results or not. 
Put another way, you CAN construct an Index that will be VERY fast for queries where there definitely is NO value in the list, 
consider again the table from earlier:


A, B, C
1, 1, 1
2, 1, 2
3, 2, 1
4, 1, 3
5, 1, 1
6, 2, 1
7, 1, 2

Now add Index (B, C, A) - a covering Index which first lists B and C, then A... 
now if you were to ask of it:

SELECT max(a) from table where aIt starts (as before) scanning the table (assuming it uses the new index - this is not a given) and finds that the very first entry 
indeed b=1 and c=1.. so it checks aother values to find the max of a, at least all values where b=1 and c=1.. of which there might be very very many meaning this query 
is a LOT slower now than NOT using that index (as you had it in your original post when you get very fast results).


So the new Index is optimized to be quick when the b value DOESN'T exist, but slow when it DOES... and the original index is better 
to use if the b value DOES exist, but slow when it DOESN'T.  Again, there is no best index that will satisfy both those 
requirements. Simon's advice is still best re making Indices and then running Analyze to see which queries get used, and whether 
they have good times, then removing the bad ones.


If there is a way you could tell before-hand whether a value exists or not, that would be better, and nothing stoppping you from 
actually running two queries every time, first check if it exists, then, if it does, run the second query that is faster (as the 
original). - Actually this might make your unique problem into the fastest possible results, but testing is needed to verify and it 
isn't a solution you can short-hand off to the Query planner... you need to do two queries.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Igor Tandetnik

On 5/8/2014 12:16 PM, Woody Wu wrote:

What I was wondering is that why sqlite couldn't find out earlier that b=8 c=1 denotes an 
empty set? If I run something like "select * from table where b=8 and c=1", it 
could return very quickly.


Are you sure? An index on (a, b, c) wouldn't help with such a query at 
all; it would require a full table scan. Do you have another index for 
this table, on (b) or (c) or (b, c) or (c, b)?

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Woody Wu
On 2014年5月8日 GMT+08:00PM11:59:33, RSmith  wrote:
>
>On 2014/05/08 14:13, Woody Wu wrote:
>>
>> Do you mean that my index for this case is already the best choice?
>>
>> And, as a general question, for a query in the form of
>>select max(a) from table where a < InF and b=B and c=C.
>> what's the best index in the case of (1) B is an existed one, or (2)
>B is not exists ?
>
>Firstly, Simon's explanation of the best Index to use is to the point
>and accurate, so I won't add to that here, but on the last 
>point I should say that there is no difference between case 1 and case
>2 you seem to be missing the entire point of why your 
>query performed at different speeds, and I thought Igor explained it
>very well... but since I am not busy now, let me try to 
>elaborate more on why your query worked like it worked.
>
>Imagine you have a table like this:
>
>A, B, C
>1, 1, 1
>2, 1, 2
>3, 2, 1
>4, 1, 3
>5, 1, 1
>6, 2, 1
>7, 1, 2
>
>And now you asked the qyery like this:
>SELECT max(a) from table where a
>Now the query planner knows that you are looking for a maximum value,
>it also knows that A is sorted up, so max(a) will be the last 
>value in the list for which B and C matches, right?
>So the best plan here is to go down from the end of the list, starting
>at the last entry (a=7) and see if b=2 and c=1... which it is 
>not, so it goes to the next lower entry, namely a=6.. and YES, it finds
>that here b=2 and c=1 just like the query asked.. so it puts 
>the results in, and there is no need to continue checking because there
>can be no higher max(a) value ever again down the list...
>
>Now change this query to ask:
>SELECT max(a) from table where a
>// i.e the value 8 for "b" does not exist in the list
>
>Now the query planner knows that you are looking for a maximum value
>again, it also knows that A is sorted from least to most again, 
>so max(a) will again be the last value in the list for which B and C
>matches.
>So the best plan here is to go down from the end of the list, starting
>at the last entry (a=7) and see if b=8 and c=1... which it is 
>not, so it goes to the next lower entry, namely a=6.. and still no
>match, so it goes to next lower entry (a=5) and still no match, 
>so it goes to next lower entry (a=4) etc etc etc until ALL the list is
>scanned through taking a LOT of extra time.  The problem is 
>not your index, the problem is that in the one case there IS a value
>that qualifies soon in the search cycle, and in the other case 
>there never is a value that qualifies, so it has to keep looking until
>ALL the list has been searched - which simply takes a lot 
>longer. The problem is not your index... the problem is asking to find
>something that does not exist in a very long list.
>
>Do you understand now why there is no "better" index that will fix your
>problem?
>
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Thanks Smith, you explained quite clear in your way.  I understood that things 
has to be that if sqlite could only work like that.

What I was wondering is that why sqlite couldn't find out earlier that b=8 c=1 
denotes an empty set? If I run something like "select * from table where b=8 
and c=1", it could return very quickly. If so, the max(a) on an empty set 
should alway be Null.  Am I thinking right?


-Woody Wu
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Woody Wu
On 2014年5月8日 GMT+08:00PM11:59:33, RSmith  wrote:
>
>On 2014/05/08 14:13, Woody Wu wrote:
>>
>> Do you mean that my index for this case is already the best choice?
>>
>> And, as a general question, for a query in the form of
>>select max(a) from table where a < InF and b=B and c=C.
>> what's the best index in the case of (1) B is an existed one, or (2)
>B is not exists ?
>
>Firstly, Simon's explanation of the best Index to use is to the point
>and accurate, so I won't add to that here, but on the last 
>point I should say that there is no difference between case 1 and case
>2 you seem to be missing the entire point of why your 
>query performed at different speeds, and I thought Igor explained it
>very well... but since I am not busy now, let me try to 
>elaborate more on why your query worked like it worked.
>
>Imagine you have a table like this:
>
>A, B, C
>1, 1, 1
>2, 1, 2
>3, 2, 1
>4, 1, 3
>5, 1, 1
>6, 2, 1
>7, 1, 2
>
>And now you asked the qyery like this:
>SELECT max(a) from table where a
>Now the query planner knows that you are looking for a maximum value,
>it also knows that A is sorted up, so max(a) will be the last 
>value in the list for which B and C matches, right?
>So the best plan here is to go down from the end of the list, starting
>at the last entry (a=7) and see if b=2 and c=1... which it is 
>not, so it goes to the next lower entry, namely a=6.. and YES, it finds
>that here b=2 and c=1 just like the query asked.. so it puts 
>the results in, and there is no need to continue checking because there
>can be no higher max(a) value ever again down the list...
>
>Now change this query to ask:
>SELECT max(a) from table where a
>// i.e the value 8 for "b" does not exist in the list
>
>Now the query planner knows that you are looking for a maximum value
>again, it also knows that A is sorted from least to most again, 
>so max(a) will again be the last value in the list for which B and C
>matches.
>So the best plan here is to go down from the end of the list, starting
>at the last entry (a=7) and see if b=8 and c=1... which it is 
>not, so it goes to the next lower entry, namely a=6.. and still no
>match, so it goes to next lower entry (a=5) and still no match, 
>so it goes to next lower entry (a=4) etc etc etc until ALL the list is
>scanned through taking a LOT of extra time.  The problem is 
>not your index, the problem is that in the one case there IS a value
>that qualifies soon in the search cycle, and in the other case 
>there never is a value that qualifies, so it has to keep looking until
>ALL the list has been searched - which simply takes a lot 
>longer. The problem is not your index... the problem is asking to find
>something that does not exist in a very long list.
>
>Do you understand now why there is no "better" index that will fix your
>problem?
>
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Thanks Smith, you explained quite clear in your way.  I understood that things 
has to be that if sqlite could only work like that.

What I was wondering is that why sqlite couldn't find out earlier that b=8 c=1 
denotes an empty set? If I run something like "select * from table where b=8 
and c=1", it could return very quickly. If so, the max(a) on an empty set 
should alway be Null.  Am I thinking right?


-Woody Wu

-Woody Wu
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread RSmith


On 2014/05/08 14:13, Woody Wu wrote:


Do you mean that my index for this case is already the best choice?

And, as a general question, for a query in the form of
   select max(a) from table where a < InF and b=B and c=C.
what's the best index in the case of (1) B is an existed one, or (2) B is not 
exists ?


Firstly, Simon's explanation of the best Index to use is to the point and accurate, so I won't add to that here, but on the last 
point I should say that there is no difference between case 1 and case 2 you seem to be missing the entire point of why your 
query performed at different speeds, and I thought Igor explained it very well... but since I am not busy now, let me try to 
elaborate more on why your query worked like it worked.


Imagine you have a table like this:

A, B, C
1, 1, 1
2, 1, 2
3, 2, 1
4, 1, 3
5, 1, 1
6, 2, 1
7, 1, 2

And now you asked the qyery like this:
SELECT max(a) from table where aNow the query planner knows that you are looking for a maximum value, it also knows that A is sorted up, so max(a) will be the last 
value in the list for which B and C matches, right?
So the best plan here is to go down from the end of the list, starting at the last entry (a=7) and see if b=2 and c=1... which it is 
not, so it goes to the next lower entry, namely a=6.. and YES, it finds that here b=2 and c=1 just like the query asked.. so it puts 
the results in, and there is no need to continue checking because there can be no higher max(a) value ever again down the list...


Now change this query to ask:
SELECT max(a) from table where aNow the query planner knows that you are looking for a maximum value again, it also knows that A is sorted from least to most again, 
so max(a) will again be the last value in the list for which B and C matches.
So the best plan here is to go down from the end of the list, starting at the last entry (a=7) and see if b=8 and c=1... which it is 
not, so it goes to the next lower entry, namely a=6.. and still no match, so it goes to next lower entry (a=5) and still no match, 
so it goes to next lower entry (a=4) etc etc etc until ALL the list is scanned through taking a LOT of extra time.  The problem is 
not your index, the problem is that in the one case there IS a value that qualifies soon in the search cycle, and in the other case 
there never is a value that qualifies, so it has to keep looking until ALL the list has been searched - which simply takes a lot 
longer. The problem is not your index... the problem is asking to find something that does not exist in a very long list.


Do you understand now why there is no "better" index that will fix your problem?

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Woody Wu
On 2014年5月8日 GMT+08:00PM9:14:52, Clemens Ladisch  wrote:
>Woody Wu wrote:
>> On 2014年5月8日 GMT+08:00AM1:51:43, Clemens Ladisch 
>wrote:
>>> Simon Slavin wrote:
 You can let SQLite make a better estimate by using the 'ANALYZE'
 command.
>>>
>>> Yes.  However, even if a better estimate were to show that this
>index is
>>> not very useful, there would exist, at the moment, no better index
>(with
>>> id1 or id2 as the first indexed column).
>>
>> Do you mean that my index for this case is already the best choice?
>
>No.  In a database that has only this index, it is the only choice.
>But you should create a better index.
>
>> And, as a general question, for a query in the form of
>>   select max(a) from table where a < InF and b=B and c=C.
>> what's the best index in the case of (1) B is an existed one, or (2)
>B is not exists ?
>
>That depends.  Create two indexes on (b,c,a) and (c,b,a), run ANALYZE,
>and check which one gets actually used.
>
>
>Regards,
>Clemens
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


 Thanks Clemen, could you please tell me how to use this analyze command for 
the case? I dont get the point from the manual.

-Woody Wu
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Simon Slavin

On 8 May 2014, at 2:14pm, Clemens Ladisch  wrote:

> Woody Wu wrote:
>> And, as a general question, for a query in the form of
>>  select max(a) from table where a < InF and b=B and c=C.
>> what's the best index in the case of (1) B is an existed one, or (2) B is 
>> not exists ?
> 
> That depends.  Create two indexes on (b,c,a) and (c,b,a), run ANALYZE,
> and check which one gets actually used.

To expand on what Clemens wrote, once you get experienced with SQLite you 
understand better how indexing works and can usually just make up a very good 
index off the top of your head.  But until then this is a typical way of 
working:

1) Create your tables and populate them with realistic data.
2) Create lots of indexes on the columns used in your SELECT command.
3) Run ANALYZE.
4) Use EXPLAIN QUERY PLAN on your SELECT and see which index SQLite decided to 
use.
5) Delete the other indexes.

The results of the ANALYZE command are saved with your database.  It will be 
used by SQLite for any SELECT or 'WHERE' clause.  Unless the character of your 
data changes, or you make or delete indexes, you only need to run it once, 
though you can update it at any time by running ANALYZE again.

The reason I stated you need realistic, not random, data is that the ANALYZE 
command figures out not just how many rows each of your tables has but also how 
'chunky' the values in them are.  There's a difference in the use of an index 
between a table with 1000 rows, each with a different value, and a table with 
1000 rows, each of which is either 'yes' or 'no'.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Clemens Ladisch
Woody Wu wrote:
> On 2014年5月8日 GMT+08:00AM1:51:43, Clemens Ladisch  wrote:
>> Simon Slavin wrote:
>>> You can let SQLite make a better estimate by using the 'ANALYZE'
>>> command.
>>
>> Yes.  However, even if a better estimate were to show that this index is
>> not very useful, there would exist, at the moment, no better index (with
>> id1 or id2 as the first indexed column).
>
> Do you mean that my index for this case is already the best choice?

No.  In a database that has only this index, it is the only choice.
But you should create a better index.

> And, as a general question, for a query in the form of
>   select max(a) from table where a < InF and b=B and c=C.
> what's the best index in the case of (1) B is an existed one, or (2) B is not 
> exists ?

That depends.  Create two indexes on (b,c,a) and (c,b,a), run ANALYZE,
and check which one gets actually used.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-08 Thread Woody Wu
On 2014年5月8日 GMT+08:00AM1:51:43, Clemens Ladisch  wrote:
>Simon Slavin wrote:
>> On 7 May 2014, at 1:29pm, Woody Wu  wrote:
>>> The 'explain query plan' gives same result for the first and the
>second query:
>>>
>>> 0|0|0|SEARCH TAB mp USING COVERING INDEX sqlite_auto_index_mp_1
>(ntimereq>>
>>> BTW: I dont understand what the (~1 rows) mean.
>>
>> Literally "about 1 row".  This is SQLite's estimate of how many rows
>the 'mp' table has.
>
>Actually, the estimate of how many rows will match this search.
>
>SQLite assumes that such a comparison is useful for reducing the number
>of result rows.  In this case, the estimate is quite wrong.
>
>> You can let SQLite make a better estimate by using the 'ANALYZE'
>command.
>
>Yes.  However, even if a better estimate were to show that this index
>is
>not very useful, there would exist, at the moment, no better index
>(with
>id1 or id2 as the first indexed column).
>
>
>Regards,
>Clemens
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Do you mean that my index for this case is already the best choice?

And, as a general question, for a query in the form of 
  select max(a) from table where a < InF and b=B and c=C.
what's the best index in the case of (1) B is an existed one, or (2) B is not 
exists ?


-Woody Wu
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-07 Thread Clemens Ladisch
Simon Slavin wrote:
> On 7 May 2014, at 1:29pm, Woody Wu  wrote:
>> The 'explain query plan' gives same result for the first and the second 
>> query:
>>
>> 0|0|0|SEARCH TAB mp USING COVERING INDEX sqlite_auto_index_mp_1 (ntimereq> (~1 rows)
>>
>> BTW: I dont understand what the (~1 rows) mean.
>
> Literally "about 1 row".  This is SQLite's estimate of how many rows the 'mp' 
> table has.

Actually, the estimate of how many rows will match this search.

SQLite assumes that such a comparison is useful for reducing the number
of result rows.  In this case, the estimate is quite wrong.

> You can let SQLite make a better estimate by using the 'ANALYZE' command.

Yes.  However, even if a better estimate were to show that this index is
not very useful, there would exist, at the moment, no better index (with
id1 or id2 as the first indexed column).


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-07 Thread Simon Slavin

On 7 May 2014, at 1:29pm, Woody Wu  wrote:

> The 'explain query plan' gives same result for the first and the second query:
> 
> 0|0|0|SEARCH TAB mp USING COVERING INDEX sqlite_auto_index_mp_1 (ntimereq (~1 rows)
> 
> BTW: I dont understand what the (~1 rows) mean.

Literally "about 1 row".  This is SQLite's estimate of how many rows the 'mp' 
table has.  You can let SQLite make a better estimate by using the 'ANALYZE' 
command.

It doesn't matter for this particular SELECT but for complicated SELECT 
commands which involve searching more than one table, knowing how long each 
table is and how it's indexed can help SQLite choose a good search strategy.

And now back to someone else who might be able to help you understand your 
original problem.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-07 Thread Woody Wu
On 2014年5月7日 GMT+08:00AM3:16:35, Clemens Ladisch  wrote:
>Woody Wu wrote:
>> The following query statement executed very slow, it took 15 secs on
>my ARM device,
>>
>> 1. select max(time) from mytable where time <  and id1 = k1
>and id2 = n.
>>
>> However, if I replace k1with another value that can be found in the
>table and keep everything unchanged, like below,
>>
>> 2. select max(time) from mytable where where time < 9 and id1
>= k2 and id2 = n.
>>
>> This query run very well, it only took less than 1 second on the same
>system.
>>
>> Could anyone explain this to me?
>
>The database can help with the explanation.
>What is the output of EXPLAIN QUERY PLAN for both queries?
>
>
>Regards,
>Clemens
>___
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


The 'explain query plan' gives same result for the first and the second query:

0|0|0|SEARCH TAB mp USING COVERING INDEX sqlite_auto_index_mp_1 (ntimereqhttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-06 Thread Igor Tandetnik

On 5/6/2014 10:19 AM, Woody Wu wrote:

I observed a strange behavior.  I was operating on a big table, there are 
200,000 records in it. The table has a primary key or unique index, (time, id1, 
id2), all of these indexed columns are integers.

The following query statement executed very slow, it took 15 secs on my ARM 
device,

1. select max(time) from mytable where time <  and id1 = k1 and id2 = n.

where  is a value that large enough that no a record has its time field 
equals to it. k1 is a not existed value that cannot be matched by any id1 
column in the table, while n is a normal value that can be matched in the table 
by a subset of records.

However, if I replace k1with another value that can be found in the table and 
keep everything unchanged, like below,

2. select max(time) from mytable where where time < 9 and id1 = k2 and 
id2 = n.

This query run very well, it only took less than 1 second on the same system.


In this query, only the "time" part of the index helps, but not the 
"id1, id2" part. SQLite scans the index backwards, from highest time 
down, and stops as soon as it finds a record that satisfies the two 
equality checks. In the first case, no such record exists, so every 
single record ends up being looked at. In the second case, apparently a 
matching record is found early enough.

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Very slow when query max(col) on empty set?

2014-05-06 Thread Clemens Ladisch
Woody Wu wrote:
> The following query statement executed very slow, it took 15 secs on my ARM 
> device,
>
> 1. select max(time) from mytable where time <  and id1 = k1 and id2 = 
> n.
>
> However, if I replace k1with another value that can be found in the table and 
> keep everything unchanged, like below,
>
> 2. select max(time) from mytable where where time < 9 and id1 = k2 
> and id2 = n.
>
> This query run very well, it only took less than 1 second on the same system.
>
> Could anyone explain this to me?

The database can help with the explanation.
What is the output of EXPLAIN QUERY PLAN for both queries?


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Very slow when query max(col) on empty set?

2014-05-06 Thread Woody Wu
I observed a strange behavior.  I was operating on a big table, there are 
200,000 records in it. The table has a primary key or unique index, (time, id1, 
id2), all of these indexed columns are integers.

The following query statement executed very slow, it took 15 secs on my ARM 
device,

1. select max(time) from mytable where time <  and id1 = k1 and id2 = n.

where  is a value that large enough that no a record has its time field 
equals to it. k1 is a not existed value that cannot be matched by any id1 
column in the table, while n is a normal value that can be matched in the table 
by a subset of records.

However, if I replace k1with another value that can be found in the table and 
keep everything unchanged, like below,

2. select max(time) from mytable where where time < 9 and id1 = k2 and 
id2 = n.

This query run very well, it only took less than 1 second on the same system.

Also, I tried to replace the very large time value to a small enough one,

3. select max(time) from mytable where time < 0 and id1=k1 and id2=n

This query also run very fast.

Could anyone explain this to me? Thanks in advance.

-Woody Wu
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users