On 2014年5月8日 GMT+08:00PM11:59:33, RSmith <rsm...@rsweb.co.za> wrote: > >On 2014/05/08 14:13, Woody Wu wrote: >> >> Do you mean that my index for this case is already the best choice? >> >> And, as a general question, for a query in the form of >> select max(a) from table where a < InF and b=B and c=C. >> what's the best index in the case of (1) B is an existed one, or (2) >B is not exists ? > >Firstly, Simon's explanation of the best Index to use is to the point >and accurate, so I won't add to that here, but on the last >point I should say that there is no difference between case 1 and case >2.... you seem to be missing the entire point of why your >query performed at different speeds, and I thought Igor explained it >very well... but since I am not busy now, let me try to >elaborate more on why your query worked like it worked. > >Imagine you have a table like this: > >A, B, C >1, 1, 1 >2, 1, 2 >3, 2, 1 >4, 1, 3 >5, 1, 1 >6, 2, 1 >7, 1, 2 > >And now you asked the qyery like this: >SELECT max(a) from table where a<INF AND b=2 and c=1; > >Now the query planner knows that you are looking for a maximum value, >it also knows that A is sorted up, so max(a) will be the last >value in the list for which B and C matches, right? >So the best plan here is to go down from the end of the list, starting >at the last entry (a=7) and see if b=2 and c=1... which it is >not, so it goes to the next lower entry, namely a=6.. and YES, it finds >that here b=2 and c=1 just like the query asked.. so it puts >the results in, and there is no need to continue checking because there >can be no higher max(a) value ever again down the list... > >Now change this query to ask: >SELECT max(a) from table where a<INF AND b=8 and c=1; > >// i.e the value 8 for "b" does not exist in the list > >Now the query planner knows that you are looking for a maximum value >again, it also knows that A is sorted from least to most again, >so max(a) will again be the last value in the list for which B and C >matches. >So the best plan here is to go down from the end of the list, starting >at the last entry (a=7) and see if b=8 and c=1... which it is >not, so it goes to the next lower entry, namely a=6.. and still no >match, so it goes to next lower entry (a=5) and still no match, >so it goes to next lower entry (a=4) etc etc etc until ALL the list is >scanned through taking a LOT of extra time. The problem is >not your index, the problem is that in the one case there IS a value >that qualifies soon in the search cycle, and in the other case >there never is a value that qualifies, so it has to keep looking until >ALL the list has been searched - which simply takes a lot >longer. The problem is not your index... the problem is asking to find >something that does not exist in a very long list. > >Do you understand now why there is no "better" index that will fix your >problem? > >_______________________________________________ >sqlite-users mailing list >sqlite-users@sqlite.org >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Thanks Smith, you explained quite clear in your way. I understood that things has to be that if sqlite could only work like that. What I was wondering is that why sqlite couldn't find out earlier that b=8 c=1 denotes an empty set? If I run something like "select * from table where b=8 and c=1", it could return very quickly. If so, the max(a) on an empty set should alway be Null. Am I thinking right? -Woody Wu _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users