Re: [sqlite] SELECT query first run is VERY slow

reseok Tue, 26 Jul 2011 05:42:48 -0700

Think about the distribution of your Data.

select count(*) as cnt,kind,computer
from log
group by kind,computer
order by cnt desc


what happens here?

SELECT *
 FROM log INDEXED BY idxlog_kind_computer
 WHERE kind = 'info' AND computer=1 and id > 7070636
 LIMIT 100;

there are 3_022_148 identical entries 'info,1' in your index

sqlite has to traverse near all of them and so it is not much help with
binary search.
Drop this index, run ANALYZE and sqlite will use your primary key quite
fast.

Also, if you want your data in reverse order try:

PRAGMA legacy_file_format=0;
CREATE TABLE log
(
    id integer primary key DESC autoincrement,
    msg text,
    created_at int,
    kind,
    computer,
    process,
    who
);




Григорий Григоренко schrieb:
> Did that. Timings has decreased.  As I understand it it's about decreasing 
> index size (that includes kind column).
> 
> 
> 
> To me the problem is still there. If my database will have 10 mln log records 
> first running query will stuck again :(
> 
> 
> 
> I don't understand SQLITE strategy. Let me explain.
> 
> Index is an array of index records. They fill pages in database. 
> 
> Searching with B-Tree index is similar to binary search in ordered array, 
> isn't it? You pick record in a middle of array subset and compare to 
> conditional value.
> 
> This step let you drop half of index subset from search.
> 
> Let's say size of index is 100 Mb and it contains 4 mln index records. 
> 
> This is 100 Mb / 8 Kb (size of page) ~ 12000 pages.
> 
> While doing binary search for 4 mln records we do 22 compares at most (2^22 ~ 
> 4 mln).
> 
> Assume worst case - all of these comparings use different pages. 
> 
> So, we need to read 22 pages = 180 Kb.
> 
> Surely there's additional data to be read for index. Like some intermediate 
> nodes in B-Tree.
> 
> Let's triple the number of pages, 66 pages = 540 Kb.
> 
> 
> But SQLITE reads in this case ~ 50 Mb!! 
> 
> 
> This leads us to conclusion: index in SQLITE database if scattered and cannot 
> be "jumped directly" to N-th element. SQLITE has to read it somehow 
> consecutively. 
> 
> And so SQLITE has to read half of index (!) to find matching index record.
> 
> 
> Am I getting it right?
> 
> 
> 
> 
> 
> 
> 25 июля 2011, 19:35 от "Black, Michael (IS)" <[email protected]>:
>> You need to normalize your "kind" value.
>>
>>
>>
>> .pragma cache_size=15000;
>>
>> drop index idxlog_kind_computer;
>>
>> create table kind(id integer,kind text);
>> insert into kind values(1,'debug');
>> insert into kind values(2,'error');
>> insert into kind values(3,'info');
>> insert into kind values(4,'timing');
>> insert into kind values(5,'warn');
>> update log set kind=1 where kind='debug';
>> update log set kind=2 where kind='error';
>> update log set kind=3 where kind='info';
>> update log set kind=4 where kind='timing';
>> update log set kind=5 where kind='warn';
>> create index idxlog_kind_computer ON log(kind,computer);
>>
>>
>>
>> Then see how long your first query takes.
>>
>>
>>
>> Michael D. Black
>>
>> Senior Scientist
>>
>> NG Information Systems
>>
>> Advanced Analytics Directorate
>>
>>
>>
>> ________________________________
>> From: [email protected] [[email protected]] on 
>> behalf of Григорий Григоренко [[email protected]]
>> Sent: Monday, July 25, 2011 8:45 AM
>> To: [email protected]
>> Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow
>>
>> I think I narrowed the problem a bit.  Guys, hope I'm not bothering you too 
>> much :)
>>
>>
>> I've calculated size of index (it is index on log (kind,computer) ) of its 
>> own: dropped index, run VACUUM and re-created index.
>>
>> Database file increased by 105 Mb (and sqlite3 process counter shows that 
>> there were ~105 Mb written to disk).
>>
>>
>> This means that index on log(kind, computer) takes 105 Mb of database file 
>> (and whole size of database is 1259 Mb).
>>
>>
>> Now, I'm running query which is using this index (and is not returning any 
>> data) and monitor that sqlite3 process reads ~50 Mb.
>>
>>
>> So there are  two major problems here.
>>
>> 1) SQLITE has to read about _half of index_ before it can use it (and 
>> understand there are no records matching query).
>>
>> If cache is enough to hold 50 Mb then on subsuquent queries sqlite process 
>> is not reading at all.
>>
>> 2) SQLITE is reading abnormally slowly during this first-time running query  
>> (waiting for something a lot?).
>>
>> During index creation I monitored sqlite3 process and it was consuming CPU 
>> at ~20% rate and it's doing I/O at ~10 Mb per second rate.
>> That's what I call "normal load"!
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> sqlite-users mailing list
>> [email protected]
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>>
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] SELECT query first run is VERY slow

Reply via email to