Think about the distribution of your Data.
select count(*) as cnt,kind,computer
from log
group by kind,computer
order by cnt desc
what happens here?
SELECT *
FROM log INDEXED BY idxlog_kind_computer
WHERE kind = 'info' AND computer=1 and id 7070636
LIMIT 100;
there are 3_022_148 identical entries 'info,1' in your index
sqlite has to traverse near all of them and so it is not much help with
binary search.
Drop this index, run ANALYZE and sqlite will use your primary key quite
fast.
Also, if you want your data in reverse order try:
PRAGMA legacy_file_format=0;
CREATE TABLE log
(
id integer primary key DESC autoincrement,
msg text,
created_at int,
kind,
computer,
process,
who
);
Григорий Григоренко schrieb:
Did that. Timings has decreased. As I understand it it's about decreasing
index size (that includes kind column).
To me the problem is still there. If my database will have 10 mln log records
first running query will stuck again :(
I don't understand SQLITE strategy. Let me explain.
Index is an array of index records. They fill pages in database.
Searching with B-Tree index is similar to binary search in ordered array,
isn't it? You pick record in a middle of array subset and compare to
conditional value.
This step let you drop half of index subset from search.
Let's say size of index is 100 Mb and it contains 4 mln index records.
This is 100 Mb / 8 Kb (size of page) ~ 12000 pages.
While doing binary search for 4 mln records we do 22 compares at most (2^22 ~
4 mln).
Assume worst case - all of these comparings use different pages.
So, we need to read 22 pages = 180 Kb.
Surely there's additional data to be read for index. Like some intermediate
nodes in B-Tree.
Let's triple the number of pages, 66 pages = 540 Kb.
But SQLITE reads in this case ~ 50 Mb!!
This leads us to conclusion: index in SQLITE database if scattered and cannot
be jumped directly to N-th element. SQLITE has to read it somehow
consecutively.
And so SQLITE has to read half of index (!) to find matching index record.
Am I getting it right?
25 июля 2011, 19:35 от Black, Michael (IS) michael.bla...@ngc.com:
You need to normalize your kind value.
.pragma cache_size=15000;
drop index idxlog_kind_computer;
create table kind(id integer,kind text);
insert into kind values(1,'debug');
insert into kind values(2,'error');
insert into kind values(3,'info');
insert into kind values(4,'timing');
insert into kind values(5,'warn');
update log set kind=1 where kind='debug';
update log set kind=2 where kind='error';
update log set kind=3 where kind='info';
update log set kind=4 where kind='timing';
update log set kind=5 where kind='warn';
create index idxlog_kind_computer ON log(kind,computer);
Then see how long your first query takes.
Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on
behalf of Григорий Григоренко [grigore...@mail.ru]
Sent: Monday, July 25, 2011 8:45 AM
To: sqlite-users@sqlite.org
Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow
I think I narrowed the problem a bit. Guys, hope I'm not bothering you too
much :)
I've calculated size of index (it is index on log (kind,computer) ) of its
own: dropped index, run VACUUM and re-created index.
Database file increased by 105 Mb (and sqlite3 process counter shows that
there were ~105 Mb written to disk).
This means that index on log(kind, computer) takes 105 Mb of database file
(and whole size of database is 1259 Mb).
Now, I'm running query which is using this index (and is not returning any
data) and monitor that sqlite3 process reads ~50 Mb.
So there are two major problems here.
1) SQLITE has to read about _half of index_ before it can use it (and
understand there are no records matching query).
If cache is enough to hold 50 Mb then on subsuquent queries sqlite process
is not reading at all.
2) SQLITE is reading abnormally slowly during this first-time running query
(waiting for something a lot?).
During index creation I monitored sqlite3 process and it was consuming CPU
at ~20% rate and it's doing I/O at ~10 Mb per second rate.
That's what I call normal load!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users