Re: [sqlite] SELECT query #first run# is very slow VOL2

2017-12-20 Thread Simon Slavin


On 19 Dec 2017, at 8:37pm, zakari  wrote:

> pasting some logs, Im declaring again this happening only the first time,
> afterwards working without problem.
> 2017-12-17 15:16:23 - execute
> 2017-12-17 15:17:20 - executed
> 
> 2017-12-19 14:53:35 - execute
> 2017-12-19 14:54:32 - executed
> 
> 2017-12-19 20:14:23 - execute
> 2017-12-19 20:15:20 - executed

I’m missing something here.  All these operations seem to take the same time … 
around a minute.  Do you have figures for the different one(s) ?

The first time you run it, the table is read into memory.  Subsequent searches 
don’t need disk access, so they're faster.

If you need to convince your user that all operations are fast, silently 
execute the SELECT command during startup so that the table is already in cache.

> 1) prepare the statement "select feed_title, feed_url, feed_date from feeds
> where feed_title like :searchTerm or feed_url like :searchTerm  LIMIT :limit
> OFFSET :offset"

There is no way to optimize in SQL for WHERE clause with to LIKE clauses.  The 
entire table must be read.

It may be that full text searching (SQLite FTS4 extension) would speed things 
up, but I don’t use it and have no experience.  Perhaps someone who uses it 
would tell you.  

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SELECT query #first run# is very slow VOL2

2017-12-20 Thread zakari
hi all,

I have exactly the same problem with topic :
http://sqlite.1065341.n5.nabble.com/SELECT-query-first-run-is-VERY-slow-td33100i20.html

--

The dbase sitting on linux server, Im accessing the dbase with PDO object.

-connected
-prepare the statement
-execute
here makes =>1min lag, *only the first time*
-executed

in detail this^ running on a php file, which is the pagination.php...

If I bring the dbase locally on windows machine, *there is no lag*...

I have *no indexes*, the dbase structure is very simple.

CREATE TABLE "feeds" ( "feed_id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
"feed_provider_id" INTEGER, "feed_title" TEXT, "feed_url" TEXT, "feed_date"
TEXT, "feed_timestamp" TEXT )

CREATE TABLE "provider_logs" ( "provider_log_id" INTEGER PRIMARY KEY
AUTOINCREMENT NOT NULL, "provider_id" INTEGER, "feed_type" INTEGER,
"feed_most_recent_feed_date" TEXT, "feed_count" INTEGER, "time2parse" REAL,
"date_rec" TEXT )

CREATE TABLE "providers" ( "provider_id" INTEGER PRIMARY KEY AUTOINCREMENT
NOT NULL, "provider_url" TEXT, "provider_enabled" INTEGER,
"provider_once_per_day" INTEGER, "provider_last_run" TEXT,
"provider_headline" TEXT )

feeds rows = 260k
providers= 43
provider_logs = 0

ofc I tried VACUUM, there is no difference.

pasting some logs, Im declaring again this happening only the first time,
afterwards working without problem.
2017-12-17 15:16:23 - execute
2017-12-17 15:17:20 - executed

2017-12-19 14:53:35 - execute
2017-12-19 14:54:32 - executed

2017-12-19 20:14:23 - execute
2017-12-19 20:15:20 - executed

--

the options through *SQLiteDatabaseBrowser* are like that :
 

--

the flow is :
1) prepare the statement "select feed_title, feed_url, feed_date from feeds
where feed_title like :searchTerm or feed_url like :searchTerm  LIMIT :limit
OFFSET :offset"
2) bindValues
2b) log (ex 2017-12-17 15:16:23 - execute)
3) execute
3b) log (2017-12-17 15:17:20 - executed)


any suggestion ?



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-28 Thread Max Vlasov
On Thu, Jul 28, 2011 at 11:41 AM, Григорий Григоренко wrote:

> So, should we file this as a defect or bug somehow?
>
>
As I understand currently the issues acknowledged and fixed during the
e-mail conversation in this list ( read
http://www.sqlite.org/src/wiki?name=Bug+Reports ). So I'd rather ask Richard
or Dan writing their resolution here about either behavior is by design or
the issue is acknowledged and (maybe) will be fixed.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-28 Thread Григорий Григоренко
So, should we file this as a defect or bug somehow?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-27 Thread Jay A. Kreibich
On Tue, Jul 26, 2011 at 05:13:00PM +, Black, Michael (IS) scratched on the 
wall:
> Part of the problem is it seems you can't create an index with rowid:
> 
> 
> 
> 3.7.5
> 
> sqlite> create table t(i int);
> sqlite> create index idx1 on t(i);
> sqlite> create index idx2 on t(i,rowid);
> Error: table t has no column named rowid
> 
> Any particular reason it can't be included in an index?

  Because it is always included as the last column.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Luuk
On 26-07-2011 10:30, Григорий Григоренко wrote:
>>
>> I thought that, if you have an index on a,b,c,d
>> than you should not have an index on a,b,c too
>> because if you use those 3 field in the where-clause, use can be made of
>> the 4-field index
>>
> 
> I'm not sure. Let me explain.
> 
> I need query to be ORDER BY id DESC. I've dropped this ORDER BY to simplify 
> my case. But in real-life app I need it.
> 
> So, index on (kind, computer) has these index records:
> 
> [ KIND ] [ COMPUTER ] [ ID ]
> ...
> 
> They are ordered by kind, computer and then by id. So ORDER BY id DESC comes 
> free.
> Query planner just need to retrieve records from subset of index records 
> starting at last one backwards.
> 
> Index on (kind,computer, process,who) has these index records:
> 
> [ KIND ] [ COMPUTER ] [ PROCESS ] [ WHO ] [ ID ]
> 
> This time having found subset of index records query planner cannot start 
> retrieving them from the last to first.
> It has to sort them by ID at first. Isn't it?
> 
> 
> Maybe I'm not getting it right?

I'm not sure, but you might have a good point

I was looking at your db, and did this, which did lead me to the
suggestion i made earlier:
C:\TEMP>sqlite3.exe 2011-07-24.dblite
SQLite version 3.7.2
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .schema
CREATE TABLE T(t);
CREATE TABLE item(name text,value text);
CREATE TABLE kind(id integer,kind text);
CREATE TABLE log(id integer primary key autoincrement,msg
text,created_at int,kind,computer,process,who);
CREATE UNIQUE INDEX idxitem_name_value ON item(name,value);
CREATE INDEX idxlog_created_at ON log(created_at);
CREATE INDEX idxlog_kind_computer_process_who ON
log(kind,computer,process,who);
sqlite>
sqlite> .read sql.txt
0|0|TABLE log WITH INDEX idxlog_kind_computer_process_who
2011-07-26 20:13:43.423
2011-07-26 20:13:43.438
sqlite>


sql.txt:
CREATE TABLE IF NOT EXISTS T(t); DELETE FROM T; INSERT INTO T VALUES(
strftime("%Y-%m-%d %H:%M:%f", "now") );

EXPLAIN QUERY PLAN SELECT * FROM log  WHERE kind = 'info' AND computer=1
and id > 7070636 LIMIT 100;

SELECT * FROM log  WHERE kind = 'info' AND computer=1 and id > 7070636
LIMIT 100;

INSERT INTO T VALUES( strftime("%Y-%m-%d %H:%M:%f", "now") );

SELECT * FROM T ;


And, YES, i did do a VACUUM and an ANALYZE before doing the query.

-- 
Luuk

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> 
> >
> > EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and id > 290
> > LIMIT 10;
> >
> > SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 10;
> >
> >
> > Run query. Now using idx2 index SQLITE reads only about 20 Kbytes!
> >
> >
> >
> Grigory, it seems you just added a field and copied rowid column to it, so
> your fast query is equivalent to queries with general fields that works as
> expected (see my post about a,b,c fields). Id that is mapped to rowid is
> still has special treatment and not used effectively. This still has to be
> explained. But if your requirements to your base allows you to add extra
> field, I think you can use it, just support autoincrement feature some other
> way.
> 


Well, as I wrote it's workaround. One can use it to avoid massive reads until 
SQLITE behaviour is fixed.

I absolutely agree with you that this behaviour of SQLITE should be explained. 

Or considered as defect and fixed.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Simon Slavin

On 26 Jul 2011, at 6:13pm, Black, Michael (IS) wrote:

> Part of the problem is it seems you can't create an index with rowid:
> 
> 
> 
> 3.7.5
> 
> sqlite> create table t(i int);
> sqlite> create index idx1 on t(i);
> sqlite> create index idx2 on t(i,rowid);
> Error: table t has no column named rowid
> 
> 
> 
> Any particular reason it can't be included in an index?

Because you didn't define such a column.  SQLite handles the various row 
aliases okay in an expression but not when you DEFINE things.  If you declare a 
column called 'rowid' it'll accept it in a CREATE command, even if it still 
just maps it to its internal 'id' INTEGER AUTOINCREMENT field.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Max Vlasov
On Tue, Jul 26, 2011 at 8:50 PM, Григорий Григоренко wrote:

>
> EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and id > 290
> LIMIT 10;
>
> SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 10;
>
>
> Run query. Now using idx2 index SQLITE reads only about 20 Kbytes!
>
>
>
Grigory, it seems you just added a field and copied rowid column to it, so
your fast query is equivalent to queries with general fields that works as
expected (see my post about a,b,c fields). Id that is mapped to rowid is
still has special treatment and not used effectively. This still has to be
explained. But if your requirements to your base allows you to add extra
field, I think you can use it, just support autoincrement feature some other
way.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Black, Michael (IS)
Part of the problem is it seems you can't create an index with rowid:



3.7.5

sqlite> create table t(i int);
sqlite> create index idx1 on t(i);
sqlite> create index idx2 on t(i,rowid);
Error: table t has no column named rowid



Any particular reason it can't be included in an index?





Michael D. Black

Senior Scientist

NG Information Systems

Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Григорий Григоренко [grigore...@mail.ru]
Sent: Tuesday, July 26, 2011 11:50 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow

I found some kind of workaround to solve this problem.

Create new database and run:


CREATE TABLE foo(bar);

INSERT INTO foo VALUES(null);  INSERT INTO foo VALUES(null);  INSERT INTO foo 
VALUES(null);

BEGIN;

INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo;

COMMIT;

DELETE FROM foo WHERE rowid > 300;

UPDATE foo SET bar = 'one' WHERE rowid <= 100;
UPDATE foo SET bar = 'two' WHERE rowid > 100 AND rowid < 200;
UPDATE foo SET bar = 'three' WHERE rowid > 200;

CREATE INDEX idx1 ON foo(bar);


This creates foo table with 3 mln records: 1 mln with bar = 'one', 1 mln with 
bar = 'two' and 1 mln with bar = 'three'.


Now, the query.

EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and rowid > 290 
LIMIT 10;

SELECT * FROM foo WHERE bar = 'three' and rowid > 290 LIMIT 10;


While running this query SQLITE  reads 18 Mbytes (and it uses idx1 index).

18 MBYTES.


Now, run this:

ALTER TABLE foo ADD COLUMN id; UPDATE foo SET id = rowid; CREATE INDEX idx2 ON 
foo(bar, id);


And update query: change rowid to id:

EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 
10;

SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 10;


Run query. Now using idx2 index SQLITE reads only about 20 Kbytes!


20 KBYTES.




This script proves that using rowid in index is possible and eliminates the 
need of reading lots of data.

But SQLITE is not using it.  I think this behaviour is SQLITE defect.










___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
It's all about caching. 

If OS has already cached index data query executes in less than a second time.

To understand what is actually happening you should monitor reading count of 
SQLITE console process (or your app that is executing).

There shouldn't be reading of more than 1 Mb if SQLITE is using index properly.


26 июля 2011, 20:27 от "Black, Michael (IS)" :
> If I run your sql.txt script with the normalization of kind my first time 
> query shows 0 seconds.
> 
> D:\x>sqlite3 sq1  0|0|0|SEARCH TABLE log USING INDEX idxlog_kind_computer (kind=? AND 
> computer=?) (~406234 rows)
> Seconds elapsed: 0
> 
> 
> 
> Does yours show a longer time than that and/or a different plan?
> 
> 
> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
> 
> 
> 
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Григорий Григоренко [grigore...@mail.ru]
> Sent: Tuesday, July 26, 2011 8:16 AM
> To: General Discussion of SQLite Database
> Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow
> 
> >
> > select count(*) as cnt,kind,computer
> > from log
> > group by kind,computer
> > order by cnt desc
> >
> > what happens here?
> >
> > SELECT *
> >  FROM log INDEXED BY idxlog_kind_computer
> >  WHERE kind = 'info' AND computer=1 and id > 7070636
> >  LIMIT 100;
> >
> > there are 3_022_148 identical entries 'info,1' in your index
> >
> > sqlite has to traverse near all of them and so it is not much help with
> > binary search.
> > Drop this index, run ANALYZE and sqlite will use your primary key quite
> > fast.
> 
> 
> The problem is: choosing primary key index is not always a good choice.
> Sometimes it can lead to scanning half of table records.
> 
> 
> Let's assume we have:
> 
> CREATE TABLE foo(bar);
> CREATE INDEX idx ON foo(bar);
> 
> INSERT INTO foo VALUES('one');
> .. 1 mln inserts on one..
> INSERT INTO foo VALUES('one');
> 
> INSERT INTO foo VALUES('two');
> .. 1 mln inserts of two..
> INSERT INTO foo VALUES('two');
> 
> INSERT INTO foo VALUES('three');
> .. 1 mln inserts of three..
> INSERT INTO foo VALUES('three');
> 
> Now, the query:
> 
> SELECT * FROM foo WHERE bar='three' and id > 1 LIMIT 10;
> 
> If query planner uses primary index it'll have to scan 2 mln records before 
> he hits first one matching query.
> 
> (speaking about "normal" settings, SQLITE scans from lesser rowid to greater)
> 
> And if query planner can somehow use index idx with both supported values { 
> bar:'three', id:1 } there will be no scanning.
> 
> It's binary search and it's what I need in my app.
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
I found some kind of workaround to solve this problem.

Create new database and run:


CREATE TABLE foo(bar);

INSERT INTO foo VALUES(null);  INSERT INTO foo VALUES(null);  INSERT INTO foo 
VALUES(null); 

BEGIN;

INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo; 
INSERT INTO foo SELECT bar FROM foo;
INSERT INTO foo SELECT bar FROM foo; INSERT INTO foo SELECT bar FROM foo;

COMMIT;

DELETE FROM foo WHERE rowid > 300;

UPDATE foo SET bar = 'one' WHERE rowid <= 100;
UPDATE foo SET bar = 'two' WHERE rowid > 100 AND rowid < 200;
UPDATE foo SET bar = 'three' WHERE rowid > 200;

CREATE INDEX idx1 ON foo(bar);


This creates foo table with 3 mln records: 1 mln with bar = 'one', 1 mln with 
bar = 'two' and 1 mln with bar = 'three'.


Now, the query.

EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and rowid > 290 
LIMIT 10;

SELECT * FROM foo WHERE bar = 'three' and rowid > 290 LIMIT 10;


While running this query SQLITE  reads 18 Mbytes (and it uses idx1 index).

18 MBYTES.


Now, run this:

ALTER TABLE foo ADD COLUMN id; UPDATE foo SET id = rowid; CREATE INDEX idx2 ON 
foo(bar, id);


And update query: change rowid to id:

EXPLAIN QUERY PLAN SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 
10;

SELECT * FROM foo WHERE bar = 'three' and id > 290 LIMIT 10;


Run query. Now using idx2 index SQLITE reads only about 20 Kbytes!


20 KBYTES.




This script proves that using rowid in index is possible and eliminates the 
need of reading lots of data.

But SQLITE is not using it.  I think this behaviour is SQLITE defect. 










___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Black, Michael (IS)
If I run your sql.txt script with the normalization of kind my first time query 
shows 0 seconds.

D:\x>sqlite3 sq1 
> select count(*) as cnt,kind,computer
> from log
> group by kind,computer
> order by cnt desc
>
> what happens here?
>
> SELECT *
>  FROM log INDEXED BY idxlog_kind_computer
>  WHERE kind = 'info' AND computer=1 and id > 7070636
>  LIMIT 100;
>
> there are 3_022_148 identical entries 'info,1' in your index
>
> sqlite has to traverse near all of them and so it is not much help with
> binary search.
> Drop this index, run ANALYZE and sqlite will use your primary key quite
> fast.


The problem is: choosing primary key index is not always a good choice.
Sometimes it can lead to scanning half of table records.


Let's assume we have:

CREATE TABLE foo(bar);
CREATE INDEX idx ON foo(bar);

INSERT INTO foo VALUES('one');
.. 1 mln inserts on one..
INSERT INTO foo VALUES('one');

INSERT INTO foo VALUES('two');
.. 1 mln inserts of two..
INSERT INTO foo VALUES('two');

INSERT INTO foo VALUES('three');
.. 1 mln inserts of three..
INSERT INTO foo VALUES('three');

Now, the query:

SELECT * FROM foo WHERE bar='three' and id > 1 LIMIT 10;

If query planner uses primary index it'll have to scan 2 mln records before he 
hits first one matching query.

(speaking about "normal" settings, SQLITE scans from lesser rowid to greater)

And if query planner can somehow use index idx with both supported values { 
bar:'three', id:1 } there will be no scanning.

It's binary search and it's what I need in my app.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Simon Slavin

On 26 Jul 2011, at 1:56pm,  
 wrote:

> How to write a trigger [snip]

Nirmala, please start a new thread with your new query.  Do not intrude into 
another person's thread.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Max Vlasov
2011/7/26 Black, Michael (IS) 

>
> Turns out the if you include the primary key in an index it doesn't use the
> triple index but uses the primary key instead.  And analyze doesn't change
> it.
>
> This is with version 3.7.5
>
>
Not sure about the primary index , because with my data and 3.7.6.2 it still
uses the index,  but visits every entry.

the fast one:
WHERE a=10 and b=20 and c> 100
explain query plan > SEARCH TABLE abctable USING COVERING INDEX idxabc (a=?
AND b=? AND c>?) (~2 rows)

the "slow" one
WHERE a=10 and b=20 and id>100
explain query plan > SEARCH TABLE abctable USING COVERING INDEX idxabid (a=?
AND b=?) (~3 rows)

I assume the optimizer could use the following "plan"
... COVERING INDEX idxabid (a=? AND b=? AND rowid>?)
but it didn't.
Maybe for a reason...

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> 
> select count(*) as cnt,kind,computer
> from log
> group by kind,computer
> order by cnt desc
> 
> what happens here?
> 
> SELECT *
>  FROM log INDEXED BY idxlog_kind_computer
>  WHERE kind = 'info' AND computer=1 and id > 7070636
>  LIMIT 100;
> 
> there are 3_022_148 identical entries 'info,1' in your index
> 
> sqlite has to traverse near all of them and so it is not much help with
> binary search.
> Drop this index, run ANALYZE and sqlite will use your primary key quite
> fast.


The problem is: choosing primary key index is not always a good choice. 
Sometimes it can lead to scanning half of table records.


Let's assume we have:

CREATE TABLE foo(bar);
CREATE INDEX idx ON foo(bar);

INSERT INTO foo VALUES('one');
.. 1 mln inserts on one..
INSERT INTO foo VALUES('one');

INSERT INTO foo VALUES('two');
.. 1 mln inserts of two..
INSERT INTO foo VALUES('two');

INSERT INTO foo VALUES('three');
.. 1 mln inserts of three..
INSERT INTO foo VALUES('three');

Now, the query:

SELECT * FROM foo WHERE bar='three' and id > 1 LIMIT 10;

If query planner uses primary index it'll have to scan 2 mln records before he 
hits first one matching query.

(speaking about "normal" settings, SQLITE scans from lesser rowid to greater)

And if query planner can somehow use index idx with both supported values { 
bar:'three', id:1 } there will be no scanning.

It's binary search and it's what I need in my app.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread narmada.jammula
Hi,

How to write a trigger so that it will log the updates on all tables in 
database into a auditlog tables?.
For Example:
The database contains 3 tables 1)country 2) state 3) auditlog
List of fields on each table

Country table:
Cid name

Sate table:
Sid name countryid

Auditlog table:
Aid TableName GUID ChangedDate DataChanged Action

Tablename: is the name of the table the action(insert/update/delete) has 
performed
Guid: is the primary key value of the table
DataChanged: contains the xml in string format.
Ex:If an operation is performed on Country table
Insert into country(cid, name) values (1, india)
Then the xml format will be: 
1
India>


I need a trigger in such a way that it will log all the actions performed on 
country & state tables into the auditlog table.

Please help me in writing the trigger for the above scenario and also provide 
example if possible


Regards,
Nirmala.

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of  ??
Sent: Tuesday, July 26, 2011 6:04 PM
To: Black, Michael (IS)
Cc: General Discussion of SQLite Database
Subject: Re: [sqlite] SELECT query first run is VERY slow



> Could you post your timings and read stats again?  Are you happy with what 
> you're seeing now?
> 

Actually, there was ~ 50% speed-up. More or less. The idea of normalizing 
worked great, thank you.

I'm concerned about SQLITE indexes. 


> 
> 
> I think when you drop the index and recreate it still scatters.  But re-doing 
> your data with the new kind will tell for sure.  I take it you have to build 
> the index while running and can't wait until it's done.


To me problem is narrowed to the fact that SQLITE is ignoring ROWID value while 
scanning index.

Using "INDEX ON Foo(Bar)"  and condtion like "Bar = XXX and Rowid > YYY" SQLITE 
reads too much.

Looks like in this case SQLITE reads all of index records matching condition 
Bar = XXX and scan them.

And if Bar has only a few allowed values (= bad selectivity ) this takes lot's 
of time and disk i/o cause there are lots of records with the same Bar value.

Looks like SQLITE is not using knowledge that index records are sorted by Bar 
and then by Rowid.


This is something I don't (and didn't) expect...




> 
> 
> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Please do not print this email unless it is absolutely necessary. 

The information contained in this electronic message and any attachments to 
this message are intended for the exclusive use of the addressee(s) and may 
contain proprietary, confidential or privileged information. If you are not the 
intended recipient, you should not disseminate, distribute or copy this e-mail. 
Please notify the sender immediately and destroy all copies of this message and 
any attachments. 

WARNING: Computer viruses can be transmitted via email. The recipient should 
check this email and any attachments for the presence of viruses. The company 
accepts no liability for any damage caused by any virus transmitted by this 
email. 

www.wipro.com
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко


26 июля 2011, 16:42 от res...@googlemail.com:
> Think about the distribution of your Data.
> 
> select count(*) as cnt,kind,computer
> from log
> group by kind,computer
> order by cnt desc
> 
> what happens here?
> 
> SELECT *
>  FROM log INDEXED BY idxlog_kind_computer
>  WHERE kind = 'info' AND computer=1 and id > 7070636
>  LIMIT 100;
> 
> there are 3_022_148 identical entries 'info,1' in your index


If there wasn't id condition in WHERE I would agree. 

But in my opinion this is not that simple.

Each index record includes ROWID. At the end. 

http://www.sqlite.org/queryplanner.html

"An index is another table similar to the original "fruitsforsale" table but 
with the content (the fruit column in this case) stored in front of the rowid 
and with all rows in content order."

http://www.sqlite.org/images/qp/idx1.gif

"The "fruit" column is the primary key used to order the elements of the table 
and the "rowid" is the secondary key used to break the tie when two or more 
rows have the same "fruit". In the example, the rowid has to be used as a 
tie-breaker for the "Orange" rows. Notice that since the rowid is always unique 
over all elements of the original table, the composite key of "fruit" followed 
by "rowid" will be unique over all elements of the index."

And this makes each index record different.

WHERE condition is providing query planner with 3 nice values to match exact 
index record.

And SQLITE ignores this knowledge and uses only first 2 values to search index. 

Why? 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread reseok
Think about the distribution of your Data.

select count(*) as cnt,kind,computer
from log
group by kind,computer
order by cnt desc

what happens here?

SELECT *
 FROM log INDEXED BY idxlog_kind_computer
 WHERE kind = 'info' AND computer=1 and id > 7070636
 LIMIT 100;

there are 3_022_148 identical entries 'info,1' in your index

sqlite has to traverse near all of them and so it is not much help with
binary search.
Drop this index, run ANALYZE and sqlite will use your primary key quite
fast.

Also, if you want your data in reverse order try:

PRAGMA legacy_file_format=0;
CREATE TABLE log
(
id integer primary key DESC autoincrement,
msg text,
created_at int,
kind,
computer,
process,
who
);




Григорий Григоренко schrieb:
> Did that. Timings has decreased.  As I understand it it's about decreasing 
> index size (that includes kind column).
> 
> 
> 
> To me the problem is still there. If my database will have 10 mln log records 
> first running query will stuck again :(
> 
> 
> 
> I don't understand SQLITE strategy. Let me explain.
> 
> Index is an array of index records. They fill pages in database. 
> 
> Searching with B-Tree index is similar to binary search in ordered array, 
> isn't it? You pick record in a middle of array subset and compare to 
> conditional value.
> 
> This step let you drop half of index subset from search.
> 
> Let's say size of index is 100 Mb and it contains 4 mln index records. 
> 
> This is 100 Mb / 8 Kb (size of page) ~ 12000 pages.
> 
> While doing binary search for 4 mln records we do 22 compares at most (2^22 ~ 
> 4 mln).
> 
> Assume worst case - all of these comparings use different pages. 
> 
> So, we need to read 22 pages = 180 Kb.
> 
> Surely there's additional data to be read for index. Like some intermediate 
> nodes in B-Tree.
> 
> Let's triple the number of pages, 66 pages = 540 Kb.
> 
> 
> But SQLITE reads in this case ~ 50 Mb!! 
> 
> 
> This leads us to conclusion: index in SQLITE database if scattered and cannot 
> be "jumped directly" to N-th element. SQLITE has to read it somehow 
> consecutively. 
> 
> And so SQLITE has to read half of index (!) to find matching index record.
> 
> 
> Am I getting it right?
> 
> 
> 
> 
> 
> 
> 25 июля 2011, 19:35 от "Black, Michael (IS)" :
>> You need to normalize your "kind" value.
>>
>>
>>
>> .pragma cache_size=15000;
>>
>> drop index idxlog_kind_computer;
>>
>> create table kind(id integer,kind text);
>> insert into kind values(1,'debug');
>> insert into kind values(2,'error');
>> insert into kind values(3,'info');
>> insert into kind values(4,'timing');
>> insert into kind values(5,'warn');
>> update log set kind=1 where kind='debug';
>> update log set kind=2 where kind='error';
>> update log set kind=3 where kind='info';
>> update log set kind=4 where kind='timing';
>> update log set kind=5 where kind='warn';
>> create index idxlog_kind_computer ON log(kind,computer);
>>
>>
>>
>> Then see how long your first query takes.
>>
>>
>>
>> Michael D. Black
>>
>> Senior Scientist
>>
>> NG Information Systems
>>
>> Advanced Analytics Directorate
>>
>>
>>
>> 
>> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
>> behalf of Григорий Григоренко [grigore...@mail.ru]
>> Sent: Monday, July 25, 2011 8:45 AM
>> To: sqlite-users@sqlite.org
>> Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow
>>
>> I think I narrowed the problem a bit.  Guys, hope I'm not bothering you too 
>> much :)
>>
>>
>> I've calculated size of index (it is index on log (kind,computer) ) of its 
>> own: dropped index, run VACUUM and re-created index.
>>
>> Database file increased by 105 Mb (and sqlite3 process counter shows that 
>> there were ~105 Mb written to disk).
>>
>>
>> This means that index on log(kind, computer) takes 105 Mb of database file 
>> (and whole size of database is 1259 Mb).
>>
>>
>> Now, I'm running query which is using this index (and is not returning any 
>> data) and monitor that sqlite3 process reads ~50 Mb.
>>
>>
>> So there are  two major problems here.
>>
>> 1) SQLITE has to read about _half of index_ before it can use it (and 
>> understand there are no records matching query).
>>
>&g

Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Black, Michael (IS)
That's because they have 2 completely different query plans.

I created the table so that id,a,b,c all had the same values so the indexing 
would be indentical.



#include 
main()
{
  int i;
  for(i=1;i<=10;++i) {
char sql[4096];
sprintf(sql,"insert into abctable(a,b,c) values(%d,%d,%d);",i,i,i);
puts(sql);
  }
}



Turns out the if you include the primary key in an index it doesn't use the 
triple index but uses the primary key instead.  And analyze doesn't change it.

This is with version 3.7.5

But...if you change "c" to be "unique" then it uses an auto_index.



sqlite> pragma cache_size=15;
sqlite> CREATE TABLE [abctable] ([id] INTEGER PRIMARY KEY AUTOINCREMENT, a, b, 
c);
sqlite> begin;
sqlite> .read x.sql
sqlite> commit;
sqlite> create index idxabc on abctable(a,b,c);
sqlite> create index idxabid on abctable(a,b,id);
sqlite> explain query plan select * from abctable where a=10 and b=20 and id=30;
0|0|0|SEARCH TABLE abctable USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
sqlite> explain query plan select * from abctable where a=10 and b=20 and c=30;
0|0|0|SEARCH TABLE abctable USING COVERING INDEX idxabc (a=? AND b=? AND c=?) 
(~8 rows)
sqlite> analyze;
sqlite> select * from sqlite_stat1;
abctable|idxabid|10 1 1 1
abctable|idxabc|10 1 1 1



Re-do with "c unique" in table:

sqlite> explain query plan select * from abctable where a=10 and b=20 and c=30;
0|0|0|SEARCH TABLE abctablsqlite> explain query plan select * from abctable 
where a=10 and b=20 and id>30;
0|0|0|SEARCH TABLE abctable USING COVERING INDEX idxabc (a=? AND b=?) (~3 rows)
sqlite> explain query plan select * from abctable where a=10 and b=20 and c>30;
0|0|0|SEARCH TABLE abctable USING COVERING INDEX idxabc (a=? AND b=? AND c>?) 
(~2 rows)e USING INDEX sqlite_autoindex_abctable_1 (c=?) (~1 rows)

Michael D. Black

Senior Scientist

NG Information Systems

Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Max Vlasov [max.vla...@gmail.com]
Sent: Tuesday, July 26, 2011 6:50 AM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] SELECT query first run is VERY slow

On Tue, Jul 26, 2011 at 2:18 PM, Simon Slavin  wrote:

>
> On 26 Jul 2011, at 11:05am, Григорий Григоренко wrote:
>
> > This leads us to conclusion: index in SQLITE database if scattered and
> cannot be "jumped directly" to N-th element. SQLITE has to read it somehow
> consecutively.
> >
> > And so SQLITE has to read half of index (!) to find matching index
> record.
>
> I don't think it's SQLite itself that's reading half the index.  I think
> it's some part of your operating system that's trying to cache all of your
> database file as SQLite reads lots of different parts of it spread about
> randomly.  Unfortunately I don't see how an OS can reasonably do that since
> it will produce the slow speeds you're complaining about.
>
>
Simon,

actually I narrowed down the problem  (cmiiw).

The simple table

CREATE TABLE [abctable] ([id] INTEGER PRIMARY KEY AUTOINCREMENT, a, b, c)

add many records (100,000)
INSERT INTO abctable (a, b, c) VALUES (10, 20, 30)

Good variant
CREATE INDEX idxabc ON abctable (a, b, c)
SELECT * FROM abctable WHERE a=10 and b=20 and c > 100
Sqlite reads few data (3k actually for 100,000 records) to show empty result

Another variant
CREATE INDEX idxabid ON abctable (a, b, id)
SELECT * FROM abctable WHERE a=10 and b=20 and id > 100

Sqlite reads much (1,7MB)

Checked with 3.7.6.2,
I suppose the latter due to some special meaning of the id/rowid, but I
suppose the second case should work with fewer reads

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко


> Could you post your timings and read stats again?  Are you happy with what 
> you're seeing now?
> 

Actually, there was ~ 50% speed-up. More or less. The idea of normalizing 
worked great, thank you.

I'm concerned about SQLITE indexes. 


> 
> 
> I think when you drop the index and recreate it still scatters.  But re-doing 
> your data with the new kind will tell for sure.  I take it you have to build 
> the index while running and can't wait until it's done.


To me problem is narrowed to the fact that SQLITE is ignoring ROWID value while 
scanning index.

Using "INDEX ON Foo(Bar)"  and condtion like "Bar = XXX and Rowid > YYY" SQLITE 
reads too much.

Looks like in this case SQLITE reads all of index records matching condition 
Bar = XXX and scan them.

And if Bar has only a few allowed values (= bad selectivity ) this takes lot's 
of time and disk i/o cause there are lots of records with the same Bar value.

Looks like SQLITE is not using knowledge that index records are sorted by Bar 
and then by Rowid.


This is something I don't (and didn't) expect...




> 
> 
> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Max Vlasov
On Tue, Jul 26, 2011 at 2:18 PM, Simon Slavin  wrote:

>
> On 26 Jul 2011, at 11:05am, Григорий Григоренко wrote:
>
> > This leads us to conclusion: index in SQLITE database if scattered and
> cannot be "jumped directly" to N-th element. SQLITE has to read it somehow
> consecutively.
> >
> > And so SQLITE has to read half of index (!) to find matching index
> record.
>
> I don't think it's SQLite itself that's reading half the index.  I think
> it's some part of your operating system that's trying to cache all of your
> database file as SQLite reads lots of different parts of it spread about
> randomly.  Unfortunately I don't see how an OS can reasonably do that since
> it will produce the slow speeds you're complaining about.
>
>
Simon,

actually I narrowed down the problem  (cmiiw).

The simple table

CREATE TABLE [abctable] ([id] INTEGER PRIMARY KEY AUTOINCREMENT, a, b, c)

add many records (100,000)
INSERT INTO abctable (a, b, c) VALUES (10, 20, 30)

Good variant
CREATE INDEX idxabc ON abctable (a, b, c)
SELECT * FROM abctable WHERE a=10 and b=20 and c > 100
Sqlite reads few data (3k actually for 100,000 records) to show empty result

Another variant
CREATE INDEX idxabid ON abctable (a, b, id)
SELECT * FROM abctable WHERE a=10 and b=20 and id > 100

Sqlite reads much (1,7MB)

Checked with 3.7.6.2,
I suppose the latter due to some special meaning of the id/rowid, but I
suppose the second case should work with fewer reads

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Simon Slavin

On 26 Jul 2011, at 11:05am, Григорий Григоренко wrote:

> This leads us to conclusion: index in SQLITE database if scattered and cannot 
> be "jumped directly" to N-th element. SQLITE has to read it somehow 
> consecutively. 
> 
> And so SQLITE has to read half of index (!) to find matching index record.

I don't think it's SQLite itself that's reading half the index.  I think it's 
some part of your operating system that's trying to cache all of your database 
file as SQLite reads lots of different parts of it spread about randomly.  
Unfortunately I don't see how an OS can reasonably do that since it will 
produce the slow speeds you're complaining about.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> 
> > I need query to be ORDER BY id DESC. I've dropped this ORDER BY to simplify 
> > my case. But in real-life app I need it.
> > 
> > So, index on (kind, computer) has these index records:
> > 
> > [ KIND ] [ COMPUTER ] [ ID ]
> 
> I don't know that SQLite does an inherent addition of the 'id' column to all 
> INDEXes.  Some index implementations do this because their algorithm requires 
> that all positions in an index are unique, but I don't think SQLite works 
> like that.  I think that if you want 'id' to be part of an index you have to 
> say that in the definition.

SQLITE always stores rowid in every index record. This rowid let SQLITE find 
approriate database record. You can read more here: 
http://www.sqlite.org/queryplanner.html 

> 
> Nevertheless this still doesn't explain why your first run of a query is so 
> much slower than subsequent runs.  SQLite's cache size is set here:
> 
> http://www.sqlite.org/pragma.html#pragma_cache_size
> 
> If the cache size described there doesn't explain the behaviour you're 
> seeing, the problem isn't with SQLite, it's in your application or the OS or 
> your hardware.


The problem is filling the cache. SQLITE wants too much data to execute simple 
query.

Cache size of SQLITE doesn't matter because OS does it's own caching. After 
query has been run for a first time OS caches disk pages.

And subsequent queries immediately gets data from RAM not from disk.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
Did that. Timings has decreased.  As I understand it it's about decreasing 
index size (that includes kind column).



To me the problem is still there. If my database will have 10 mln log records 
first running query will stuck again :(



I don't understand SQLITE strategy. Let me explain.

Index is an array of index records. They fill pages in database. 

Searching with B-Tree index is similar to binary search in ordered array, isn't 
it? You pick record in a middle of array subset and compare to conditional 
value.

This step let you drop half of index subset from search.

Let's say size of index is 100 Mb and it contains 4 mln index records. 

This is 100 Mb / 8 Kb (size of page) ~ 12000 pages.

While doing binary search for 4 mln records we do 22 compares at most (2^22 ~ 4 
mln).

Assume worst case - all of these comparings use different pages. 

So, we need to read 22 pages = 180 Kb.

Surely there's additional data to be read for index. Like some intermediate 
nodes in B-Tree.

Let's triple the number of pages, 66 pages = 540 Kb.


But SQLITE reads in this case ~ 50 Mb!! 


This leads us to conclusion: index in SQLITE database if scattered and cannot 
be "jumped directly" to N-th element. SQLITE has to read it somehow 
consecutively. 

And so SQLITE has to read half of index (!) to find matching index record.


Am I getting it right?






25 июля 2011, 19:35 от "Black, Michael (IS)" :
> You need to normalize your "kind" value.
> 
> 
> 
> .pragma cache_size=15000;
> 
> drop index idxlog_kind_computer;
> 
> create table kind(id integer,kind text);
> insert into kind values(1,'debug');
> insert into kind values(2,'error');
> insert into kind values(3,'info');
> insert into kind values(4,'timing');
> insert into kind values(5,'warn');
> update log set kind=1 where kind='debug';
> update log set kind=2 where kind='error';
> update log set kind=3 where kind='info';
> update log set kind=4 where kind='timing';
> update log set kind=5 where kind='warn';
> create index idxlog_kind_computer ON log(kind,computer);
> 
> 
> 
> Then see how long your first query takes.
> 
> 
> 
> Michael D. Black
> 
> Senior Scientist
> 
> NG Information Systems
> 
> Advanced Analytics Directorate
> 
> 
> 
> ____________
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Григорий Григоренко [grigore...@mail.ru]
> Sent: Monday, July 25, 2011 8:45 AM
> To: sqlite-users@sqlite.org
> Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow
> 
> I think I narrowed the problem a bit.  Guys, hope I'm not bothering you too 
> much :)
> 
> 
> I've calculated size of index (it is index on log (kind,computer) ) of its 
> own: dropped index, run VACUUM and re-created index.
> 
> Database file increased by 105 Mb (and sqlite3 process counter shows that 
> there were ~105 Mb written to disk).
> 
> 
> This means that index on log(kind, computer) takes 105 Mb of database file 
> (and whole size of database is 1259 Mb).
> 
> 
> Now, I'm running query which is using this index (and is not returning any 
> data) and monitor that sqlite3 process reads ~50 Mb.
> 
> 
> So there are  two major problems here.
> 
> 1) SQLITE has to read about _half of index_ before it can use it (and 
> understand there are no records matching query).
> 
> If cache is enough to hold 50 Mb then on subsuquent queries sqlite process is 
> not reading at all.
> 
> 2) SQLITE is reading abnormally slowly during this first-time running query  
> (waiting for something a lot?).
> 
> During index creation I monitored sqlite3 process and it was consuming CPU at 
> ~20% rate and it's doing I/O at ~10 Mb per second rate.
> That's what I call "normal load"!
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Simon Slavin

On 26 Jul 2011, at 9:30am, Григорий Григоренко wrote:

> I need query to be ORDER BY id DESC. I've dropped this ORDER BY to simplify 
> my case. But in real-life app I need it.
> 
> So, index on (kind, computer) has these index records:
> 
> [ KIND ] [ COMPUTER ] [ ID ]

I don't know that SQLite does an inherent addition of the 'id' column to all 
INDEXes.  Some index implementations do this because their algorithm requires 
that all positions in an index are unique, but I don't think SQLite works like 
that.  I think that if you want 'id' to be part of an index you have to say 
that in the definition.

Nevertheless this still doesn't explain why your first run of a query is so 
much slower than subsequent runs.  SQLite's cache size is set here:

http://www.sqlite.org/pragma.html#pragma_cache_size

If the cache size described there doesn't explain the behaviour you're seeing, 
the problem isn't with SQLite, it's in your application or the OS or your 
hardware.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> >
> > 1) SQLITE has to read about _half of index_ before it can use it (and 
> > understand there are no records matching query).
> >
> > If cache is enough to hold 50 Mb then on subsuquent queries sqlite process 
> > is not reading at all.
> >
> 
> Please, post your query. To understand whether sqlite reads too much
> or not it's better to know what exactly you want to select.
> 


This is script I ran:

CREATE TABLE IF NOT EXISTS T(t); DELETE FROM T; INSERT INTO T VALUES( 
strftime('%s', 'now' ) );

EXPLAIN QUERY PLAN SELECT * FROM log  INDEXED BY idxlog_kind_computer WHERE 
kind = 'info' AND computer=1 and id > 7070636 LIMIT 100;

SELECT * FROM log INDEXED BY idxlog_kind_computer WHERE kind = 'info' AND 
computer=1 and id > 7070636 LIMIT 100;

SELECT "Seconds elapsed: " || (strftime('%s', 'now' ) - t) FROM T ;



> 
> > 2) SQLITE is reading abnormally slowly during this first-time running query 
> >  (waiting for something a lot?).
> >
> 
> Is this with the recreated index or still the one that was created
> during the lifetime of your program?
> 


It doesn't matter. Tried it with old and with recreated index, same behaviour.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-26 Thread Григорий Григоренко
> 
> I thought that, if you have an index on a,b,c,d
> than you should not have an index on a,b,c too
> because if you use those 3 field in the where-clause, use can be made of
> the 4-field index
> 

I'm not sure. Let me explain.

I need query to be ORDER BY id DESC. I've dropped this ORDER BY to simplify my 
case. But in real-life app I need it.

So, index on (kind, computer) has these index records:

[ KIND ] [ COMPUTER ] [ ID ]
...

They are ordered by kind, computer and then by id. So ORDER BY id DESC comes 
free.
Query planner just need to retrieve records from subset of index records 
starting at last one backwards.

Index on (kind,computer, process,who) has these index records:

[ KIND ] [ COMPUTER ] [ PROCESS ] [ WHO ] [ ID ]

This time having found subset of index records query planner cannot start 
retrieving them from the last to first.
It has to sort them by ID at first. Isn't it?


Maybe I'm not getting it right?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-25 Thread Max Vlasov
On Mon, Jul 25, 2011 at 5:45 PM, Григорий Григоренко  wrote:
>
> 1) SQLITE has to read about _half of index_ before it can use it (and 
> understand there are no records matching query).
>
> If cache is enough to hold 50 Mb then on subsuquent queries sqlite process is 
> not reading at all.
>

Please, post your query. To understand whether sqlite reads too much
or not it's better to know what exactly you want to select.


> 2) SQLITE is reading abnormally slowly during this first-time running query  
> (waiting for something a lot?).
>

Is this with the recreated index or still the one that was created
during the lifetime of your program?

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-25 Thread Black, Michael (IS)
You need to normalize your "kind" value.



.pragma cache_size=15000;

drop index idxlog_kind_computer;

create table kind(id integer,kind text);
insert into kind values(1,'debug');
insert into kind values(2,'error');
insert into kind values(3,'info');
insert into kind values(4,'timing');
insert into kind values(5,'warn');
update log set kind=1 where kind='debug';
update log set kind=2 where kind='error';
update log set kind=3 where kind='info';
update log set kind=4 where kind='timing';
update log set kind=5 where kind='warn';
create index idxlog_kind_computer ON log(kind,computer);



Then see how long your first query takes.



Michael D. Black

Senior Scientist

NG Information Systems

Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Григорий Григоренко [grigore...@mail.ru]
Sent: Monday, July 25, 2011 8:45 AM
To: sqlite-users@sqlite.org
Subject: EXT :Re: [sqlite]SELECT query first run is VERY slow

I think I narrowed the problem a bit.  Guys, hope I'm not bothering you too 
much :)


I've calculated size of index (it is index on log (kind,computer) ) of its own: 
dropped index, run VACUUM and re-created index.

Database file increased by 105 Mb (and sqlite3 process counter shows that there 
were ~105 Mb written to disk).


This means that index on log(kind, computer) takes 105 Mb of database file (and 
whole size of database is 1259 Mb).


Now, I'm running query which is using this index (and is not returning any 
data) and monitor that sqlite3 process reads ~50 Mb.


So there are  two major problems here.

1) SQLITE has to read about _half of index_ before it can use it (and 
understand there are no records matching query).

If cache is enough to hold 50 Mb then on subsuquent queries sqlite process is 
not reading at all.

2) SQLITE is reading abnormally slowly during this first-time running query  
(waiting for something a lot?).

During index creation I monitored sqlite3 process and it was consuming CPU at 
~20% rate and it's doing I/O at ~10 Mb per second rate.
That's what I call "normal load"!














___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-25 Thread Luuk
On 25-07-2011 15:45, Григорий Григоренко wrote:
> I think I narrowed the problem a bit.  Guys, hope I'm not bothering you too 
> much :)
> 
> 
> I've calculated size of index (it is index on log (kind,computer) ) of its 
> own: dropped index, run VACUUM and re-created index.


I thought that, if you have an index on a,b,c,d
than you should not have an index on a,b,c too
because if you use those 3 field in the where-clause, use can be made of
the 4-field index

Therefore i think you can do:
DROP INDEX idxlog_kind;
DROP INDEX idxlog_kind_computer;
DROP INDEX idxlog_kind_computer_process;


It drops your db-size, and a 'normal' select will make use of the
idxlog_kind_computer_process_who index.


C:\TEMP>sqlite3 2011-07-24.dblite   0run.cmd

C:\TEMP>sqlite3 2011-07-24.dblite   0http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-25 Thread Григорий Григоренко
I think I narrowed the problem a bit.  Guys, hope I'm not bothering you too 
much :)


I've calculated size of index (it is index on log (kind,computer) ) of its own: 
dropped index, run VACUUM and re-created index.

Database file increased by 105 Mb (and sqlite3 process counter shows that there 
were ~105 Mb written to disk).


This means that index on log(kind, computer) takes 105 Mb of database file (and 
whole size of database is 1259 Mb).


Now, I'm running query which is using this index (and is not returning any 
data) and monitor that sqlite3 process reads ~50 Mb. 


So there are  two major problems here.

1) SQLITE has to read about _half of index_ before it can use it (and 
understand there are no records matching query).

If cache is enough to hold 50 Mb then on subsuquent queries sqlite process is 
not reading at all. 

2) SQLITE is reading abnormally slowly during this first-time running query  
(waiting for something a lot?).

During index creation I monitored sqlite3 process and it was consuming CPU at 
~20% rate and it's doing I/O at ~10 Mb per second rate.
That's what I call "normal load"! 














___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-25 Thread Григорий Григоренко
Okay, here are some steps I've done.

1) normalize db;

I've created single table (item). "computer","process" and "who" fields in log 
table became rowid integers pointing to this table. 
"kind" is still a short string.

2) give up covering indexes;

For those not aware (http://www.sqlite.org/queryplanner.html , 1.7 Covering 
Indices) it's an index that has additional columns at it's end ( being selected 
by query).
Covering index eliminate the need to read data from db records. But it 
increases size of index and size of database.

3) use INDEXED BY to suggest index to query planner.

Since indexes are created exactly for specific query.


Software was working during weekend. 

Yesterday's database has 6 mln records and it's only 1.1 Gb in size.

Comparing to 4 mln records and 3.5 Gb size before.

Now, the long running query took 27 seconds and it has read 50 Mb from database 
(compare to 2 minutes and 307 Mb before).

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-24 Thread Григорий Григоренко
ANALYZE lasted for ~ 15 minutes.


24 июля 2011, 17:21 от Tito Ciuro :
> Hi,
> 
> It has worked fairly well with small databases, but I see the problem with 
> medium to large files. Have you tried to run ANALYZE on your database? I'm 
> curious to know how long it takes.
> 
> -- Tito
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-24 Thread Tito Ciuro
Hi,

It has worked fairly well with small databases, but I see the problem with 
medium to large files. Have you tried to run ANALYZE on your database? I'm 
curious to know how long it takes.

-- Tito

On Jul 24, 2011, at 8:26 AM, Григорий Григоренко wrote:

>> 
>> Perhaps my post dated Aug. 19, 2009 will help a little bit:
>> 
>> http://www.cocoabuilder.com/archive/cocoa/242954-core-data-dog-slow-when-using-first-time-after-boot.html
>> 
>> -- Tito
>> 
> 
> Thanks for sharing.  "warming file" is a way to cache whole database as I 
> understand it.
> 
> After everything is cached scattered reading from database runs faster. 
> 
> Unfortunately, in my case base size is ~ 3.5 Gb; it's too big for this 
> strategy.
> 
> Even if I read at 25 Mb/s rate it will took 3500 / 25 = ~140 seconds just to 
> read whole db file.
> 
> And what's more important I've only 2 Gb of RAM. 
> 
> Anyway, thanks for sharing. I guess these cases are similar. 
> 
> 
> To me problem looks like this: 
> 
> SQLITE needs to read  (cache) from db a lot (too much?) while first-time 
> query execution even if a query uses nicely matched index and returns nothing.
> 
> And SQLITE is doing lot's of scattered readings during query execution; not 
> trying to somehow batch read or similar. That's why file caching helps.
> 
> If it's true not sure there's a simple and nice solution. 
> 
> I'll try some ideas (including normalization) and report results in this 
> topic next week.
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-24 Thread Григорий Григоренко
> 
> Perhaps my post dated Aug. 19, 2009 will help a little bit:
> 
> http://www.cocoabuilder.com/archive/cocoa/242954-core-data-dog-slow-when-using-first-time-after-boot.html
> 
> -- Tito
> 

Thanks for sharing.  "warming file" is a way to cache whole database as I 
understand it.

After everything is cached scattered reading from database runs faster. 

Unfortunately, in my case base size is ~ 3.5 Gb; it's too big for this strategy.

Even if I read at 25 Mb/s rate it will took 3500 / 25 = ~140 seconds just to 
read whole db file.

And what's more important I've only 2 Gb of RAM. 

Anyway, thanks for sharing. I guess these cases are similar. 


To me problem looks like this: 

SQLITE needs to read  (cache) from db a lot (too much?) while first-time query 
execution even if a query uses nicely matched index and returns nothing.

And SQLITE is doing lot's of scattered readings during query execution; not 
trying to somehow batch read or similar. That's why file caching helps.

If it's true not sure there's a simple and nice solution. 

I'll try some ideas (including normalization) and report results in this topic 
next week.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
Thank you for explaining this. 

I guess you're right about query planner deciding to avoid index usage based on 
stats. 

22 июля 2011, 18:30 от Richard Hipp :
> On Fri, Jul 22, 2011 at 9:48 AM, Григорий Григоренко 
> wrote:
> 
> > >
> > > Please post the results of:
> > >
> > > SELECT * FROM sqlite_stat1;
> > >
> > >
> >   tbl = log
> >  idx = idxlog_kind_computer_process_who_id_msg_created_at
> >  stat = 2815667 563134 563134 469278 74097 1 1 1
> >
> 
> The first number on "stat" is the number of rows in the table.  The 2nd
> number is the average number of rows that have the same value for the first
> column of the index.  The 3rd number is the average number of rows that have
> the same value for the first 2 columns of the index.  And so forth.
> 
> There are a huge number of rows that have the same value for the first 4
> terms of this index, which shows us that this is a really lousy index.
> Ideally, you want the second number in the "stat" column to be something
> small, like 10.
> 
> If you do not run ANALYZE, SQLite has no way of knowing that the index is
> mostly useless.  SQLite assumes that the index is a good one, and that the
> 2nd integer in "stat" is 10.  And it therefore tries to use the index.  But
> since the index is so bad, the resulting performance is slow.
> 
> After running ANALYZE, SQLite realizes that the index is lousy and avoids
> using it.  Hence, performance is much better.
> 
> 
> >
> >  tbl = log
> >  idx = idxlog_kind_computer_process_id_who_msg_created_at
> >  stat = 2815667 563134 563134 469278 1 1 1 1
> >
> >  tbl = log
> >  idx = idxlog_kind_computer_id_process_who_msg_created_at
> >  stat = 2815667 563134 563134 1 1 1 1 1
> >
> >  tbl = log
> >  idx = idxlog_kind_id_computer_process_who_msg_created_at
> >  stat = 2815667 563134 1 1 1 1 1 1
> >
> >  tbl = log
> >  idx = idxlog_kind_computer_process_who_msg_created_at
> >  stat = 2815667 563134 563134 469278 74097 2 2
> >
> >  tbl = log
> >  idx = idxlog_created_at
> >  stat = 2815667 106
> >
> 
> 
> 
> -- 
> D. Richard Hipp
> d...@sqlite.org
> 
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Tito Ciuro
Hello all,

Perhaps my post dated Aug. 19, 2009 will help a little bit:

http://www.cocoabuilder.com/archive/cocoa/242954-core-data-dog-slow-when-using-first-time-after-boot.html

-- Tito

On Jul 22, 2011, at 10:40 AM, Simon Slavin wrote:

> 
> On 22 Jul 2011, at 2:11pm, Григорий Григоренко wrote:
> 
>> Let's say it'll shrink by 50%. Still, if first-time runnning query timing 
>> will change from 2 min to 1 min it is still not acceptable. 
>> I cannot wait even a minute without logging.
>> 
>> So, before restructuring database and re-writing code I just want to 
>> understand SQLITE behaviour.
>> The question is: does SQLITE need to load whole index to start using it? Or 
>> SQLITE can use partial loads to quickly find records it needs (defined by 
>> columns in conditions)?
> 
> SQLite loads only the pages it needs.  It doesn't load a whole index into 
> memory unless the entire index fits in one page (only a few rows).  If your 
> search still runs too slowly your problem is more likely to be that you are 
> searching on text fields rather than integer fields.
> 
> Searching on integers is /must/ faster than searching on test fields.  If you 
> store the id code of the computer in your log file rather than the text name 
> of the computer, a search for log entries for a particular computer will be 
> faster.  I don't know what you're doing with your 'kind' column, but again if 
> you assign integer values to your different values for 'kind' and store those 
> instead, your search will be faster.
> 
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Richard Hipp
On Fri, Jul 22, 2011 at 9:48 AM, Григорий Григоренко wrote:

> >
> > Please post the results of:
> >
> > SELECT * FROM sqlite_stat1;
> >
> >
>   tbl = log
>  idx = idxlog_kind_computer_process_who_id_msg_created_at
>  stat = 2815667 563134 563134 469278 74097 1 1 1
>

The first number on "stat" is the number of rows in the table.  The 2nd
number is the average number of rows that have the same value for the first
column of the index.  The 3rd number is the average number of rows that have
the same value for the first 2 columns of the index.  And so forth.

There are a huge number of rows that have the same value for the first 4
terms of this index, which shows us that this is a really lousy index.
Ideally, you want the second number in the "stat" column to be something
small, like 10.

If you do not run ANALYZE, SQLite has no way of knowing that the index is
mostly useless.  SQLite assumes that the index is a good one, and that the
2nd integer in "stat" is 10.  And it therefore tries to use the index.  But
since the index is so bad, the resulting performance is slow.

After running ANALYZE, SQLite realizes that the index is lousy and avoids
using it.  Hence, performance is much better.


>
>  tbl = log
>  idx = idxlog_kind_computer_process_id_who_msg_created_at
>  stat = 2815667 563134 563134 469278 1 1 1 1
>
>  tbl = log
>  idx = idxlog_kind_computer_id_process_who_msg_created_at
>  stat = 2815667 563134 563134 1 1 1 1 1
>
>  tbl = log
>  idx = idxlog_kind_id_computer_process_who_msg_created_at
>  stat = 2815667 563134 1 1 1 1 1 1
>
>  tbl = log
>  idx = idxlog_kind_computer_process_who_msg_created_at
>  stat = 2815667 563134 563134 469278 74097 2 2
>
>  tbl = log
>  idx = idxlog_created_at
>  stat = 2815667 106
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко


> > >
> > > Something that might speed up your operation is to issue the SQL command
> > 'ANALYZE' just once.  The results are stored in the database file, so you
> > can just do it manually now you have some data in the database.  It gives
> > the query optimizers lots of clues about how best to optimize each query. So
> > do an ANALYZE, then close the database and do your timing tests again.
> > >
> >
> >
> > Well, ANALYZE has changed something. Now query is using PRIMARY KEY and
> > executing almost instantly :-)
> >
> 
> Please post the results of:
> 
> SELECT * FROM sqlite_stat1;
> 
> 

Sure,

  tbl = T
  idx = 
 stat = 1

  tbl = item
  idx = idxitem_name_value
 stat = 21 7 1

  tbl = log
  idx = idxlog_kind_computer_process_who_id_msg_created_at
 stat = 2815667 563134 563134 469278 74097 1 1 1

  tbl = log
  idx = idxlog_kind_computer_process_id_who_msg_created_at
 stat = 2815667 563134 563134 469278 1 1 1 1

  tbl = log
  idx = idxlog_kind_computer_id_process_who_msg_created_at
 stat = 2815667 563134 563134 1 1 1 1 1

  tbl = log
  idx = idxlog_kind_id_computer_process_who_msg_created_at
 stat = 2815667 563134 1 1 1 1 1 1

  tbl = log
  idx = idxlog_kind_computer_process_who_msg_created_at
 stat = 2815667 563134 563134 469278 74097 2 2

  tbl = log
  idx = idxlog_created_at
 stat = 2815667 106
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Max Vlasov
On Fri, Jul 22, 2011 at 5:25 PM, Григорий Григоренко  wrote:
> Database is "insert-only". There wasn't any deletes or updates, will VACUUM 
> actually help in this case? I though it was about unused space?
>

There's also internal fragmentation coming from the nature of indexes.
If your inserted data have to be placed in the middle of the index and
this allocates a new page at the end of the file then the logical
order of the pages used for this index will be non-sequential in terms
of hard disc access.

>
> And after finding this position inside index SQLITE clearly sees that 
> previous and next record does not match query, so query should return nothing.
>
> This cannot take long. It's like scanning B-TREE to a predefined position and 
> then reading just 2 records near.
>
> And that's all. Why spending 2 minutes?


You query contained the id range not existed in the db so maybe here
can be some optimization improvement. That's why probably Richard
asked you to send the analyzed data. But as I suppose you're not going
to supply non existing id ranges :) so if we change the id range to a
valid one, the data will contain many records and sqlite should read
the index that is internally fragmented due to the reasons explain
above.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Simon Slavin

On 22 Jul 2011, at 2:11pm, Григорий Григоренко wrote:

> Let's say it'll shrink by 50%. Still, if first-time runnning query timing 
> will change from 2 min to 1 min it is still not acceptable. 
> I cannot wait even a minute without logging.
> 
> So, before restructuring database and re-writing code I just want to 
> understand SQLITE behaviour.
> The question is: does SQLITE need to load whole index to start using it? Or 
> SQLITE can use partial loads to quickly find records it needs (defined by 
> columns in conditions)?

SQLite loads only the pages it needs.  It doesn't load a whole index into 
memory unless the entire index fits in one page (only a few rows).  If your 
search still runs too slowly your problem is more likely to be that you are 
searching on text fields rather than integer fields.

Searching on integers is /must/ faster than searching on test fields.  If you 
store the id code of the computer in your log file rather than the text name of 
the computer, a search for log entries for a particular computer will be 
faster.  I don't know what you're doing with your 'kind' column, but again if 
you assign integer values to your different values for 'kind' and store those 
instead, your search will be faster.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
Database is "insert-only". There wasn't any deletes or updates, will VACUUM 
actually help in this case? I though it was about unused space?



Still. There's an index:

 idxlog_kind_computer_id_process_who_msg_created_at (

kind, computer, id, ...

)

Query is

kind = XXX AND computer = YYY and id BETWEEN ZZZ1 and ZZZ2

To my opinion this information is sufficient to jump directly to index position 
XXX,YYY,ZZZ1.

SQLITE must be able to do this real fast, isn't it?

And after finding this position inside index SQLITE clearly sees that previous 
and next record does not match query, so query should return nothing. 

This cannot take long. It's like scanning B-TREE to a predefined position and 
then reading just 2 records near.

And that's all. Why spending 2 minutes? 


Does SQLITE needs all of index to be in RAM to begin using  it? 
If so it can be the reason of bad performance...


22 июля 2011, 17:07 от Max Vlasov :
> On Fri, Jul 22, 2011 at 3:50 PM, Григорий Григоренко  
> wrote:
> > Okay, I rebooted and tested again.
> >
> > First run took 76 seconds. Read bytes: ~ 307 Mb, Write bytes: ~ 66 Kb.
> >
> > Re-running: 1 second, Read bytes: ~ 307 Mb, Write bytes ~ 66 Kb.
> >
> >
> 
> Grigory, you posted to me directly (without the list e-mail), I'm
> reposting this reply to the list
> 
> Ok, I downloaded the db and it actually takes long time to execute.
> But this can be explained. Actually if you change your fist id > to
> zero, the query starts to return actual results, so there are many
> records with kind='info' and computer = 'KRAFTWAY' in your database.
> So sqlite really uses the index for locating your records. But the
> problem is also that your db is very fragmented so sqlite when reading
> the index actually reads different parts of this file and it makes the
> windows cache system mad. I think vacuum command should help.
> 
> Max
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Richard Hipp
On Fri, Jul 22, 2011 at 8:55 AM, Григорий Григоренко wrote:

>
>
>
> >
> > Something that might speed up your operation is to issue the SQL command
> 'ANALYZE' just once.  The results are stored in the database file, so you
> can just do it manually now you have some data in the database.  It gives
> the query optimizers lots of clues about how best to optimize each query. So
> do an ANALYZE, then close the database and do your timing tests again.
> >
>
>
> Well, ANALYZE has changed something. Now query is using PRIMARY KEY and
> executing almost instantly :-)
>

Please post the results of:

SELECT * FROM sqlite_stat1;


>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
You are right. There's a way of normalizing and it will certainly reduce 
database size. 
Let's say it'll shrink by 50%. Still, if first-time runnning query timing will 
change from 2 min to 1 min it is still not acceptable. 
I cannot wait even a minute without logging.

So, before restructuring database and re-writing code I just want to understand 
SQLITE behaviour.
The question is: does SQLITE need to load whole index to start using it? Or 
SQLITE can use partial loads to quickly find records it needs (defined by 
columns in conditions)?





> You need to normalize your database.
> 
> 
> 
> 
> 
> 
> 
> Though it's easy to put everything in one table it's horrendously inefficient 
> for indexing.
> 
> 
> 
> Your indexes are huge as you're putting strings in them.
> 
> 
> 
> 
> 
> 
> 
> Ideally you should never have a string in an index if you can avoid it.  Too 
> much space, too long a time to index, to long a time to compare.
> 
> 
> 
> 
> 
> 
> 
> So for starters create a "kind" table and "computer" table and just put the 
> _rowid into your table instead of the string.
> 
> 
> 
> 
> 
> 
> 
> I'll bet you things will run a lot faster and your database will shrink.
> 
> 
> 
> 
> 
> 
> 
> Same goes for other strings.
> 
> 
> 
> 
> 
> 
> 
> Michael D. Black
> 
> 
> 
> Senior Scientist
> 
> 
> 
> NG Information Systems
> 
> 
> 
> Advanced Analytics Directorate
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Max Vlasov
On Fri, Jul 22, 2011 at 3:50 PM, Григорий Григоренко  wrote:
> Okay, I rebooted and tested again.
>
> First run took 76 seconds. Read bytes: ~ 307 Mb, Write bytes: ~ 66 Kb.
>
> Re-running: 1 second, Read bytes: ~ 307 Mb, Write bytes ~ 66 Kb.
>
>

Grigory, you posted to me directly (without the list e-mail), I'm
reposting this reply to the list

Ok, I downloaded the db and it actually takes long time to execute.
But this can be explained. Actually if you change your fist id > to
zero, the query starts to return actual results, so there are many
records with kind='info' and computer = 'KRAFTWAY' in your database.
So sqlite really uses the index for locating your records. But the
problem is also that your db is very fragmented so sqlite when reading
the index actually reads different parts of this file and it makes the
windows cache system mad. I think vacuum command should help.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко



> 
> Something that might speed up your operation is to issue the SQL command 
> 'ANALYZE' just once.  The results are stored in the database file, so you can 
> just do it manually now you have some data in the database.  It gives the 
> query optimizers lots of clues about how best to optimize each query. So do 
> an ANALYZE, then close the database and do your timing tests again.
> 


Well, ANALYZE has changed something. Now query is using PRIMARY KEY and 
executing almost instantly :-)

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Black, Michael (IS)
You need to normalize your database.



Though it's easy to put everything in one table it's horrendously inefficient 
for indexing.

Your indexes are huge as you're putting strings in them.



Ideally you should never have a string in an index if you can avoid it.  Too 
much space, too long a time to index, to long a time to compare.



So for starters create a "kind" table and "computer" table and just put the 
_rowid into your table instead of the string.



I'll bet you things will run a lot faster and your database will shrink.



Same goes for other strings.



Michael D. Black

Senior Scientist

NG Information Systems

Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Григорий Григоренко [grigore...@mail.ru]
Sent: Friday, July 22, 2011 5:38 AM
To: sqlite-users@sqlite.org
Subject: EXT :[sqlite] SELECT query first run is VERY slow

I have a log's database. Schema :


CREATE TABLE log(id integer primary key autoincrement,msg text,created_at 
int,kind text,computer text,process text,who text);
CREATE INDEX idxlog_created_at ON log(created_at);
CREATE INDEX idxlog_kind_computer_id_process_who_msg_created_at ON 
log(kind,computer,id desc,process,who,msg,created_at);
CREATE INDEX idxlog_kind_computer_process_id_who_msg_created_at ON 
log(kind,computer,process,id desc,who,msg,created_at);
CREATE INDEX idxlog_kind_computer_process_who_id_msg_created_at ON 
log(kind,computer,process,who,id desc,msg,created_at);
CREATE INDEX idxlog_kind_id_computer_process_who_msg_created_at ON log(kind,id 
desc,computer,process,who,msg,created_at);


Database: ~ 3.5 Gb size, ~ 2.8 mln records. page_size=8192 bytes, UTF-8.

Platform: Windows XP, Intel Core Duo 3 Ghz.

SQLITE: sqlite3.exe console, latest version (3.7.7.1).


Query:

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log WHERE id > 4070636 AND id <= 9223372036854775807 AND kind = 'info' AND 
computer='KRAFTWAY';


Query plan (EXPLAIN QUERY PLAN):

0|0|0|SEARCH TABLE log USING COVERING INDEX 
idxlog_kind_computer_process_who_id_msg_created_at (kind=? AND computer=?) (~2 
rows)

There are no records in database that match query, i.e. query returns nothing.


Now, the problem.


Reboot system. Run query. Elapsed: ~100 SECONDS. Re-run query. Now it executes 
in less than a second. Re-run query. The same: less than a second. Tried it 
several times.


RUNNING THIS QUERY JUST AFTER REBOOT TAKES UNPREDICTABLE LARGE TIME, FROM 1 TO 
2 MINUTES.


Is this a sqlite bug or normal behaviour?

If it's normal, there are some questions.

Am I hitting some limit with my database? Which one?

Why is SQLITE (having nicely matching index) is spending so much time to 
execute?

SQLITE has to cache some index data? Yeah but the console process (during 
execution of this query) is reading at about 800 Kb per second rate! (while HDD 
can make 100 Mb per second) Why so serious? Execuse me, why so slow?

And SQLITE process it's not consuming CPU (less than 1 percent). What is SQLITE 
doing during this long running query?

It's own private memory is holding still at about 20 Mb. It's not consuming CPU 
and HDD. What is he waiting for?


This problem makes SQLITE completely unusable in my project.


I've shared database and scripts (Warning: it's RAR archive, ZIP cannot hold 
files of 3.5 Gb size. You can download WINRAR here: 
http://www.rarlab.com/download.htm ):

http://dl.dropbox.com/u/2168777/deadly_sqlite.rar

Download size: ~150 Mb.

Unzip and execute "run.cmd".

You'll see execution plan and after a while there will be line "Seconds 
elapsed: XXX".









___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
> 
> > Database extension is ".dblite"
> 
> That should be okay.  No need to change that.
> 
> Something that might speed up your operation is to issue the SQL command 
> 'ANALYZE' just once.  The results are stored in the database file, so you can 
> just do it manually now you have some data in the database.  It gives the 
> query optimizers lots of clues about how best to optimize each query. So do 
> an ANALYZE, then close the database and do your timing tests again.

Okay, will do that.

> 
> If you're still getting strange results after that, you can use the 
> command-line tool to make SQLite reveal whether it's using an index or not.  
> You do that by issuing your SELECT command with 'EXPLAIN QUERY PLAN ' at the 
> beginning of it:
> 
> EXPLAIN QUERY PLAN SELECT 1 as today, id as rowid, 
> kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4070636 AND 
> id <= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY';
> 
> Looking at the output it produces (which you may post, by all means) any 
> mention of 'scan' means it's having to read every record in the database, 
> whereas a mention of an index means it's using an index as expected.
> 

You missed it in my post - I did it. I did "EXPLAIN ..." and SQLITE says it's 
using covering index. There's nohting wrong in the query plan.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Simon Slavin

On 22 Jul 2011, at 12:36pm, Григорий Григоренко wrote:

> Database extension is ".dblite"

That should be okay.  No need to change that.

Something that might speed up your operation is to issue the SQL command 
'ANALYZE' just once.  The results are stored in the database file, so you can 
just do it manually now you have some data in the database.  It gives the query 
optimizers lots of clues about how best to optimize each query. So do an 
ANALYZE, then close the database and do your timing tests again.

If you're still getting strange results after that, you can use the 
command-line tool to make SQLite reveal whether it's using an index or not.  
You do that by issuing your SELECT command with 'EXPLAIN QUERY PLAN ' at the 
beginning of it:

EXPLAIN QUERY PLAN SELECT 1 as today, id as rowid, 
kind,who,msg,computer,process,created_at,id FROM log WHERE id > 4070636 AND id 
<= 9223372036854775807 AND kind = 'info' AND computer='KRAFTWAY';

Looking at the output it produces (which you may post, by all means) any 
mention of 'scan' means it's having to read every record in the database, 
whereas a mention of an index means it's using an index as expected.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Max Vlasov
On Fri, Jul 22, 2011 at 3:36 PM, Григорий Григоренко  wrote:
> Database extension is ".dblite"
>
> I'm using Process Explorer 
> (http://technet.microsoft.com/en-us/sysinternals/bb896653) to monitor 
> processes cpu and i/o usage.
>
> During these long running queries I am not using any other program. I've 
> terminated any unused service  (MS SQL, for example). System idle CPU is at 
> 98-99% during all the time.
>
> I've just tried another system (of my colleague) that has Windows 7. Same 
> result: first run took 40 seconds, second: 1 second.
>


What about Read Bytes? How much did sqlite.exe actually read during the run?

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Max Vlasov
On Fri, Jul 22, 2011 at 3:28 PM, Simon Slavin  wrote:
>
> Does Windows XP have some sort of task display where you can see what task is 
> hogging most of the CPU or disk access ?
>

The mentioned Task Manager (Ctrl-Alt-Del -> Task Manager) reports both
CPU and I/O Read Bytes (if one selects the column for the latter) for
every process so if Grigory shares this info with us, this will be
very useful.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
Database extension is ".dblite"

I'm using Process Explorer 
(http://technet.microsoft.com/en-us/sysinternals/bb896653) to monitor processes 
cpu and i/o usage.

During these long running queries I am not using any other program. I've 
terminated any unused service  (MS SQL, for example). System idle CPU is at 
98-99% during all the time. 

I've just tried another system (of my colleague) that has Windows 7. Same 
result: first run took 40 seconds, second: 1 second.




22 июля 2011, 15:28 от Simon Slavin :
> 
> On 22 Jul 2011, at 12:15pm, Max Vlasov wrote:
> 
> > Do you have and anitvirus software installed? It may scan the file at
> > the first usage.
> 
> What is the extension to the filename of the database file ?  Windows does 
> fancy caching for files that have some particular extensions including, IIRC, 
> '.db'.  Unfortunately although this works well for database files used for 
> Windows it's terrible for ones which aren't.
> 
> If you've used a common extension for the database file rename it '.sqlite' 
> or something that isn't going to be confused for another database format.
> 
> And to answer your question, no, it's not normal or a known bug.
> 
> Does Windows XP have some sort of task display where you can see what task is 
> hogging most of the CPU or disk access ?
> 
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Simon Slavin

On 22 Jul 2011, at 12:15pm, Max Vlasov wrote:

> Do you have and anitvirus software installed? It may scan the file at
> the first usage.

What is the extension to the filename of the database file ?  Windows does 
fancy caching for files that have some particular extensions including, IIRC, 
'.db'.  Unfortunately although this works well for database files used for 
Windows it's terrible for ones which aren't.

If you've used a common extension for the database file rename it '.sqlite' or 
something that isn't going to be confused for another database format.

And to answer your question, no, it's not normal or a known bug.

Does Windows XP have some sort of task display where you can see what task is 
hogging most of the CPU or disk access ?

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
I have antivirus (avast) but it's disabled during testing.

22 июля 2011, 15:15 от Max Vlasov :
> On Fri, Jul 22, 2011 at 2:38 PM, Григорий Григоренко  
> wrote:
> > I have a log's database. Schema :
> >
> > Query:
> >
> > SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
> > FROM log WHERE id > 4070636 AND id <= 9223372036854775807 AND kind = 'info' 
> > AND computer='KRAFTWAY';
> >...
> >
> >
> > Reboot system. Run query. Elapsed: ~100 SECONDS. Re-run query. Now it 
> > executes in less than a second. Re-run query. The same: less than a second. 
> > Tried it several times.
> >
> >
> > RUNNING THIS QUERY JUST AFTER REBOOT TAKES UNPREDICTABLE LARGE TIME, FROM 1 
> > TO 2 MINUTES.
> >
> >
> 
> Do you have and anitvirus software installed? It may scan the file at
> the first usage.
> 
> To be sure sqlite doesn't read too much I suggest opening Task
> Manager, choosing column I/O Read Bytes to be shown and run the query.
> This also can give a hint about the software that possibly reads the
> file all the this time.
> 
> Max
> 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Max Vlasov
On Fri, Jul 22, 2011 at 2:38 PM, Григорий Григоренко  wrote:
> I have a log's database. Schema :
>
> Query:
>
> SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
> FROM log WHERE id > 4070636 AND id <= 9223372036854775807 AND kind = 'info' 
> AND computer='KRAFTWAY';
>...
>
>
> Reboot system. Run query. Elapsed: ~100 SECONDS. Re-run query. Now it 
> executes in less than a second. Re-run query. The same: less than a second. 
> Tried it several times.
>
>
> RUNNING THIS QUERY JUST AFTER REBOOT TAKES UNPREDICTABLE LARGE TIME, FROM 1 
> TO 2 MINUTES.
>
>

Do you have and anitvirus software installed? It may scan the file at
the first usage.

To be sure sqlite doesn't read too much I suggest opening Task
Manager, choosing column I/O Read Bytes to be shown and run the query.
This also can give a hint about the software that possibly reads the
file all the this time.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SELECT query first run is VERY slow

2011-07-22 Thread Григорий Григоренко
I have a log's database. Schema :


CREATE TABLE log(id integer primary key autoincrement,msg text,created_at 
int,kind text,computer text,process text,who text); 
CREATE INDEX idxlog_created_at ON log(created_at); 
CREATE INDEX idxlog_kind_computer_id_process_who_msg_created_at ON 
log(kind,computer,id desc,process,who,msg,created_at); 
CREATE INDEX idxlog_kind_computer_process_id_who_msg_created_at ON 
log(kind,computer,process,id desc,who,msg,created_at); 
CREATE INDEX idxlog_kind_computer_process_who_id_msg_created_at ON 
log(kind,computer,process,who,id desc,msg,created_at); 
CREATE INDEX idxlog_kind_id_computer_process_who_msg_created_at ON log(kind,id 
desc,computer,process,who,msg,created_at); 


Database: ~ 3.5 Gb size, ~ 2.8 mln records. page_size=8192 bytes, UTF-8.

Platform: Windows XP, Intel Core Duo 3 Ghz.

SQLITE: sqlite3.exe console, latest version (3.7.7.1).


Query:

SELECT 1 as today, id as rowid, kind,who,msg,computer,process,created_at,id 
FROM log WHERE id > 4070636 AND id <= 9223372036854775807 AND kind = 'info' AND 
computer='KRAFTWAY';


Query plan (EXPLAIN QUERY PLAN):

0|0|0|SEARCH TABLE log USING COVERING INDEX 
idxlog_kind_computer_process_who_id_msg_created_at (kind=? AND computer=?) (~2 
rows)

There are no records in database that match query, i.e. query returns nothing.


Now, the problem. 


Reboot system. Run query. Elapsed: ~100 SECONDS. Re-run query. Now it executes 
in less than a second. Re-run query. The same: less than a second. Tried it 
several times. 


RUNNING THIS QUERY JUST AFTER REBOOT TAKES UNPREDICTABLE LARGE TIME, FROM 1 TO 
2 MINUTES.


Is this a sqlite bug or normal behaviour?

If it's normal, there are some questions.

Am I hitting some limit with my database? Which one?

Why is SQLITE (having nicely matching index) is spending so much time to 
execute? 

SQLITE has to cache some index data? Yeah but the console process (during 
execution of this query) is reading at about 800 Kb per second rate! (while HDD 
can make 100 Mb per second) Why so serious? Execuse me, why so slow?

And SQLITE process it's not consuming CPU (less than 1 percent). What is SQLITE 
doing during this long running query? 

It's own private memory is holding still at about 20 Mb. It's not consuming CPU 
and HDD. What is he waiting for?


This problem makes SQLITE completely unusable in my project.


I've shared database and scripts (Warning: it's RAR archive, ZIP cannot hold 
files of 3.5 Gb size. You can download WINRAR here: 
http://www.rarlab.com/download.htm ):

http://dl.dropbox.com/u/2168777/deadly_sqlite.rar

Download size: ~150 Mb.

Unzip and execute "run.cmd".

You'll see execution plan and after a while there will be line "Seconds 
elapsed: XXX".









___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users