[sqlite] query performance help

2011-10-02 Thread Mira Suk

Hey everyone,
 
currently I have some temporary table which I populate this way (recursive 
triggers on)
 
CREATE TEMPORARY TABLE IF NOT EXISTS NestedEntriesResult_ID (IDX INTEGER NOT 
NULL, DBORDER INTEGER PRIMARY KEY NOT NULL);
 
CREATE TEMPORARY TRIGGER IF NOT EXISTS NestedEntriesPopulate_ID
AFTER INSERT
ON NestedEntriesResult_ID
FOR EACH ROW
BEGIN
INSERT INTO NestedEntriesResult_ID (IDX) SELECT IndexME.IDI FROM IndexME WHERE 
IndexME.Parent = New.IDX ORDER BY IndexME.[Order];
END
 
INSERT INTO NestedEntriesResult_ID (IDX) VALUES (?)
this insert runs chain of triggers which insert all child items in following way
0 - 1 - 2 - 3
 - 4
   - 5 - 6
    - 7
   - 8
 
 
table IndexME contains tree-like structure where IDI - primary key, Parent - 
item tree parent - link to parent item IDI, Order - item order within same 
Parent (aka unique value for rows which have same Parent). All important 
columns are indexed.
As can be guessed populating table is quite slow - ~150ms for around 10k rows 
in IndexME.
 
Suggestions for better (faster) method ?
I've yet to try simply doing it C, but I don't think running 10k inserts (to 
get result into table - as it's part of another query) alone will give me 
better performance.
 
thanks in advance,
M.
 
 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] row storage / column order

2011-10-02 Thread Mira Suk

Hey everyone,
 
Imagine tables
 
CREATE TABLE [Data1] (Index INTEGER PRIMARY KEY NOT NULL, Data BLOB, 
Description TEXT)
vs
CREATE TABLE [Data2] (Index INTEGER PRIMARY KEY NOT NULL, Description TEXT, 
Data BLOB)
 
essentially those tables are same. however accessing the column Description 
takes a lot longer in Data1 table (it's actually based on size of blob column)
did I hit some implementation detail (aka all columns data are read till the 
one required is returned) ?
should I actually reorder columns based on expected data / size for SQLite ?
 
Thanks in advance,
M.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] row storage / column order

2011-10-02 Thread Mira Suk

_
 Od: Simon Slavin 
 Datum: 02.10.2011 14:01


On 2 Oct 2011, at 12:49pm, Mira Suk wrote:

 CREATE TABLE [Data1] (Index INTEGER PRIMARY KEY NOT NULL, Data BLOB, 
 Description TEXT)
 vs
 CREATE TABLE [Data2] (Index INTEGER PRIMARY KEY NOT NULL, Description TEXT, 
 Data BLOB)
  
 essentially those tables are same. however accessing the column 
 Description takes a lot longer in Data1 table (it's actually based on size 
 of blob column)
 did I hit some implementation detail (aka all columns data are read till the 
 one required is returned) ?

I will let the experts handle this.  I do know that if you have either TEXT or 
BLOB columns with a lot of data in you never do a 'SELECT *' unless you really 
need it.
 
this is not about select *
simpliest test case is SELECT DISTINCT Description FROM Data? - distinct 
just to force sqlite to process all data before returning.

 should I actually reorder columns based on expected data / size for SQLite ?

If there is behaviour like the above, then your ordering should not be based 
on data/size.  It would be more useful to base it on how often you expected to 
access each column.  So for example, if you display tables showing 
[Description] a lot, but rarely need the [Data], then it might be fastest to 
put [Description] in first.
 
you're correct. the how often factor should play main role. however putting 
integer/double field after blob just because it's less used would not be smart 
too, same IMHO goes for VARCHAR(N) where N is expected small, that is much 
smaller than database pagesize.
 
 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] LEFT JOIN optimization

2011-09-22 Thread Mira Suk

EXPLAIN QUERY PLAN
SELECT DISTINCT ItemsME.Points
FROM (ItemsME_Properties LEFT JOIN ItemsME ON ItemsME_Properties.IDR = 
ItemsME.IDR)
WHERE ItemsME.IDR
IN
(SELECT IDR FROM cProds WHERE Prod = 106)
 
selectid    order    from    detail
0    0    0    SCAN TABLE ItemsME_Properties (~100 rows)
0    1    1    SEARCH TABLE ItemsME USING INDEX IDR (IDR=?) (~2 rows)
0    0    0    EXECUTE LIST SUBQUERY 1
1    0    0    SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows)
0    0    0    EXECUTE LIST SUBQUERY 2
2    0    0    SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows)
0    0    0    USE TEMP B-TREE FOR DISTINCT
this takes about 3 seconds
 
 
EXPLAIN QUERY PLAN
SELECT DISTINCT ItemsME.Points
FROM (ItemsME_Properties INNER JOIN ItemsME ON ItemsME_Properties.IDR = 
ItemsME.IDR)
WHERE ItemsME.IDR
IN
(SELECT IDR FROM cProds WHERE Prod = 106)
 
0    0    1    SEARCH TABLE ItemsME USING INDEX IDR (IDR=?) (~25 rows)
0    0    0    EXECUTE LIST SUBQUERY 1
1    0    0    SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows)
0    1    0    SEARCH TABLE ItemsME_Properties USING COVERING INDEX IDR (IDR=?) 
(~1 rows)
0    0    0    USE TEMP B-TREE FOR DISTINCT
this runs in about 15 milisec
 
= results are equal from both cases as ItemsME_Properties is temporary table 
(updated with triggers as reads are much more frequent) to workaround 
performance issues with search... (it should be 1:1 with ItemsME, just with 
less data, and some data reformatted)
 
maybe I'm just stupid but I don't really see reason for scan in first case. am 
I really stupid ?
 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN optimization

2011-09-22 Thread Mira Suk

_
 Od: Jim Morris 
 Komu: 
 Datum: 22.09.2011 23:06
 Předmět: Re: [sqlite] LEFT JOIN optimization

 Your where clause WHERE ItemsME.IDR ... is only satisfied if there is 
 an associated ItemsME record so the left outer join is pointless.  Just 
 use the inner join.  Normally the left outer join would include all of 
 ItemsME_Properties, that probably explains the table scan.
 
Yeah. the answer is I'm stupid. the left join was just for test sake (as was 
said tables are 1:1 and IDR is unique) as I have all kinds of performance 
problems with left outers. this one is on me not on SQLite lol - still would 
say it's strange quirk - but the query is also strange.
 
Thanks.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] c-api document suggestion

2011-09-22 Thread Mira Suk

On 9/21/2011 21:22 Igor Tandetnik wrote:

You can include the NUL terminator, if you want it to actually be stored 
in the database.



Igor Tandetnik


Actually you can't - if you do all SQL string functions will not work.
to be clear -
SELECT TRIM(what ever text column you stored with including null on end of 
string)
will not trim that string.
 
found out the hard way.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Performance problem LEFT OUTER JOIN and string data from right table

2011-09-17 Thread Mira Suk

Hey everyone.
 
new to SQLite so please have patience with me
having two tables and doing left outer join
 
A
IDI INTEGER PRIMARY KEY
Parent INTEGER INDEX
Status INTEGER

 
B
IDR UNIQUE INTEGER FOREIGN KEY IndexME.IDI
Points TEXT (at average ~120 character string)
 (this table is primarily used for large data, so it contains some blobs 
etc...)
 
SELECT A.IDI, B.Text
FROM A
LEFT OUTER JOIN B ON B.IDR = A.IDI
WHERE Parent = ?
 
query written here is a lot simplified (for example Points column is filtered 
using custom function) however main culprit seems to be LEFT OUTER JOIN as 
accessing that same column in query which only has B table in it is lightning 
fast.
result of query is just around 40 rows, (due to Parent filter) and it takes ~2 
seconds on I7 3.4Ghz with database stored on SSD HDD.
 
am I doing something terribly wrong ? suggestions ?
 
oh and I tried running ANALYZE on database connection without any visible 
improvement.
 
thanks in advance,
M.
 
 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performance problem LEFT OUTER JOIN and string datafromright table

2011-09-17 Thread Mira Suk

 
 Mira Suk  wrote:
 query written here is a lot simplified (for example Points column is 
 filtered using custom function) however main culprit seems
 to be LEFT OUTER JOIN as accessing that same column in query which only has 
 B table in it is lightning fast.

 result of query is
 just around 40 rows, (due to Parent filter)

You may want an index on A(Parent) then.
-- 
Igor Tandetnik
 
A.Parent is has Index
 
query
SELECT IDI FROM A WHERE Parent = ?
is fast - aka this have nothing to with that condition. IMHO only with result 
column from B table.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performance problem LEFT OUTER JOIN andstringdatafromright table

2011-09-17 Thread Mira Suk

Ok then, show the result of prepending EXPLAIN QUERY PLAN to your statement.
-- 
Igor Tandetnik

First of all thanks for bearing with me :)

functions
TZB_MATCHRECURSIVE(int,int) 
- disabled for this test - always return 1. applies filter recursively
TZB_ISCHILD(int) 
- bitmask check
TZB_MATCHDIM(int,text) 
- disabled for this test - always return 1. does weird stuff with text parameter


You probably by now realized the table forms tree of items. Parent links to IDI 
and this way tree structure is formed.
parent parameter is for testing set to 0 - which is root of tree (query should 
return top-most folders)
in this query all ItemsME.Status and ItemsME.Points are NULL. there is no entry 
in ItemsME for directories and there are no files in root


test 1.

query
SELECT [IndexME].[IDI], [IndexME].[Status], [IndexME].[Icon], [IndexME].[Text]
FROM [IndexME] LEFT OUTER JOIN [ItemsME]
ON [ItemsME].[IDR] = [IndexME].[IDI] WHERE 
[IndexME].[Parent] = ?1 AND 
(TZB_MATCHRECURSIVE([IndexME].[IDI], [IndexME].[Status]) OR
(TZB_ISCHILD([IndexME].[Status]) AND TZB_MATCHDIM([ItemsME].[Status], 
[ItemsME].[Points])))
ORDER BY [IndexME].[Order];


plan
SEARCH TABLE IndexME USING AUTOMATIC COVERING INDEX (Parent=?) (~5 rows)
SEARCH TABLE ItemsME USING AUTOMATIC COVERING INDEX (IDR=?) (~5 rows)
USE TEMP B-TREE FOR ORDER BY

runtime ~2seconds


test 2. (TZB_MATCHDIM removed)

query
SELECT [IndexME].[IDI], [IndexME].[Status], [IndexME].[Icon], [IndexME].[Text]
FROM [IndexME] LEFT OUTER JOIN [ItemsME]
ON [ItemsME].[IDR] = [IndexME].[IDI] WHERE 
[IndexME].[Parent] = ?1 AND 
(TZB_MATCHRECURSIVE([IndexME].[IDI], [IndexME].[Status]) OR
(TZB_ISCHILD([IndexME].[Status])))
ORDER BY [IndexME].[Order];

plan
SEARCH TABLE IndexME USING AUTOMATIC COVERING INDEX (Parent=?) (~5 rows)
SEARCH TABLE ItemsME USING AUTOMATIC COVERING INDEX (IDR=?) (~5 rows)
USE TEMP B-TREE FOR ORDER BY

runtime ~160miliseconds

test 3. (TZB_MATCHDIM replaced with LENGTH to prove it's not my func)
results equal to 1)
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performance problem LEFT OUTERJOINandstringdatafromright table

2011-09-17 Thread Mira Suk



The difference is that #2 mentions only one field from ItemsME, namely IDR. The 
value of that field comes from the index, the table itself doesn't need to be 
read at all. It's not even clear why #2 bothers to join with ItemsME at all - 
it's a no-op.
#1 uses more fields from ItemsME, so it needs to actually look up and read 
records from that table.


However fact is that IDR is non-existant - for that specific query (which 
SQLite should see from index) so it should not touch table - just fill with 
NULLs.
The more important thing is - even if I add [ItemsME].[insert whatever integer 
field here] into result fields it's same speed as in case 2) - that is 
relatively fast. Just touching any text/blob fields in query make it run ~12x 
slower. And considering resultset containing few rows even if all texts were 
split over 2 pages (none is longer than 300 chars) this would equal to records 
* 2kB IO. in case of 50 records that's 100kB.
Do you really think SQLite needs 1.8 seconds for 100kB in reads on SSD drive ? 
I certainly hope not. Maybe there is reason RIGHT JOIN isn't implemented ?
 
BTW even the 2nd case is considerably slower than single table selects. the 
TZB_MATCHRECURSIVE is doing way more difficult job (selecting all childs and 
filtering them) but only in single table and is still faster (around 40ms per 
top level parent folder which usually contain ~700 folders/items).
 
Just out of curiosity is it possible to download some older versions of SQLite 
to test if this behaviour is present in it ? I can only find current version on 
download page. nOOb here.
 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users