Re: [sqlite] row storage / column order
_ > Od: "Simon Slavin" > Datum: 02.10.2011 14:01 > > >On 2 Oct 2011, at 12:49pm, Mira Suk wrote: > >> CREATE TABLE [Data1] (Index INTEGER PRIMARY KEY NOT NULL, Data BLOB, >> Description TEXT) >> vs >> CREATE TABLE [Data2] (Index INTEGER PRIMARY KEY NOT NULL, Description TEXT, >> Data BLOB) >> >> essentially those tables are same. however accessing the column >> "Description" takes a lot longer in Data1 table (it's actually based on size >> of blob column) >> did I hit some implementation detail (aka all columns data are read till the >> one required is returned) ? > >I will let the experts handle this. I do know that if you have either TEXT or >BLOB columns with a lot of data in you never do a 'SELECT *' unless you really >need it. this is not about select * simpliest test case is SELECT DISTINCT "Description" FROM "Data?" - distinct just to force sqlite to process all data before returning. >> should I actually reorder columns based on expected data / size for SQLite ? > >If there is behaviour like the above, then your ordering should not be based >on data/size. It would be more useful to base it on how often you expected to >access each column. So for example, if you display tables showing >[Description] a lot, but >rarely need the [Data], then it might be fastest to >put [Description] in first. you're correct. the "how often" factor should play main role. however putting integer/double field after blob just because it's less used would not be smart too, same IMHO goes for VARCHAR(N) where N is expected small, that is much smaller than database pagesize. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] row storage / column order
Hey everyone, Imagine tables CREATE TABLE [Data1] (Index INTEGER PRIMARY KEY NOT NULL, Data BLOB, Description TEXT) vs CREATE TABLE [Data2] (Index INTEGER PRIMARY KEY NOT NULL, Description TEXT, Data BLOB) essentially those tables are same. however accessing the column "Description" takes a lot longer in Data1 table (it's actually based on size of blob column) did I hit some implementation detail (aka all columns data are read till the one required is returned) ? should I actually reorder columns based on expected data / size for SQLite ? Thanks in advance, M. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] query performance help
Hey everyone, currently I have some temporary table which I populate this way (recursive triggers on) CREATE TEMPORARY TABLE IF NOT EXISTS NestedEntriesResult_ID (IDX INTEGER NOT NULL, DBORDER INTEGER PRIMARY KEY NOT NULL); CREATE TEMPORARY TRIGGER IF NOT EXISTS NestedEntriesPopulate_ID AFTER INSERT ON NestedEntriesResult_ID FOR EACH ROW BEGIN INSERT INTO NestedEntriesResult_ID (IDX) SELECT IndexME.IDI FROM IndexME WHERE IndexME.Parent = New.IDX ORDER BY IndexME.[Order]; END INSERT INTO NestedEntriesResult_ID (IDX) VALUES (?) this insert runs chain of triggers which insert all child items in following way 0 - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 table IndexME contains tree-like structure where IDI - primary key, Parent - item tree parent - link to "parent item" IDI, Order - item order within same Parent (aka unique value for rows which have same Parent). All important columns are indexed. As can be guessed populating table is quite slow - ~150ms for around 10k rows in IndexME. Suggestions for better (faster) method ? I've yet to try simply doing it C, but I don't think running 10k inserts (to get result into table - as it's part of another query) alone will give me better performance. thanks in advance, M. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] c-api document suggestion
On 9/21/2011 21:22 Igor Tandetnik wrote: You can include the NUL terminator, if you want it to actually be stored in the database. Igor Tandetnik Actually you can't - if you do all SQL string functions will not work. to be clear - SELECT TRIM(what ever text column you stored with including null on end of string) will not trim that string. found out the hard way. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] LEFT JOIN optimization
_ > Od: "Jim Morris" > Komu: > Datum: 22.09.2011 23:06 > Předmět: Re: [sqlite] LEFT JOIN optimization > > Your where clause "WHERE ItemsME.IDR ..." is only satisfied if there is > an associated ItemsME record so the left outer join is pointless. Just > use the inner join. Normally the left outer join would include all of > ItemsME_Properties, that probably explains the table scan. Yeah. the answer is I'm stupid. the left join was just for test sake (as was said tables are 1:1 and IDR is unique) as I have all kinds of performance problems with left outers. this one is on me not on SQLite lol - still would say it's strange quirk - but the query is also strange. Thanks. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] LEFT JOIN optimization
EXPLAIN QUERY PLAN SELECT DISTINCT ItemsME.Points FROM (ItemsME_Properties LEFT JOIN ItemsME ON ItemsME_Properties.IDR = ItemsME.IDR) WHERE ItemsME.IDR IN (SELECT IDR FROM cProds WHERE Prod = 106) selectid order from detail 0 0 0 SCAN TABLE ItemsME_Properties (~100 rows) 0 1 1 SEARCH TABLE ItemsME USING INDEX IDR (IDR=?) (~2 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 1 0 0 SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows) 0 0 0 EXECUTE LIST SUBQUERY 2 2 0 0 SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows) 0 0 0 USE TEMP B-TREE FOR DISTINCT this takes about 3 seconds EXPLAIN QUERY PLAN SELECT DISTINCT ItemsME.Points FROM (ItemsME_Properties INNER JOIN ItemsME ON ItemsME_Properties.IDR = ItemsME.IDR) WHERE ItemsME.IDR IN (SELECT IDR FROM cProds WHERE Prod = 106) 0 0 1 SEARCH TABLE ItemsME USING INDEX IDR (IDR=?) (~25 rows) 0 0 0 EXECUTE LIST SUBQUERY 1 1 0 0 SEARCH TABLE cProds USING INDEX Prod (Prod=?) (~210 rows) 0 1 0 SEARCH TABLE ItemsME_Properties USING COVERING INDEX IDR (IDR=?) (~1 rows) 0 0 0 USE TEMP B-TREE FOR DISTINCT this runs in about 15 milisec => results are equal from both cases as ItemsME_Properties is temporary table (updated with triggers as reads are much more frequent) to workaround performance issues with search... (it should be 1:1 with ItemsME, just with less data, and some data reformatted) maybe I'm just stupid but I don't really see reason for scan in first case. am I really stupid ? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance problem LEFT OUTERJOINandstringdatafromright table
The difference is that #2 mentions only one field from ItemsME, namely IDR. The value of that field comes from the index, the table itself doesn't need to be read at all. It's not even clear why #2 bothers to join with ItemsME at all - it's a no-op. #1 uses more fields from ItemsME, so it needs to actually look up and read records from that table. However fact is that IDR is non-existant - for that specific query (which SQLite should see from index) so it should not touch table - just fill with NULLs. The more important thing is - even if I add [ItemsME].[insert whatever integer field here] into result fields it's same speed as in case 2) - that is relatively fast. Just touching any text/blob fields in query make it run ~12x slower. And considering resultset containing few rows even if all texts were split over 2 pages (none is longer than 300 chars) this would equal to records * 2kB IO. in case of 50 records that's 100kB. Do you really think SQLite needs 1.8 seconds for 100kB in reads on SSD drive ? I certainly hope not. Maybe there is reason RIGHT JOIN isn't implemented ? BTW even the 2nd case is considerably slower than single table selects. the TZB_MATCHRECURSIVE is doing way more difficult job (selecting all childs and filtering them) but only in single table and is still faster (around 40ms per top level parent folder which usually contain ~700 folders/items). Just out of curiosity is it possible to download some older versions of SQLite to test if this behaviour is present in it ? I can only find current version on download page. nOOb here. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance problem LEFT OUTER JOIN andstringdatafromright table
>Ok then, show the result of prepending EXPLAIN QUERY PLAN to your statement. >-- >Igor Tandetnik First of all thanks for bearing with me :) functions TZB_MATCHRECURSIVE(int,int) - disabled for this test - always return 1. applies filter recursively TZB_ISCHILD(int) - bitmask check TZB_MATCHDIM(int,text) - disabled for this test - always return 1. does weird stuff with text parameter You probably by now realized the table forms tree of items. Parent links to IDI and this way tree structure is formed. parent parameter is for testing set to 0 - which is root of tree (query should return top-most folders) in this query all ItemsME.Status and ItemsME.Points are NULL. there is no entry in ItemsME for directories and there are no "files" in root test 1. query SELECT [IndexME].[IDI], [IndexME].[Status], [IndexME].[Icon], [IndexME].[Text] FROM [IndexME] LEFT OUTER JOIN [ItemsME] ON [ItemsME].[IDR] = [IndexME].[IDI] WHERE [IndexME].[Parent] = ?1 AND (TZB_MATCHRECURSIVE([IndexME].[IDI], [IndexME].[Status]) OR (TZB_ISCHILD([IndexME].[Status]) AND TZB_MATCHDIM([ItemsME].[Status], [ItemsME].[Points]))) ORDER BY [IndexME].[Order]; plan "SEARCH TABLE IndexME USING AUTOMATIC COVERING INDEX (Parent=?) (~5 rows)" "SEARCH TABLE ItemsME USING AUTOMATIC COVERING INDEX (IDR=?) (~5 rows)" "USE TEMP B-TREE FOR ORDER BY" runtime ~2seconds test 2. (TZB_MATCHDIM removed) query SELECT [IndexME].[IDI], [IndexME].[Status], [IndexME].[Icon], [IndexME].[Text] FROM [IndexME] LEFT OUTER JOIN [ItemsME] ON [ItemsME].[IDR] = [IndexME].[IDI] WHERE [IndexME].[Parent] = ?1 AND (TZB_MATCHRECURSIVE([IndexME].[IDI], [IndexME].[Status]) OR (TZB_ISCHILD([IndexME].[Status]))) ORDER BY [IndexME].[Order]; plan "SEARCH TABLE IndexME USING AUTOMATIC COVERING INDEX (Parent=?) (~5 rows)" "SEARCH TABLE ItemsME USING AUTOMATIC COVERING INDEX (IDR=?) (~5 rows)" "USE TEMP B-TREE FOR ORDER BY" runtime ~160miliseconds test 3. (TZB_MATCHDIM replaced with LENGTH to prove it's not my func) results equal to 1) ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance problem LEFT OUTER JOIN and string datafromright table
> Mira Suk wrote: >> query written here is a lot simplified (for example "Points" column is >> filtered using custom function) however main culprit seems >> to be LEFT OUTER JOIN as accessing that same column in query which only has >> B table in it is lightning fast. >> >> result of query is >> just around 40 rows, (due to Parent filter) > >You may want an index on A(Parent) then. >-- >Igor Tandetnik A.Parent is has Index query SELECT IDI FROM A WHERE Parent = ? is fast - aka this have nothing to with that condition. IMHO only with result column from B table. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Performance problem LEFT OUTER JOIN and string data from right table
Hey everyone. new to SQLite so please have patience with me having two tables and doing left outer join A IDI INTEGER PRIMARY KEY Parent INTEGER INDEX Status INTEGER B IDR UNIQUE INTEGER FOREIGN KEY IndexME.IDI Points TEXT (at average ~120 character string) (this table is primarily used for large data, so it contains some blobs etc...) SELECT A.IDI, B.Text FROM A LEFT OUTER JOIN B ON B.IDR = A.IDI WHERE Parent = ? query written here is a lot simplified (for example "Points" column is filtered using custom function) however main culprit seems to be LEFT OUTER JOIN as accessing that same column in query which only has B table in it is lightning fast. result of query is just around 40 rows, (due to Parent filter) and it takes ~2 seconds on I7 3.4Ghz with database stored on SSD HDD. am I doing something terribly wrong ? suggestions ? oh and I tried running ANALYZE on database connection without any visible improvement. thanks in advance, M. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users