Re: [sqlite] WHERE expression with operators from text functions?
I tried your query tonight, Igor, and it worked, not that I understand how... the recursive query syntax is still a mystery to me and the many NOTs are a challenge to interpret. Compared to the FTS solution, I observe some preliminary differences in performance that are interesting. The content being searched is made up from many tables. For the FTS search it is all inserted into the FTS4 virtual table which automatically builds the FTS index. For your recursive search, I create a VIEW of the same content. With larger databases, the time taken to build the FTS virtual table grows much faster than the VIEW. On the other hand, the time taken to search grows much faster for the recursive search than for the FTS. The search expression had three terms ANDed (a AND b AND c). I was delighted with having NOT, AND, OR operators in FTS among other capabilities. To incorporate all three in the recursive search is non-obvious. Tom -- View this message in context: http://sqlite.1065341.n5.nabble.com/WHERE-expression-with-operators-from-text-functions-tp78653p78700.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] WHERE expression with operators from text functions?
Igor, that is a most oblique and intriguing approach. I will try it out and try to get my head around it in the next day or so. Thanks, Tom Igor Tandetnik-2 wrote > On 10/16/2014 12:03 PM, Tom Holden wrote: >> I need a way to convert the text result to an expression that WHERE >> evaluates as an expression. >> >> Any possibility to do this within SQLite? > > with recursive split(str, tail) as ( >select null, 'string1+string2+string3' > union all >select substr(tail, 1, instr(tail || '+', '+')-1), substr(tail, > instr(tail || '+', '+') + 1) >from split where tail != '' > ) > select * from mytable where not exists ( >select str from split >where str is not null and value not like '%' || str || '%' > ); > > -- > Igor Tandetnik > > ___ > sqlite-users mailing list > sqlite-users@ > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- View this message in context: http://sqlite.1065341.n5.nabble.com/WHERE-expression-with-operators-from-text-functions-tp78653p78697.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] updating using a value from another table
Igor Tandetnik-2 wrote > Or alternatively, without a WHERE clause: > > update a set i = coalesce((select i from b where b.a = a.a), i); What I have used similar to this is: UPDATE a SET i = ifnull((select i from b where b.a = a.a), i); Tom -- View this message in context: http://sqlite.1065341.n5.nabble.com/updating-using-a-value-from-another-table-tp71588p71607.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 5000 tables with 3000 records vs 1 table with 15000000records
"Mike Zang" ... > In fact, the file is stock data and one file is for one stock, so they are > all in the same format. > > Then, in most case, only one stock should be selected. > Then, from a 'disk' operating system and memory management perspective, wouldn't one file per stock continue to be faster and more efficient than having all the stocks in one big file? A master database with pointers to each of the stock databases, ATTACH as many as you want to view at one time, maybe combine data into temporary table(s), create and drop indexes on the fly thus keeping storage at a minimum. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] What is the most efficient way to get the close bynumbers?
"Peng Yu" ... >> I think the penalty is in the extra JOIN required - 3 tables instead of >> 2 - >> with the speed advantage on the 'between' constraint being swamped by the >> volume of intermediate rows. > > I don't quite understand why there are 3 tables with R-tree. Would you > please show me what query you used? Thank you very much. For every row in table A1, look up in the R-Tree table A_X those rows whose xmin and xmax contain the position value, then look up in table A2 the rowids corresponding to A_X.id and discard those whose name does not match A1.name and those whose position is the same. 3 tables, 2 JOINs. SELECT A1.Name, A1.position, A_X.id, A_X.xmin, A_X.xmax, A2.Name, A2.position FROM A AS A1, A_X, A AS A2 WHERE A1.position != A2.position and A1.Name = A2.Name and A2.id = A_X.id and A1.position between A_X.xmin and A_X.xmax LIMIT 40; Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] What is the most efficient way to get the close by numbers?
"Eric Smith" ... > I haven't used it myself, but I'm pretty sure this is what the R*tree > module was designed for: I have not used it either but was intrigued by your suggestion. Looking into it, my sense was that it would be advantageous for a 2-dimension or more search and I think this was borne out by my experiments, albeit with my very limited knowledge and experience. I hasten to add that maybe my queries were not the best designed. I created an R-tree virtual table with the minimum number of columns: id, xmin, xmax and populated it with the rowid from Peng's table A and position-10 in xmin and position+10 in xmax. Thus any value lying between xmax and xmin is no more than a distance of 10 from the point pointed to in Table A by id. Indeed, a simple select on a single value between xmax and xmin returned rows from the virtual table faster than the correspondingly simple select on the real table with an index on position (something like 6ms vs. 10 ms). However, on the 10,000 row test table on which I reported earlier that Jim's 'between' query was fastest at ~2.3s, the best I could get by working an R-tree virtual table into the mix was ~30s, and that with an index on Name or Name+Position. Without the index... ~60s, not much better than Peng's original indexless query and way more complicated. I think the penalty is in the extra JOIN required - 3 tables instead of 2 - with the speed advantage on the 'between' constraint being swamped by the volume of intermediate rows. Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] What is the most efficient way to gettheclosebynumbers?
"Simon Slavin" ... > >> Here's how I have interpreted Simon's suggested chunky query: >> >> select * from A as A1, A as A2 where A1.name=A2.name and >> A1.position != A2.position and >> A2.chunk between A1.chunk - 1 and A1.chunk + 1 and >> A2.position between A1.position - 10 and A1.position + 10 ; > > That's exactly what I meant. It allows you to reject everything not in > those three chunks immediately, before even working out what 'position - > 10' and 'position + 10' are. But looking again at your formulation with > 'between' in ... > >> select * from A as A1, A as A2 where A1.name=A2.name and >> A1.position != A2.position and >> A2.position between A1.position - 10 and A1.position + 10 ; > > I think that that might be just as fast, and not require you to work out > the chunks, or take up all the space the chunk indexes will take up, or > the time it will take to generate them. So we're back to the same old > song: Try the simpler solution. If it's fast enough, use it. Only if it > isn't fast enough is it worth optimizing. > > By the way, you might find that swapping the last two lines makes it > faster: > > >> select * from A as A1, A as A2 where A1.name=A2.name and >> A2.position between A1.position - 10 and A1.position + 10 and >> A1.position != A2.position ; > > But you might not: SQLite's optimizer may already have spotted it. I did some tests on a 10,000 row table similar to Peng's example with name a random alpha and random position 0-99, distance 10: each query returned 333,450 results. ~62s: Peng's original query with no index, either order of constraints ~4s: Peng's original query with idxNamePos, either order ~2.3s: Jim's 'between' query with idxNamePos, either order ~2.7s: Simon's 'chunky' addition to Jim's with idxNameChunkPos, any order So: 1. Simpler is best, in this case. The 'chunking' option paid a ~16% penalty in speed and added considerable complexity and storage requirement. 2. The order of the where constraints did not seem to matter - I guess the optimiser sorts it out or maybe the random nature of the data offers no advantage to any particular order. Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] What is the most efficient way to gettheclosebynumbers?
"Simon Slavin" ... > > By the way, you might find that swapping the last two lines makes it > faster: > > >> select * from A as A1, A as A2 where A1.name=A2.name and >> A2.position between A1.position - 10 and A1.position + 10 and >> A1.position != A2.position ; > > But you might not: SQLite's optimizer may already have spotted it. > I had opted for the other because it had fewer VM instructions and I thought the != condition might be a faster process than the 'between' and eliminate some points from being processed by 'between'. Of course, the result is highly dependent on the correlation of data - if every point is unique then the != condition is a wasteful test. Thanks for the feedback, Simon & Igor. Sorry, Peng, if it seems I highjacked the discussion. Hope it was helpful to you, too. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] What is the most efficient way to get theclosebynumbers?
"Simon Slavin" wrote ... > > On 21 Aug 2010, at 3:29am, Igor Tandetnik wrote: > >> ve3meo <holden_fam...@sympatico.ca> wrote: >>> If the number of VM instructions is a good inverse indicator of >>> efficiency, >>> then the second query with the index appears to be the most efficient. >>> I'm >>> sure somebody will point out if it is not. >> >> It is not. It may very well take fewer instructions to implement a >> straightforward linear scan than it is to join through the index. > > Yeah. I think your formulation like this: > >> 41 - Jim's query revamped to work: >> select * from A as A1, A as A2 where A1.name=A2.name and >> A1.position != A2.position and >> A2.position between A1.position - 10 and A1.position + 10 ; > > is going to be the fastest. It involves more VM instructions but ends up > searching fewer points. You might want to time it as it is, then add an > index (just for testing) on just (position) and see whether that improves > matters. Thanks for the explanation. I thought I might be naively wishing for too much from the count of VM instructions when some may take much longer to execute than others. I was intrigued by Peng's question and wondered if I might be able to predict efficiency by examining Explain Query Plan and Explain. Apparently not. I understand the principle of minimising the number of points to search and I think that this argues for the WHERE conditions to have the coarse resolution first and the finest resolution last. To pick up on Simon's suggestion to chunk, I added an integer field 'chunk' to the table: UPDATE A SET chunk = (position/10); I indexed a bunch of ways to see what the optimiser would do: create index idxNameChunkPos ON A(name,chunk,position); create index idxNamePos ON A(name,position); create index idxNameChunk ON A(name,chunk); create index idxName ON A(name); create index idxPos ON A(position); create index idxChunk ON A(chunk); Here's how I have interpreted Simon's suggested chunky query: select * from A as A1, A as A2 where A1.name=A2.name and A1.position != A2.position and A2.chunk between A1.chunk - 1 and A1.chunk + 1 and A2.position between A1.position - 10 and A1.position + 10 ; Explain Query Plan shows that this query uses table A2 with idxNameChunkPos. Is this liable to result in a minimum of points searched? But so do Peng's original and re-ordered query use idxNameChunkPos even though Chunk is not in the query. Despite the presence of all these other indexes, Jim's suggested query persists in using idxNamePos, as one would expect it to be the optimal index for that query and for the original and re-ordered. I'm wary of what the optimiser does. On those first two queries, I have seen that it switches from using idxNamePos to idxName after the addition of the Chunk field when only those two indexes were present. I should add that the SQLite manager through which I play with SQLite uses a 3.7.0 DLL. Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] What is the most efficient way to get the close bynumbers?
Without an index, EXPLAIN returns the following number of virtual machine instructions for each query: 41 - original query: select * from A as A1, A as A2 where A1.name=A2.name and abs(A1.position - A2.position) <= 10 and A1.position != A2.position ; 39 - original query re-ordered: select * from A as A1, A as A2 where A1.name=A2.name and A1.position != A2.position and abs(A1.position - A2.position) <= 10 ; 41 - Jim's query revamped to work: select * from A as A1, A as A2 where A1.name=A2.name and A1.position != A2.position and A2.position between A1.position - 10 and A1.position + 10 ; In all cases above, Table A2 is used with an automatic index. create index idxNamePos ON A(name,position); With this index present, the above queries all use it EXPLAIN reports this number of VM instructions, respectively: 32, 30, 36. If the number of VM instructions is a good inverse indicator of efficiency, then the second query with the index appears to be the most efficient. I'm sure somebody will point out if it is not. I didn't include Simon's suggestion as I did not know how to implement it. Tom "Jim Morris"wrote in message news:4c6f0a83.4090...@bearriver.com... > If there is an index on (name, position) the a where like below might > use it. > > A1.name=A2.name and A2.position between( A1.position - 10, A1.position + > 10 ) > > > On 8/20/2010 3:54 PM, Peng Yu wrote: >> Hi, >> >> I have the following code to search for neighboring positions >> (distance<=10). But it is slow for large data set. I'm wondering what >> is the most efficient query for such a search. Note that I don't >> create an index, as I'm not sure what index to create on table A. >> >> $ cat main.sql >> #!/usr/bin/env bash >> >> rm -f main.db >> sqlite3 main.db<> >> create table A (name text, position integer); >> insert into A values('a', 1); >> insert into A values('a', 5); >> insert into A values('a', 21); >> insert into A values('b', 3); >> insert into A values('b', 15); >> insert into A values('b', 19); >> >> .mode column >> .headers on >> .echo on >> select * from A as A1, A as A2 where A1.name=A2.name and >> abs(A1.position - A2.position)<= 10 and A1.position != A2.position; >> >> EOF >> > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] PRAGMA database_list: insert into table?
"Roger Binns" <rog...@rogerbinns.com> wrote in message news:4c4b1ca7.1040...@rogerbinns.com... > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > On 07/24/2010 05:38 AM, ve3meo wrote: >> I am a humble user of the command line implementation and of various >> SQLite >> managers and not an application developer so I cannot do it in code. > > You are coding! The command line tool is merely a thin wrapper around the > library and the managers are a gui equivalent. > > I'd encourage you to try one of the scripting languages out there. They > provide an easy way of adding a little more control flow around the SQL, > variables, better interoperability with files etc. These languages are > easy > to learn and use. > > My personal preference is Python, but you won't go far wrong with Lua, > Tcl, > Perl, PHP, Basic etc. See a long list of languages that have SQLite > available at: > > http://www.sqlite.org/cvstrac/wiki?p=SqliteWrappers I have dabbled years ago with Tcl and even resurrected Quick Basic recently so I could do something - even probably a DOS batch file. Before heading there, I wanted to exhaust the possibility of doing it within a SQL query. That's convenient for me and other users to be able to do that and many other queries from within the same SQLite manager. > >>> Depending on what the information is you are obtaining, it may be >>> possible >>> to get it via direct queries on sqlite_master. >> >> Thanks, I'll investigate further. > > How about telling us what you are trying to achieve (not how you are > trying > to achieve it :-) I use a genealogical software that employs SQLite as its database engine. That software is deficient in some respects and I have developed quite a few SQLite queries that I can run outside the application to examine and modify data in ways that the app cannot. One query aims to extract various properties of the genealogical database that can tell me its characteristics in more detail than the software does and may flag possible problems or areas needing attention. The query compares certain data against that of an attached reference database which may change with a new release of the software. What I want to add to the results of this query are the names of the test database file and the reference file. I was hoping to do that from within the same sql file. Tom > > Roger ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] PRAGMA database_list: insert into table?
"Roger Binns" <rog...@rogerbinns.com> wrote in message news:4c4a5bd5.5010...@rogerbinns.com... > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > On 07/23/2010 07:47 PM, ve3meo wrote: >> Is it possible to store the results of a PRAGMA statement, especially >> PRAGMA >> database_list in a SQLite temporary table using only SQLite commands? > > No. Is there any particular reason your code can't copy them internally? > Remember that SQLite is a library - it lives inside your application - and > is not some remote unchangeable component. I am a humble user of the command line implementation and of various SQLite managers and not an application developer so I cannot do it in code. > Depending on what the information is you are obtaining, it may be possible > to get it via direct queries on sqlite_master. Thanks, I'll investigate further. > > Roger > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] PRAGMA database_list: insert into table?
Is it possible to store the results of a PRAGMA statement, especially PRAGMA database_list in a SQLite temporary table using only SQLite commands? I have tried every combination I can think of without success. Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] wrong output for a select group by to a sub query with acase expression
I replicated your results. ORDERing the results seems to resolve the error but clearly the GROUP BY is otherwise getting confused by the CASE. This works: select categ, count(1) from (select *, (case when a=0 then 0 else cast((a + 25) / 50 + 0.5 as int) end) as categ from test order by a) a group by categ; -- right or this: select categ, count(1) from (select *, (case when a=0 then 0 else cast((a + 25) / 50 + 0.5 as int) end) as categ from test order by categ) a group by categ; -- right and this, w/o either the CASE or the ORDER, except it does not handle the a=0 condition: select cast((a + 25) / 50 + 0.5 as int) as categ, COUNT(1) from test group by categ; -- almost right There seem to be a couple of unused elements in your query and it could be written: select categ, count(1) from (select (case when a=0 then 0 else cast((a + 25) / 50 + 0.5 as int) end) as categ from test order by a) group by categ; -- right Tom "Pedro Pedruzzi"wrote in message news:4b901b1d.8090...@gmail.com... > I'm using sqlite3-3.6.22, downloaded today from sqlite web site. > > Steps to reproduce: > > create table test(a real); > .import bugdata test > > select categ, count(1) from (select *, (case when a=0 then 0 else > cast((a + 25) / 50 + 0.5 as int) end) as categ from test) a group by > categ; > > 1|10 > 1|25 > 3|26 > > The count is ok, but the categ is not supposed to duplicate. > > This next very similar query works ok (with the provided data in > particular the correct results happens to be the same). > > select categ, count(1) from (select *, (case when 0=1 then 0 else > cast((a + 25) / 50 + 0.5 as int) end) as categ from test) a group by > categ; > > 1|10 > 2|25 > 3|26 > > > Here is the bugdata: > > 120.0 > 35.5 > 95.0 > 41.0 > 51.5 > 64.5 > 140.0 > 64.5 > 108.5 > 138.5 > 138.5 > 94.5 > 130.5 > 119.5 > 148.5 > 75.5 > 94.0 > 144.0 > 78.5 > 86.0 > 112.0 > 132.0 > 51.0 > 42.0 > 44.0 > 48.0 > 12.0 > 35.5 > 35.5 > 75.5 > 77.5 > 130.5 > 103.0 > 110.5 > 53.5 > 86.5 > 122.0 > 146.0 > 129.0 > 91.5 > 141.0 > 76.5 > 66.5 > 35.5 > 126.0 > 90.0 > 96.0 > 134.0 > 63.0 > 106.5 > 77.5 > 35.5 > 64.0 > 121.0 > 119.5 > 126.0 > 58.0 > 123.0 > 133.0 > 77.0 > 56.0 > > Regards, > -- Pedro Pedruzzi > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Retrieving column names
"P Kishor"wrote in message news:cdf6db501002270448s2d28ef4fm9a831fdfa7bf3...@mail.gmail.com... > On Sat, Feb 27, 2010 at 6:43 AM, Peter Rodwell > wrote: >> I'm sure this question has been asked and answered a million times, but >> I've not been able >> to find the answer. Googling for it has turned up lots of answers, none >> of which seem to >> work with SQLite: > > > http://www.google.com/search?q=sqlite+column+names > > The top four answers returned all explain how to do the above correctly. > > Use PRAGMA, or SELECT from sqlite_master > Is there a way to use SELECT against the PRAGMA result-set? This returns an error: SELECT * FROM (PRAGMA table_info(tablename)); ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] unexpected behavior with CROSS and LEFT JOIN together
I posted the following and it didn't appear - I probably hit Reply to Sender instead of Reply Group: > Does INDEX sqlite_autoindex_currency_1 contain only pointers to the > identical currencies? > > Tom This little change also works: ON +c1.cur=c2.cur; or ON c1.cur=+c2.cur; Along with LIKE, that seems to hide the condition from being subject to the query optimiser and its use of the autoindex with just the cur field in it. That ON phrase relates to the CROSS JOIN so that effectively you have this: SELECT c1.cur cur1, c2.cur cur2 FROM currency c1 CROSS JOIN currency c2 ON c1.cur = c2.cur; results: cur1 cur2 EUR EUR GBP GBP USD USD Tom Reading Richard's post below, I'm confused. Is the bug that: a) that ON condition should not be picked up by the query optimiser as relating to the CROSS JOIN but to the LEFT JOIN immediately ahead of it? b) should the query optimiser use the autoindex anyway? (because of the CROSS JOIN between c1 and c2, regardless of condition) "D. Richard Hipp" <d...@hwaci.com> wrote in message news:d097a35f-b1ca-4131-b11d-03c11afb1...@hwaci.com... > > On Feb 24, 2010, at 5:20 PM, Pavel Ivanov wrote: > >>> Does INDEX sqlite_autoindex_currency_1 contain only pointers to the >>> identical currencies? >> >> First of all AFAIK 'PRIMARY KEY' implies uniqueness of the column. And >> second: I bet if you execute 'PRAGMA case_sensitive_like = true' then >> plans will be the same. >> See http://www.sqlite.org/optoverview.html#like_opt for details. >> > > Guys: Thanks for all the discussion. But I know what is causing the > problem. I would have already posted the ticket, but I'm having a > little problem with Fossil right this minute and I need to debug that > first... > > The problem is that the query optimizer is using the c1.cur=c2.cur > term together with indices to limit the search to only those rows that > satisfy the condition. That's the right thing to do in most cases, > but not when the expression is on a LEFT JOIN but refers only to > tables to the right of the LEFT JOIN. > > This problem has existed in SQLite forever and has never been seen > before. So it is obscure. The simple fix is to put the ON clause on > the CROSS JOIN where it belongs. Yes, SQLite should still do the > right thing even if the ON is in the wrong place, and I'll fix that > directly. Let me get Fossil running again first, though, please. > > Thanks for reporting the problem and for the analysis. > >> >> Pavel >> >> On Wed, Feb 24, 2010 at 5:15 PM, ve3meo >> <holden_fam...@sympatico.ca> >> wrote: >>> >>> "ve3meo" <holden_fam...@sympatico.ca> wrote >>> in >>> message news:hm47t5$5l...@dough.gmane.org... >>>> >>>> "Igor Tandetnik" <itandet...@mvps.org> wrote in >>>> message news:hm45gu$s5...@dough.gmane.org... >>>>> Mark Brand <mabr...@mabrand.nl> wrote: >>>>>> --Gives unexpected results >>>>>> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate, x.rate) rate >>>>>> FROM currency c1 >>>>>> CROSS JOIN currency c2 >>>>>> LEFT JOIN exchange x >>>>>>ON x.cur1=c1.cur >>>>>>AND x.cur2=c2.cur >>>>>> LEFT JOIN (SELECT 1 rate) self >>>>>>ON c1.cur=c2.cur; >>>>>> >>>>>> /* results >>>>>> >>>>>> EUR|EUR|1 >>>>>> GBP|GBP|1 >>>>>> USD|USD|1 >>>>>> >>>>>> */ >>>>> >>>>> Yes, looks like a bug to me. I see no reason why it shouldn't >>>>> work. LEFT >>>>> JOIN should never produce fewer rows than what left-hand-side table >>>>> contains. >>>>> >>>>> As a workaround, try this instead: >>>>> >>>>> SELECT c1.cur cur1, c2.cur cur2, >>>>>(case when c1.cur=c2.cur then 1 else x.rate) rate >>>>> FROM currency c1 >>>>> CROSS JOIN currency c2 >>>>> LEFT JOIN exchange x >>>>>ON x.cur1=c1.cur >>>>>AND x.cur2=c2.cur; >>>>> >>>>> Igor Tandetnik >>>> >>>> One little change makes it work: >>>> >>>> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate,x.rate) rate >>>> FROM currency c1 >>>> CROSS JOIN currency c2 >>>> LEFT JOIN exchange x >>>>ON x.cur1=c1.cur >>>>AND x.cur2=c2.cur >>>> LEFT JOIN (SELECT 1 ra
Re: [sqlite] unexpected behavior with CROSS and LEFT JOIN together
"ve3meo" <holden_fam...@sympatico.ca> wrote in message news:hm47t5$5l...@dough.gmane.org... > > "Igor Tandetnik" <itandet...@mvps.org> wrote in > message news:hm45gu$s5...@dough.gmane.org... >> Mark Brand <mabr...@mabrand.nl> wrote: >>> --Gives unexpected results >>> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate, x.rate) rate >>> FROM currency c1 >>> CROSS JOIN currency c2 >>> LEFT JOIN exchange x >>>ON x.cur1=c1.cur >>>AND x.cur2=c2.cur >>> LEFT JOIN (SELECT 1 rate) self >>>ON c1.cur=c2.cur; >>> >>> /* results >>> >>> EUR|EUR|1 >>> GBP|GBP|1 >>> USD|USD|1 >>> >>> */ >> >> Yes, looks like a bug to me. I see no reason why it shouldn't work. LEFT >> JOIN should never produce fewer rows than what left-hand-side table >> contains. >> >> As a workaround, try this instead: >> >> SELECT c1.cur cur1, c2.cur cur2, >>(case when c1.cur=c2.cur then 1 else x.rate) rate >> FROM currency c1 >> CROSS JOIN currency c2 >> LEFT JOIN exchange x >>ON x.cur1=c1.cur >>AND x.cur2=c2.cur; >> >> Igor Tandetnik > > One little change makes it work: > > SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate,x.rate) rate > FROM currency c1 > CROSS JOIN currency c2 > LEFT JOIN exchange x >ON x.cur1=c1.cur >AND x.cur2=c2.cur > LEFT JOIN (SELECT 1 rate) self >ON c1.cur LIKE c2.cur; > > Don't ask me why '=' and 'LIKE' should behave differently in this example. > > Tom They do affect the EXPLAIN QUERY PLAN results: "=" order from detail 0 0 TABLE currency AS c1 1 1 TABLE currency AS c2 WITH INDEX sqlite_autoindex_currency_1 2 2 TABLE exchange AS x WITH INDEX sqlite_autoindex_exchange_1 3 3 TABLE AS self "LIKE" order from detail 0 0 TABLE currency AS c1 1 1 TABLE currency AS c2 2 2 TABLE exchange AS x WITH INDEX sqlite_autoindex_exchange_1 3 3 TABLE AS self Does INDEX sqlite_autoindex_currency_1 contain only pointers to the identical currencies? Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] unexpected behavior with CROSS and LEFT JOIN together
"Igor Tandetnik"wrote in message news:hm45gu$s5...@dough.gmane.org... > Mark Brand wrote: >> --Gives unexpected results >> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate, x.rate) rate >> FROM currency c1 >> CROSS JOIN currency c2 >> LEFT JOIN exchange x >>ON x.cur1=c1.cur >>AND x.cur2=c2.cur >> LEFT JOIN (SELECT 1 rate) self >>ON c1.cur=c2.cur; >> >> /* results >> >> EUR|EUR|1 >> GBP|GBP|1 >> USD|USD|1 >> >> */ > > Yes, looks like a bug to me. I see no reason why it shouldn't work. LEFT > JOIN should never produce fewer rows than what left-hand-side table > contains. > > As a workaround, try this instead: > > SELECT c1.cur cur1, c2.cur cur2, >(case when c1.cur=c2.cur then 1 else x.rate) rate > FROM currency c1 > CROSS JOIN currency c2 > LEFT JOIN exchange x >ON x.cur1=c1.cur >AND x.cur2=c2.cur; > > Igor Tandetnik One little change makes it work: SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate,x.rate) rate FROM currency c1 CROSS JOIN currency c2 LEFT JOIN exchange x ON x.cur1=c1.cur AND x.cur2=c2.cur LEFT JOIN (SELECT 1 rate) self ON c1.cur LIKE c2.cur; Don't ask me why '=' and 'LIKE' should behave differently in this example. Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Restricting fast no-result query yields slowno-resultquery
Thank you, thank you, Igor! This relates to the earlier thread I had started titled "Speed regression after 3.6.17" where I was wrestling with a very slow query because the query optimiser was selecting an inappropriate index, one that was very chunky because the field was logical (0,1). This led to officially inappropriate use of the INDEXED BY clause to override the query optimiser or the creation of a temporary table from a query without that field as a constraint so it that would run quickly and the temp table could be queried quickly with the otherwise offending complaint. Your idea of masking the field from the query optimiser with a simple operator works magic. On one test of a query that links the table to itself took 1130 seconds using the bare field as the constraint, 0.291 seconds with the + operator incorporating the field into an expression, an improvement by a factor of 3883!! EXPLAIN QUERY PLAN clearly showed the switch from the inappropriate index to the right one. Great tip! Tom "Igor Tandetnik"wrote in message news:hkh4a0$mk...@ger.gmane.org... > Kelly Jones wrote: >> I have a query that runs very quickly and returns no results: >> >> SELECT * FROM filebackup WHERE sha1='x'; >> >> However, the more restrictive query below runs very slowly, although >> it obviously can't have any results either: >> >> SELECT * FROM filebackup WHERE sha1='x' AND refid=0; >> >> I have indexes on both sha1 (string) and refid (int). Only issue I >> see: this is a large table and refid=0 for 90%+ of the rows. > > For whatever reason, SQLite decides to use an index on refid for this > query. You can suppress it like this: > > SELECT * FROM filebackup WHERE sha1='x' AND +refid=0; > > Igor Tandetnik > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] update on Ticket 3437
You're right! I arrived at a failure of a different kind because I did not drop the Races table as per his script between adding the first record with ID='20' and the second with ID=20. If Races table contains two records, the first with ID='20' and the second with ID=20, his query fails, even though the second record satisfies the condition. Adding DISTINCT somehow resolves this failure; a JOIN ON RaceID=ID assuredly does. If the order is reversed, then his query works. So the WHERE clause looks at the first result from the SELECT; DISTINCT probably re-ordered the results with the Integer value first, the Text value second. I got caught up in this behaviour without noting that his query would have worked had I not skipped dropping the table. (I'm a novice) Tom "Igor Tandetnik"wrote in message news:hjhqq1$ut...@ger.gmane.org... > Tom Holden wrote: >> Nick, I think your query fails because there is a potential for multiple >> values on the right-hand side of WHERE RaceID=. > > No, it fails because 20 != '20' ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQL Crash with sqlite 3.6.22 commandline
That's great! I'm an end-user and cannot compile it to check myself. Thanks to shane, drh, Hub Dog for the rapid fix. I will have to dig around to figure out how/when that shows up in a release. That leaves the question about the change of syntax between 3.5.4 and 3.6.17. In the earlier it was sufficient to override the unavailable collation with a COLLATE clause at the first instance of the field if it is to be later compared in a WHERE clause, e.g.: SELECT Name COLLATE NOCASE FROM AddressTable WHERE Name LIKE '%_'; By release 3.6.17, it was necessary to put the COLLATE clause where the comparison is made, e.g.: SELECT Name FROM AddressTable WHERE Name COLLATE NOCASE LIKE '%_'; 3.5.4 was happy either way. However, if this latter SELECT is a virtual table the subject of another SELECT stmt, then both 3.5.4 and 3.6.17+ throw up an error: "No such collation sequence: x" whereas 3.5.4 is happy if the first statement is the virtual table. For both 3.5.4 and 3.6.17 to be happy, COLLATE NOCASE must be used twice: SELECT * FROM (SELECT Name COLLATE NOCASE FROM AddressTable WHERE Name COLLATE NOCASE LIKE '%_'); OR the collated field must be renamed in the form of the first statement: SELECT * FROM (SELECT Name COLLATE NOCASE AS Nuts FROM AddressTable WHERE Nuts LIKE '%_'); The ORDER BY clause operated differently in 3.5.4, requiring the COLLATE clause within, despite its earlier use in the SELECT expr. Leading to SELECT Name COLLATE NOCASE FROM AddressTable WHERE Name LIKE '%_' ORDER BY Name COLLATE NOCASE; as sufficient for 3.5.4 SELECT Name FROM AddressTable WHERE Name COLLATE NOCASE LIKE '%_' ORDER BY Name COLLATE NOCASE; as necessary for 3.6.17+ and acceptable to 3.5.4 and these, as necessary for both: SELECT * FROM (SELECT Name COLLATE NOCASE FROM AddressTable WHERE Name COLLATE NOCASE LIKE '%_' ORDER BY Name COLLATE NOCASE); -- OR -- SELECT * FROM (SELECT Name COLLATE NOCASE AS Nuts FROM AddressTable WHERE Nuts LIKE '%_' ORDER BY Nuts); So it appears that the later versions are more rigorous than 3.5.4, having a consistent requirement for the collation override in any clause where comparison is made on a field with a missing, application dependent collation sequence, rather than an inferred override from a previous declaration. The trap then is that 'lazy' overrides allowed in 3.5.4 will trigger a "No such collation sequence" in later versions. I think I prefer the 'lazy' override for both WHERE and ORDER BY but perhaps there is good reason to require more explicit COLLATION declarations. Tom "Hub Dog"wrote in message news:4de0b9581001220719u72a3857eq88b674a14d265...@mail.gmail.com... >I can confirm that the crash problem has been fixed by > http://www.sqlite.org/src/info/1258875e07 checked-in . > Now executing following sql will report no such collation sequence: > RMNOCASE > instead of crash. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQL Crash with sqlite 3.6.22 commandline
I just discovered that attachments can be sent through this newsgroup so I have attached a small database with which you should be able to reproduce the problem. The one table in it has a field collated RMNOCASE. The following query produces these results in three different versions of sqlite: 3.5.4 works perfectly 3.6.17 gracefully reports an error - missing RMNOCASE collation 3.6.21 crash SELECT Name COLLATE NOCASE FROM AddressTable WHERE Name LIKE '%_'; 3.5.4 carried the COLLATE NOCASE override at the beginning of the SELECT through to the comparison in the WHEN. The later ones do not, and they 'progress' from reporting an error to a crash. This regression renders queries, on databases having collations unavailable to the sqlite in use, that were developed on older versions of sqlite problematic when run from newer versions. Regards, Tom "D. Richard Hipp"wrote in message news:41371dfd-279f-429d-9186-476efb63e...@hwaci.com... >I am unable to reproduce this problem. Using the script below, with > RMNOCASE changed to just NOCASE, everything works fine on the SQLite > command-line shell on the website on Linux. I also tried various > other versions of SQLite with the same result. > > > On Jan 21, 2010, at 8:00 AM, Hub Dog wrote: > >> I hava a table. The table schema is >> >> CREATE TABLE AddressTable >> ( >> AddressID INTEGER PRIMARY KEY , >> AddressType INTEGER , >> Name TEXT COLLATE RMNOCASE , >> Street1 TEXT , >> Street2 TEXT , >> City TEXT , >> State TEXT , >> Zip TEXT , >> Country TEXT , >> Phone1 TEXT , >> Phone2 TEXT , >> Fax TEXT , >> Email TEXT , >> URL TEXT , >> Latitude INTEGER , >> Longitude INTEGER , >> Note BLOB >> ) ;. >> >> if I execute following sql to query data , the sqlite 3.6.22 command >> line >> downloaded from www.sqlite.org will crash. >> >> SELECT >> Adr.Name COLLATE NOCASE AS AddressName >> FROM >> AddressTable AS Adr >> WHERE >> Adr.Name LIKE '%_'. >> >> if I change the Adr.Name to AddressName , the sql execute result is >> ok. >> >> SELECT >> Adr.Name COLLATE NOCASE AS AddressName >> FROM >> AddressTable AS Adr >> WHERE >> AddressName LIKE '%_' ; >> >> it seems the crash was related with the collate RMNOCASE of >> AddressTable >> table's field Name. >> in default sqlite command line, there is no rmnocase collation. so I >> mapped >> it to the default nocase collation. >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > D. Richard Hipp > d...@hwaci.com > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > begin 666 Address.db3
Re: [sqlite] Sqlite3 Optimization Question
Uhh, sounds like a search function of a text editor or a simple routine you could write (or find) in your development language. I can't see the magic of using sqlite. Tom "Michael Thomason"wrote in message news:3cbb39411001220619j1c6cc8f5x46e0aeb4d0b91...@mail.gmail.com... Thank you. I like that answer and I'll give it a try. The application is a word game, and I have a dictionary of words. The user enters a word, and the application checks against that word to see if it exist. That's the only query ever used. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] SQL Crash with sqlite 3.6.22 commandline
I originally experienced the problem using a couple of Windows sqlite managers. It seems that ones using the latest few versions of sqlite have the problem, variously reported as: "Access violation at address x in module . Read of address 0005", where x is dependent on the application. Of the three that I tested that had this crash, two reported it thusly, one threw up an unhandled Win32 exception and croaked - they are all, I believe using 3.6.21 or 22. Running the sqlite.exe command line version 3.6.22 throws up the unhandled Win32 exception. Running the 3.6.17 release gracefully reports "SQL error near line 1: no such collation sequence: RMNOCASE". The errors and crashes are from the following query: SELECT Name COLLATE NOCASE FROM AddressTable WHERE Name LIKE '%_'; This revised query works on all versions listed above: SELECT Name COLLATE NOCASE AS NewName FROM AddressTable WHERE NewName LIKE '%_' ; as does: SELECT Name FROM AddressTable WHERE Name COLLATE NOCASE LIKE '%_'; I can forward you a sample database but I do not have access to the RMNOCASE collation. Tom "D. Richard Hipp"wrote in message news:41371dfd-279f-429d-9186-476efb63e...@hwaci.com... >I am unable to reproduce this problem. Using the script below, with > RMNOCASE changed to just NOCASE, everything works fine on the SQLite > command-line shell on the website on Linux. I also tried various > other versions of SQLite with the same result. > > > On Jan 21, 2010, at 8:00 AM, Hub Dog wrote: > >> I hava a table. The table schema is >> >> CREATE TABLE AddressTable >> ( >> AddressID INTEGER PRIMARY KEY , >> AddressType INTEGER , >> Name TEXT COLLATE RMNOCASE , >> Street1 TEXT , >> Street2 TEXT , >> City TEXT , >> State TEXT , >> Zip TEXT , >> Country TEXT , >> Phone1 TEXT , >> Phone2 TEXT , >> Fax TEXT , >> Email TEXT , >> URL TEXT , >> Latitude INTEGER , >> Longitude INTEGER , >> Note BLOB >> ) ;. >> >> if I execute following sql to query data , the sqlite 3.6.22 command >> line >> downloaded from www.sqlite.org will crash. >> >> SELECT >> Adr.Name COLLATE NOCASE AS AddressName >> FROM >> AddressTable AS Adr >> WHERE >> Adr.Name LIKE '%_'. >> >> if I change the Adr.Name to AddressName , the sql execute result is >> ok. >> >> SELECT >> Adr.Name COLLATE NOCASE AS AddressName >> FROM >> AddressTable AS Adr >> WHERE >> AddressName LIKE '%_' ; >> >> it seems the crash was related with the collate RMNOCASE of >> AddressTable >> table's field Name. >> in default sqlite command line, there is no rmnocase collation. so I >> mapped >> it to the default nocase collation. >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > D. Richard Hipp > d...@hwaci.com > > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Speed regression after 3.6.17
"Max Vlasov"wrote in message news:7cb963ca1001130315o69235717n92393be027eef...@mail.gmail.com... > > >> After much playing about, I have determined that it is necessary to >> violate >> the guidelines on not using INDEXED BY and NOT INDEXED in an attempt to >> optimise performance. > > > Although you're very determined about your conclusions, I saw a > misunderstanding about INDEXED BY in your statements. > The docs say "The INDEXED BY clause is *not* intended for use in tuning > the > preformance of a query". From my point of you one should think about these > extensions as sqlite's sql equivalent of assertions from high-level > languages. Max, thanks for your follow up. I think we understand the guideline the same way. It says that one should not use INDEXED BY for tuning a query. I had to use INDEXED BY and NOT INDEXED to tune my query because the query optimiser mal-tuned it by ~70:1. After 3.6.17, the query optimiser tuned even worse - to ~1000:1. Operating on ANALYZE stats took what was the fastest result with a couple of INDEXED BY clauses down to ~70:1. The only way I could get the fastest result across different sqlite versions and with/without ANALYZE stats was to do the very opposite of what the docs say - i.e., use INDEXED clauses liberally to restrict what the query optimiser can do to screw up my query. I don't understand your last sentence as I am not a programmer and have no experience of working with sql assertions. > For example, a quick test. > I have a table Contacts with an index ids_Contacts using field CenterId. > If I try to execute > > SELECT * FROM Contacts INDEXED BY idx_Contacts > > sqlite says: "cannot use index: idx_Contacts". > > Only when a change made adding explicit ORDER BY clause applied > > SELECT * FROM Contacts INDEXED BY idx_Contacts ORDER BY CenterId > > ... the query executes successfully. That's odd. Using a SQLite manager with sqlite 3.6.21, both lines give the same error on one of my tables with 5 indexes. SELECT Surname FROM NameTable INDEXED BY idxSurname SELECT Surname FROM NameTable INDEXED BY idxSurname ORDER BY Surname COLLATE NOCASE And EXPLAIN QUERY PLAN indicates that the query optimiser used no index in both cases when the INDEXED BY clause was deleted. > If I remove now INDEXED BY from the latter statement nothing will change > in > how the query is executed, _only_ if someone excplicitely deletes the > index > from the database. Assuming that this was indeed a multi-table query that worked, the query can only execute identically with/without the INDEXED clauses provided the query optimiser consistently used the same indexes. And, yes, it would throw up an error if the INDEXED BY index had been dropped. > So any of your statements that argues that adding or removing INDEXED BY > affects the way your queries is executed is have to be double checked. Isn't that what I demonstrated with the comparative results of EXPLAIN QUERY PLAN? The optimiser chose inappropriate indexes when allowed to do so. Your statement assumes that the query optimiser chose the same indexes as the ones I chose in iteratively tuning for fastest speed. > If you're still sure that you found a major flaw in the most deployed sql > database in the world, please narrow your tests and conclusions to > something > easier reproducible. Everyone understands that you're solving one > particular > problem related to you, but if you find time to make things more simple, > it > will be to everyone's benefit. Max, I don't know that I have found a major flaw in the database engine or that I have thrown up a very exceptional database structure that its optimiser can't properly deal with. I most certainly had to do the opposite of what the online docs said about the use of INDEXED clauses. My tests are easily reproducible if you care to let me send you the database and queries. The SQLite manager developer who led me to investigate the change in performance around 3.6.18 of my early query that lacked INDEXED clauses has also offered the database in the thread he started titled "Performance regression". I don't know how I can make things more simple - I'm not a programmer and my interest in sqlite is pretty narrow. I am hoping that bringing my problem and observations to this forum will elicit the necessary interest from the sqlite developer community as to why it was necessary for me to violate their very clear instruction that one should not use INDEXED clauses to tune a query. That instruction can only be correct if, and only if, the query optimiser can be relied on to tune the query optimally. In this case, it clearly does not. Thanks to the feedback so far, I have learned how to use EXPLAIN and that I was mistaken to initially report this as a speed regression. Rather it turns out to be that query optimisation has changed after 3.6.17, that there is at least one database structure for which the query
Re: [sqlite] Speed regression after 3.6.17
"ve3meo" <holden_fam...@sympatico.ca> wrote in message news:hiivpn$7f...@ger.gmane.org... > > "Simon Slavin" <slav...@bigfraud.org> wrote in > message > news:ad11c649-d23a-46ff-8545-85ffb5219...@bigfraud.org... >> >> The database structure has not changed. It is optimisation of a >> particular kind of query which seems to be working differently -- and not >> as well. Unfortunately, the example supplied is extremely complicated >> and it's not easy to tell which aspect of the query is at fault. > > Sorry. What I meant: is it possible that it is just the particular > structure of my database that is susceptible to this performance > regression? > > I agree that the optimisation strategy has changed for the better between > 3.6.17 and 3.6.20 for the unANALYZED unINDEXED query. > > I postulate that the optimisation strategy is defective for the ANALYZED, > INDEXED query for both 3.6.17 and 3.6.20 while 3.5.4 throws up a false > syntax error. > [snip] > Thanks for any follow-up. Is this the right way to flag a problem for the > SQlite developers? > After much playing about, I have determined that it is necessary to violate the guidelines on not using INDEXED BY and NOT INDEXED in an attempt to optimise performance. This would seem to be the case in a database where there are tables being queried that have one or more CREATEd INDEXes that could lead the SQLite query optimiser to choose an inappropriate index. The optimiser could thus direct the query to be processed extremely sub-optimally, as has been demonstrated by my database and queries where a 1000:1 ratio in execution time has been experienced. The longest times were for those queries in which the optimiser had total freedom to pick the order that tables were processed and which indexes were used. The shortest were those in which: a) at least some explicit INDEXED BY clauses were added and the optimiser lucked out on the rest BEFORE ANALYZE was run, or b) AFTER ANALYZE was run, NOT INDEXED clauses were added to prevent the optimiser from using anything other than the PRIMARY KEY, and more INDEXED BY clauses were added to override the optimiser's now inappropriate selections. Post 3.6.17, it is clear that the query optimiser degraded performance extremely sub-optimally for this database in its ante-ANALYZED state. For the ANALYZED state, the query optimiser degraded the performance of queries with partial INDEX control for 3.6.17 and post 3.6.17. That says to me, "THE QUERY OPTIMISER IS NOT RELIABLE". 3.5.4 returned errors near "INDEXED" because INDEXED BY and NOT INDEXED extensions were only added at 3.6.3 in late 2008. Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Speed regression after 3.6.17
"Simon Slavin"wrote in message news:ad11c649-d23a-46ff-8545-85ffb5219...@bigfraud.org... > > The database structure has not changed. It is optimisation of a > particular kind of query which seems to be working differently -- and not > as well. Unfortunately, the example supplied is extremely complicated and > it's not easy to tell which aspect of the query is at fault. Sorry. What I meant: is it possible that it is just the particular structure of my database that is susceptible to this performance regression? I agree that the optimisation strategy has changed for the better between 3.6.17 and 3.6.20 for the unANALYZED unINDEXED query. I postulate that the optimisation strategy is defective for the ANALYZED, INDEXED query for both 3.6.17 and 3.6.20 while 3.5.4 throws up a false syntax error. The example I've given is a UNION of five SELECTs but it is the first two that have been the most problematic. As they are virtually identical, let me repeat the EXPLAIN QUERY PLAN results, first for the always slow, not deliberately indexed query and then for the once fast, explicitly indexed query. All the index files are predefined by the genealogy database application software. The full query returns ~52,000 rows in the approx. time shown for each SQLite release and condition. Some variation is to be expected from varying concurrent processes but, generally, sqlite consumes ~98% of CPU, most of the time without disk thrashing. -- THE UNINDEXED QUERY SELECT FactTypeTable.Name COLLATE NOCASE AS Fact, 'Principal' AS 'Role Type', NameTable1.OwnerID AS RIN, NameTable1.Surname COLLATE NOCASE AS Surname, NameTable1.Suffix COLLATE NOCASE AS Suffix, NameTable1.Prefix COLLATE NOCASE AS Prefix, NameTable1.Given COLLATE NOCASE AS 'Given Name', NameTable2.OwnerID AS 'Sharer RIN', NameTable2.Surname COLLATE NOCASE AS 'Sharer Surname', NameTable2.Suffix COLLATE NOCASE AS 'Sharer Suffix', NameTable2.Prefix COLLATE NOCASE AS 'Sharer Prefix', NameTable2.Given COLLATE NOCASE AS 'Sharer Given Name', COUNT(1) AS Count FROM EventTable INNER JOIN FactTypeTable ON EventTable.EventType = FactTypeTable.FactTypeID INNER JOIN FamilyTable ON EventTable.OwnerID = FamilyTable.FamilyID INNER JOIN NameTable AS NameTable1 ON FamilyTable.FatherID = NameTable1.OwnerID INNER JOIN NameTable AS NameTable2 ON FamilyTable.MotherID = NameTable2.OwnerID WHERE EventTable.OwnerType = 1 AND NameTable1.IsPrimary = 1 AND NameTable2.IsPrimary = 1 GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 . . ORDER BY RIN RESULTS BEFORE ANALYZE SQLite 3.5.4: ~250s EXPLAIN QUERY PLAN results BEFORE ANALYZE "order", "from", "detail" "0", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary" "1", "2", "TABLE FamilyTable WITH INDEX idxFamilyFatherID" "2", "0", "TABLE EventTable WITH INDEX idxOwnerDate" "3", "1", "TABLE FactTypeTable USING PRIMARY KEY" "4", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary" SQLite 3.6.17: ~250s cf 190s after ANALYZE EXPLAIN QUERY PLAN results BEFORE ANALYZE order,from,detail 0,3,TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary, 1,2,TABLE FamilyTable WITH INDEX idxFamilyFatherID, 2,0,TABLE EventTable WITH INDEX idxOwnerDate, 3,1,TABLE FactTypeTable USING PRIMARY KEY, 4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary, SQLite 3.6.20: ~2500s cf ~165s after ANALYZE EXPLAIN QUERY PLAN results BEFORE ANALYZE "order","from","detail" "0","3","TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary" "1","4","TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary" "2","2","TABLE FamilyTable WITH INDEX idxFamilyMotherID" "3","0","TABLE EventTable WITH INDEX idxOwnerDate" "4","1","TABLE FactTypeTable USING PRIMARY KEY" RESULTS AFTER ANALYZE SQLite 3.5.4: ~198s cf ~250s before ANALYZE EXPLAIN QUERY PLAN results AFTER ANALYZE "order", "from", "detail" "0", "1", "TABLE FactTypeTable" "1", "2", "TABLE FamilyTable" "2", "0", "TABLE EventTable WITH INDEX idxOwnerEvent" "3", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID" "4", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID" SQLite 3.6.17: ~190s cf ~250s before ANALYZE EXPLAIN QUERY PLAN results AFTER ANALYZE order,from,detail 0,1,TABLE FactTypeTable, 1,2,TABLE FamilyTable, 2,0,TABLE EventTable WITH INDEX idxOwnerEvent, 3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID, 4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID, SQLite 3.6.20: ~165s cf ~2500s before ANALYZE EXPLAIN QUERY PLAN results AFTER ANALYZE "order","from","detail" "0","1","TABLE FactTypeTable" "1","2","TABLE FamilyTable" "2","0","TABLE EventTable WITH INDEX idxOwnerEvent" "3","3","TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID" "4","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID" - THE INDEXED QUERY SELECT FactTypeTable.Name COLLATE NOCASE AS Fact, 'Principal' AS 'Role Type', NameTable1.OwnerID AS RIN,
Re: [sqlite] Speed regression after 3.6.17
"Max Vlasov"wrote in message news:7cb963ca1001120149u550da7fr56cfc0ed261a9...@mail.gmail.com... > ... Are you sure you have > identical tests for your comparison? I mean these are the same data sets > with the same scheme? Absolutely, errr, to the best of my knowledge. Same database, same query, but different SQLite managers having different versions of SQLite. But the fact that the SQLite manager is different is immaterial - the developer of one of them was the one to identify that it is the SQLite version that is the controlling variable: 3.6.17 and somewhat earlier - better; 3.6.18 and later - worse. The volatility of SQLite performance over these version changes and with the use of ANALYZE is very disquieting for me. The database structure is that of a popular genealogy software. Having run ANALYZE, now not only has my fastest query been degraded by 70:1, so has the speed of the 'owner' software in generating its reports. Attention needs to be paid to what changed between 3.6.17 and 3.6.18 that has adversely affected performance and why is it that ANALYZE is deleterious. Is it just this database structure that has been affected? Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Speed regression after 3.6.17
"D. Richard Hipp" <d...@hwaci.com> wrote in message news:4c2248d2-898d-41ec-81fc-bafdb726c...@hwaci.com... > > On Jan 11, 2010, at 9:28 PM, ve3meo wrote: >> Oddly enough, by revising the query to explicitly >> use an index, the two later releases are much faster... > > Did you run ANALYZE before you tried using explicit indices? > No. You have introduced me to it. So I tried using a SQLite manager with the 3.6.17 release. Shaved 25% off the unindexed (SQLite optimised only) query time to ~190s. Multiplied the previously fast explicitly indexed query from 3s to 210s. I was happy with the 3s. Here are the EXPLAIN QUERY PLAN results before and after ANALYZE: Before: order,from,detail 0,0,TABLE EventTable, 1,1,TABLE FactTypeTable USING PRIMARY KEY, 2,2,TABLE FamilyTable USING PRIMARY KEY, 3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID, 4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID, 0,0,TABLE EventTable, 1,1,TABLE FactTypeTable USING PRIMARY KEY, 2,2,TABLE FamilyTable USING PRIMARY KEY, 3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID, 4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID, 0,2,TABLE NameTable WITH INDEX idxNamePrimary, 1,0,TABLE EventTable WITH INDEX idxOwnerDate, 2,1,TABLE FactTypeTable USING PRIMARY KEY, 0,0,TABLE NameTable WITH INDEX idxNamePrimary, 0,0,TABLE WitnessTable, 1,1,TABLE roletable USING PRIMARY KEY, 2,2,TABLE EventTable USING PRIMARY KEY, 3,5,TABLE FactTypeTable USING PRIMARY KEY, 4,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID, 5,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID, After: order,from,detail 0,1,TABLE FactTypeTable, 1,2,TABLE FamilyTable, 2,0,TABLE EventTable WITH INDEX idxOwnerEvent, 3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID, 4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID, 0,1,TABLE FactTypeTable, 1,2,TABLE FamilyTable, 2,0,TABLE EventTable WITH INDEX idxOwnerEvent, 3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID, 4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID, 0,1,TABLE FactTypeTable, 1,2,TABLE NameTable WITH INDEX idxNamePrimary, 2,0,TABLE EventTable WITH INDEX idxOwnerEvent, 0,0,TABLE NameTable WITH INDEX idxNamePrimary, 0,1,TABLE roletable, 1,5,TABLE FactTypeTable, 2,2,TABLE EventTable, 3,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID, 4,0,TABLE WitnessTable WITH INDEX idxWitnessEventID, 5,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID, Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Speed regression after 3.6.17
Max Vlasovwrites: > > On Mon, Jan 11, 2010 at 2:33 AM, Tom Holden wrote: > > > I have been trying out a number of SQLite managers, one test being the > > execution time for the same query on the same database on the same computer. > > The scattering of results was very surprising with a spread on the order of > > 10:1. > > > > Is the message posted 11 hours ago about the same issue? ( > http://www.mail-archive.com/sqlite-users-CzDROfG0BjIdnm+yROfE0A public.gmane.org/msg49650.html) > Anyway I still suggest the same (see in the thread) - compare VDBE code > sequences > > Max Yes, that is the developer of one of the SQLite managers who posted the earlier message re "Performance regression...". I am not a developer, merely a new user of applications that use SQLite who is delving into the application database. I compared the EXPLAIN QUERY PLAN results for SQLite 3.5.4, 3.6.17 and 3.6.20. The first two are identical; the third is quite different. I don't quite know what that means; maybe it indicates that the later version joins the tables less efficiently. Oddly enough, by revising the query to explicitly use an index, the two later releases are much faster than for the unindexed query while the first returns an error near 'INDEXED'. Here are the results of the EXPLAIN QUERY PLAN for the original, unindexed query: For 3.5.4 and 3.6.17 which executed in ~240s: "order", "from", "detail" "0", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary" "1", "2", "TABLE FamilyTable WITH INDEX idxFamilyFatherID" "2", "0", "TABLE EventTable WITH INDEX idxOwnerDate" "3", "1", "TABLE FactTypeTable USING PRIMARY KEY" "4", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary" "0", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary" "1", "2", "TABLE FamilyTable WITH INDEX idxFamilyFatherID" "2", "0", "TABLE EventTable WITH INDEX idxOwnerDate" "3", "1", "TABLE FactTypeTable USING PRIMARY KEY" "4", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary" "0", "2", "TABLE NameTable WITH INDEX idxNamePrimary" "1", "0", "TABLE EventTable WITH INDEX idxOwnerDate" "2", "1", "TABLE FactTypeTable USING PRIMARY KEY" "0", "0", "TABLE NameTable WITH INDEX idxNamePrimary" "0", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary" "1", "0", "TABLE WitnessTable WITH INDEX idxWitnessPersonID" "2", "1", "TABLE roletable USING PRIMARY KEY" "3", "2", "TABLE EventTable USING PRIMARY KEY" "4", "5", "TABLE FactTypeTable USING PRIMARY KEY" "5", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID" And for 3.6.20 which executed the same query in ~2500s: "order","from","detail" "0","3","TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary" "1","4","TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary" "2","2","TABLE FamilyTable WITH INDEX idxFamilyMotherID" "3","0","TABLE EventTable WITH INDEX idxOwnerDate" "4","1","TABLE FactTypeTable USING PRIMARY KEY" "0","3","TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary" "1","4","TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary" "2","2","TABLE FamilyTable WITH INDEX idxFamilyMotherID" "3","0","TABLE EventTable WITH INDEX idxOwnerDate" "4","1","TABLE FactTypeTable USING PRIMARY KEY" "0","2","TABLE NameTable WITH INDEX idxNamePrimary" "1","0","TABLE EventTable WITH INDEX idxOwnerDate" "2","1","TABLE FactTypeTable USING PRIMARY KEY" "0","0","TABLE NameTable WITH INDEX idxNamePrimary" "0","3","TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary" "1","0","TABLE WitnessTable WITH INDEX idxWitnessPersonID" "2","1","TABLE roletable USING PRIMARY KEY" "3","2","TABLE EventTable USING PRIMARY KEY" "4","5","TABLE FactTypeTable USING PRIMARY KEY" "5","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID" Here are the results for the indexed query, identical for 3.6.17 and 3.6.20 but an error with 3.5.4 so no result to compare, executes in ~2s: "order","from","detail" "0","0","TABLE EventTable" "1","1","TABLE FactTypeTable USING PRIMARY KEY" "2","2","TABLE FamilyTable USING PRIMARY KEY" "3","3","TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID" "4","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID" "0","0","TABLE EventTable" "1","1","TABLE FactTypeTable USING PRIMARY KEY" "2","2","TABLE FamilyTable USING PRIMARY KEY" "3","3","TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID" "4","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID" "0","2","TABLE NameTable WITH INDEX idxNamePrimary" "1","0","TABLE EventTable WITH INDEX idxOwnerDate" "2","1","TABLE FactTypeTable USING PRIMARY KEY" "0","0","TABLE NameTable WITH INDEX idxNamePrimary" "0","0","TABLE WitnessTable" "1","1","TABLE roletable USING PRIMARY KEY" "2","2","TABLE EventTable USING PRIMARY KEY" "3","5","TABLE FactTypeTable USING PRIMARY KEY" "4","3","TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID" "5","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID" The results for these