Re: [sqlite] WHERE expression with operators from text functions?

2014-10-18 Thread ve3meo
I tried your query tonight, Igor, and it worked, not that I understand how... 
the recursive query syntax is still a mystery to me and the many NOTs are a
challenge to interpret. Compared to the FTS solution, I observe some
preliminary differences in performance that are interesting. 

The content being searched is made up from many tables. For the FTS search
it is all inserted into the FTS4 virtual table which automatically builds
the FTS index. For your recursive search, I create a VIEW of the same
content. With larger databases, the time taken to build the FTS virtual
table grows much faster than the VIEW. On the other hand, the time taken to
search grows much faster for the recursive search than for the FTS. The
search expression had three terms ANDed (a AND b AND c).

I was delighted with having NOT, AND, OR operators in FTS among other
capabilities. To incorporate all three in the recursive search is
non-obvious.

Tom



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/WHERE-expression-with-operators-from-text-functions-tp78653p78700.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WHERE expression with operators from text functions?

2014-10-17 Thread ve3meo
Igor, that is a most oblique and intriguing approach. I will try it out and
try to get my head around it in the next day or so.

Thanks,
Tom


Igor Tandetnik-2 wrote
> On 10/16/2014 12:03 PM, Tom Holden wrote:
>> I need a way to convert the text result to an expression that WHERE
>> evaluates as an expression.
>>
>> Any possibility to do this within SQLite?
> 
> with recursive split(str, tail) as (
>select null, 'string1+string2+string3'
> union all
>select substr(tail, 1, instr(tail || '+', '+')-1), substr(tail, 
> instr(tail || '+', '+') + 1)
>from split where tail != ''
> )
> select * from mytable where not exists (
>select str from split
>where str is not null and value not like '%' || str || '%'
> );
> 
> -- 
> Igor Tandetnik
> 
> ___
> sqlite-users mailing list

> sqlite-users@

> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users





--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/WHERE-expression-with-operators-from-text-functions-tp78653p78697.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] updating using a value from another table

2013-10-07 Thread ve3meo
Igor Tandetnik-2 wrote
> Or alternatively, without a WHERE clause:
> 
> update a set i = coalesce((select i from b where b.a = a.a), i);

What I have used similar to this is: 

UPDATE a SET i = ifnull((select i from b where b.a = a.a), i);

Tom



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/updating-using-a-value-from-another-table-tp71588p71607.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] 5000 tables with 3000 records vs 1 table with 15000000records

2010-08-22 Thread ve3meo

"Mike Zang" ...
> In fact, the file is stock data and one file is for one stock, so they are
> all in the same format.
>
> Then, in most case, only one stock should be selected.
>
Then, from a 'disk' operating system and memory management perspective, 
wouldn't one file per stock continue to be faster and more efficient than 
having all the stocks in one big file? A master database with pointers to 
each of the stock databases, ATTACH as many as you want to view at one time, 
maybe combine data into temporary table(s), create and drop indexes on the 
fly thus keeping storage at a minimum. 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get the close bynumbers?

2010-08-22 Thread ve3meo
"Peng Yu" ...
>> I think the penalty is in the extra JOIN required - 3 tables instead of 
>> 2 -
>> with the speed advantage on the 'between' constraint being swamped by the
>> volume of intermediate rows.
>
> I don't quite understand why there are 3 tables with R-tree. Would you
> please show me what query you used? Thank you very much.

For every row in table A1, look up in the R-Tree table A_X those rows whose 
xmin and xmax contain the position value, then look up in table A2 the 
rowids corresponding to A_X.id and discard those whose name does not match 
A1.name and those whose position is the same. 3 tables, 2 JOINs.

SELECT A1.Name, A1.position, A_X.id, A_X.xmin, A_X.xmax, A2.Name, 
A2.position FROM A AS A1, A_X, A AS A2
WHERE
A1.position != A2.position and
A1.Name = A2.Name and
A2.id = A_X.id and
A1.position between A_X.xmin and A_X.xmax
LIMIT 40;

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get the close by numbers?

2010-08-21 Thread ve3meo
"Eric Smith" ...
> I haven't used it myself, but I'm pretty sure this is what the R*tree
> module was designed for:

I have not used it either but was intrigued by your suggestion. Looking into 
it, my sense was that it would be advantageous for a 2-dimension or more 
search and I think this was borne out by my experiments, albeit with my very 
limited knowledge and experience. I hasten to add that maybe my queries were 
not the best designed.

I created an R-tree virtual table with the minimum number of columns: id, 
xmin, xmax and populated it with the rowid from Peng's table A and 
position-10 in xmin and position+10 in xmax. Thus any value lying between 
xmax and xmin is no more than a distance of 10 from the point pointed to in 
Table A by id. Indeed, a simple select on a single value between xmax and 
xmin returned rows from the virtual table faster than the correspondingly 
simple select on the real table with an index on position (something like 
6ms vs. 10 ms).

However, on the 10,000 row test table on which I reported earlier that Jim's 
'between' query was fastest at ~2.3s, the best I could get by working an 
R-tree virtual table into the mix was ~30s, and that with an index on Name 
or Name+Position. Without the index... ~60s, not much better than Peng's 
original indexless query and way more complicated.

I think the penalty is in the extra JOIN required - 3 tables instead of 2 - 
with the speed advantage on the 'between' constraint being swamped by the 
volume of intermediate rows.

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to gettheclosebynumbers?

2010-08-21 Thread ve3meo
"Simon Slavin" ...
>
>> Here's how I have interpreted Simon's suggested chunky query:
>>
>> select * from A as A1, A as A2 where A1.name=A2.name and
>> A1.position != A2.position and
>> A2.chunk between A1.chunk - 1 and A1.chunk + 1 and
>> A2.position between A1.position - 10 and A1.position + 10 ;
>
> That's exactly what I meant.  It allows you to reject everything not in 
> those three chunks immediately, before even working out what 'position - 
> 10' and 'position + 10' are.  But looking again at your formulation with 
> 'between' in ...
>
>> select * from A as A1, A as A2 where A1.name=A2.name and
>> A1.position != A2.position and
>> A2.position between A1.position - 10 and A1.position + 10 ;
>
> I think that that might be just as fast, and not require you to work out 
> the chunks, or take up all the space the chunk indexes will take up, or 
> the time it will take to generate them.  So we're back to the same old 
> song: Try the simpler solution.  If it's fast enough, use it.  Only if it 
> isn't fast enough is it worth optimizing.
>
> By the way, you might find that swapping the last two lines makes it 
> faster:
>
>
>> select * from A as A1, A as A2 where A1.name=A2.name and
>> A2.position between A1.position - 10 and A1.position + 10 and
>> A1.position != A2.position ;
>
> But you might not: SQLite's optimizer may already have spotted it.

I did some tests on a 10,000 row table similar to Peng's example with name a 
random alpha and random position 0-99, distance 10: each query returned 
333,450 results.

~62s: Peng's original query with no index, either order of constraints
~4s: Peng's original query with idxNamePos, either order
~2.3s: Jim's 'between' query with idxNamePos, either order
~2.7s: Simon's 'chunky' addition to Jim's with idxNameChunkPos, any order

So:
1. Simpler is best, in this case. The 'chunking' option paid a ~16% penalty 
in speed and added considerable complexity and storage requirement.
2. The order of the where constraints did not seem to matter - I guess the 
optimiser sorts it out or maybe the random nature of the data offers no 
advantage to any particular order.

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to gettheclosebynumbers?

2010-08-21 Thread ve3meo

"Simon Slavin" ...
>
> By the way, you might find that swapping the last two lines makes it 
> faster:
>
>
>> select * from A as A1, A as A2 where A1.name=A2.name and
>> A2.position between A1.position - 10 and A1.position + 10 and
>> A1.position != A2.position ;
>
> But you might not: SQLite's optimizer may already have spotted it.
>
I had opted for the other because it had fewer VM instructions and I thought 
the != condition might be a faster process than the 'between' and eliminate 
some points from being processed by 'between'. Of course, the result is 
highly dependent on the correlation of data - if every point is unique then 
the != condition is a wasteful test.

Thanks for the feedback, Simon & Igor.

Sorry, Peng, if it seems I highjacked the discussion. Hope it was helpful to 
you, too. 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get theclosebynumbers?

2010-08-20 Thread ve3meo
"Simon Slavin"  wrote ...
>
> On 21 Aug 2010, at 3:29am, Igor Tandetnik wrote:
>
>> ve3meo <holden_fam...@sympatico.ca> wrote:
>>> If the number of VM instructions is a good inverse indicator of 
>>> efficiency,
>>> then the second query with the index appears to be the most efficient. 
>>> I'm
>>> sure somebody will point out if it is not.
>>
>> It is not. It may very well take fewer instructions to implement a 
>> straightforward linear scan than it is to join through the index.
>
> Yeah.  I think your formulation like this:
>
>> 41 - Jim's query revamped to work:
>> select * from A as A1, A as A2 where A1.name=A2.name and
>> A1.position != A2.position and
>> A2.position between A1.position - 10 and A1.position + 10 ;
>
> is going to be the fastest.  It involves more VM instructions but ends up 
> searching fewer points.  You might want to time it as it is, then add an 
> index (just for testing) on just (position) and see whether that improves 
> matters.

Thanks for the explanation. I thought I might be naively wishing for too 
much from the count of VM instructions when some may take much longer to 
execute than others. I was intrigued by Peng's question and wondered if I 
might be able to predict efficiency by examining Explain Query Plan and 
Explain. Apparently not.

I understand the principle of minimising the number of points to search and 
I think that this argues for the WHERE conditions to have the coarse 
resolution first and the finest resolution last. To pick up on Simon's 
suggestion to chunk, I added an integer field 'chunk' to the table:

UPDATE A SET chunk = (position/10);

I indexed a bunch of ways to see what the optimiser would do:

create index idxNameChunkPos ON A(name,chunk,position);
create index idxNamePos ON A(name,position);
create index idxNameChunk ON A(name,chunk);
create index idxName ON A(name);
create index idxPos ON A(position);
create index idxChunk ON A(chunk);


Here's how I have interpreted Simon's suggested chunky query:

select * from A as A1, A as A2 where A1.name=A2.name and
A1.position != A2.position and
A2.chunk between A1.chunk - 1 and A1.chunk + 1 and
A2.position between A1.position - 10 and A1.position + 10 ;

Explain Query Plan shows that this query uses table A2 with idxNameChunkPos. 
Is this liable to result in a minimum of points searched?

But so do Peng's original and re-ordered query use idxNameChunkPos even 
though Chunk is not in the query. Despite the presence of all these other 
indexes, Jim's suggested query persists in using idxNamePos, as one would 
expect it to be the optimal index for that query and for the original and 
re-ordered.

I'm wary of what the optimiser does. On those first two queries, I have seen 
that it switches from using idxNamePos to idxName after the addition of the 
Chunk field when only those two indexes were present. I should add that the 
SQLite manager through which I play with SQLite uses a 3.7.0 DLL.

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get the close bynumbers?

2010-08-20 Thread ve3meo
Without an index, EXPLAIN returns the following number of virtual machine 
instructions for each query:

41 - original query:
select * from A as A1, A as A2 where A1.name=A2.name and
abs(A1.position - A2.position) <= 10 and
A1.position != A2.position ;

39 - original query re-ordered:
select * from A as A1, A as A2 where A1.name=A2.name and
A1.position != A2.position and
abs(A1.position - A2.position) <= 10 ;

41 - Jim's query revamped to work:
select * from A as A1, A as A2 where A1.name=A2.name and
A1.position != A2.position and
A2.position between A1.position - 10 and A1.position + 10 ;

In all cases above, Table A2 is used with an automatic index.

create index idxNamePos ON A(name,position);

With this index present, the above queries all use it EXPLAIN reports this 
number of VM instructions, respectively:

32, 30, 36.

If the number of VM instructions is a good inverse indicator of efficiency, 
then the second query with the index appears to be the most efficient. I'm 
sure somebody will point out if it is not.

I didn't include Simon's suggestion as I did not know how to implement it.

Tom

"Jim Morris"  wrote in 
message news:4c6f0a83.4090...@bearriver.com...
>  If there is an index on (name, position) the a where like below might
> use it.
>
> A1.name=A2.name and A2.position between( A1.position - 10, A1.position + 
> 10 )
>
>
> On 8/20/2010 3:54 PM, Peng Yu wrote:
>> Hi,
>>
>> I have the following code to search for neighboring positions
>> (distance<=10). But it is slow for large data set. I'm wondering what
>> is the most efficient query for such a search. Note that I don't
>> create an index, as I'm not sure what index to create on table A.
>>
>> $ cat main.sql
>> #!/usr/bin/env bash
>>
>> rm -f main.db
>> sqlite3 main.db<>
>> create table A (name text, position integer);
>> insert into A values('a', 1);
>> insert into A values('a', 5);
>> insert into A values('a', 21);
>> insert into A values('b', 3);
>> insert into A values('b', 15);
>> insert into A values('b', 19);
>>
>> .mode column
>> .headers on
>> .echo on
>> select * from A as A1, A as A2 where A1.name=A2.name and
>> abs(A1.position - A2.position)<= 10 and A1.position != A2.position;
>>
>> EOF
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] PRAGMA database_list: insert into table?

2010-07-25 Thread ve3meo
"Roger Binns" <rog...@rogerbinns.com> wrote in 
message news:4c4b1ca7.1040...@rogerbinns.com...
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 07/24/2010 05:38 AM, ve3meo wrote:
>> I am a humble user of the command line implementation and of various 
>> SQLite
>> managers and not an application developer so I cannot do it in code.
>
> You are coding!  The command line tool is merely a thin wrapper around the
> library and the managers are a gui equivalent.
>
> I'd encourage you to try one of the scripting languages out there.  They
> provide an easy way of adding a little more control flow around the SQL,
> variables, better interoperability with files etc.  These languages are 
> easy
> to learn and use.
>
> My personal preference is Python, but you won't go far wrong with Lua, 
> Tcl,
> Perl, PHP, Basic etc.  See a long list of languages that have SQLite
> available at:
>
>  http://www.sqlite.org/cvstrac/wiki?p=SqliteWrappers

I have dabbled years ago with Tcl and even resurrected Quick Basic recently 
so I could do something - even probably a DOS batch file. Before heading 
there, I wanted to exhaust the possibility of doing it within a SQL query. 
That's convenient for me and other users to be able to do that and many 
other queries from within the same SQLite manager.

>
>>> Depending on what the information is you are obtaining, it may be 
>>> possible
>>> to get it via direct queries on sqlite_master.
>>
>> Thanks, I'll investigate further.
>
> How about telling us what you are trying to achieve (not how you are 
> trying
> to achieve it :-)

I use a genealogical software that employs SQLite as its database engine. 
That software is deficient in some respects and I have developed quite a few 
SQLite queries that I can run outside the application to examine and modify 
data in ways that the app cannot. One query aims to extract various 
properties of the genealogical database that can tell me its characteristics 
in more detail than the software does and may flag possible problems or 
areas needing attention. The query compares certain data against that of an 
attached reference database which may change with a new release of the 
software. What I want to add to the results of this query are the names of 
the test database file and the reference file. I was hoping to do that from 
within the same sql file.

Tom

>
> Roger



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] PRAGMA database_list: insert into table?

2010-07-24 Thread ve3meo
"Roger Binns" <rog...@rogerbinns.com> wrote in 
message news:4c4a5bd5.5010...@rogerbinns.com...
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 07/23/2010 07:47 PM, ve3meo wrote:
>> Is it possible to store the results of a PRAGMA statement, especially 
>> PRAGMA
>> database_list in a SQLite temporary table using only SQLite commands?
>
> No.  Is there any particular reason your code can't copy them internally?
> Remember that SQLite is a library - it lives inside your application - and
> is not some remote unchangeable component.

I am a humble user of the command line implementation and of various SQLite 
managers and not an application developer so I cannot do it in code.

> Depending on what the information is you are obtaining, it may be possible
> to get it via direct queries on sqlite_master.

Thanks, I'll investigate further.

>
> Roger
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] PRAGMA database_list: insert into table?

2010-07-23 Thread ve3meo
Is it possible to store the results of a PRAGMA statement, especially PRAGMA 
database_list in a SQLite temporary table using only SQLite commands? I have 
tried every combination I can think of without success.

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] wrong output for a select group by to a sub query with acase expression

2010-03-05 Thread ve3meo
I replicated your results. ORDERing the results seems to resolve the error 
but clearly the GROUP BY is otherwise getting confused by the CASE.

This works:

select categ, count(1) from (select *, (case when a=0 then 0 else
cast((a + 25) / 50 + 0.5 as int) end) as categ from test order by a) a group 
by categ;  -- right

or this:

select categ, count(1) from (select *, (case when a=0 then 0 else
cast((a + 25) / 50 + 0.5 as int) end) as categ from test order by categ) a 
group by categ;  -- right

and this, w/o either the CASE or the ORDER, except it does not handle the 
a=0 condition:

select cast((a + 25) / 50 + 0.5 as int) as categ, COUNT(1) from test group 
by categ; -- almost right

There seem to be a couple of unused elements in your query and it could be 
written:

select categ, count(1) from (select (case when a=0 then 0 else
cast((a + 25) / 50 + 0.5 as int) end) as categ from test order by a) group 
by categ;  -- right

Tom


"Pedro Pedruzzi"  
wrote in message news:4b901b1d.8090...@gmail.com...
> I'm using sqlite3-3.6.22, downloaded today from sqlite web site.
>
> Steps to reproduce:
>
> create table test(a real);
> .import bugdata test
>
> select categ, count(1) from (select *, (case when a=0 then 0 else
> cast((a + 25) / 50 + 0.5 as int) end) as categ from test) a group by 
> categ;
>
> 1|10
> 1|25
> 3|26
>
> The count is ok, but the categ is not supposed to duplicate.
>
> This next very similar query works ok (with the provided data in
> particular the correct results happens to be the same).
>
> select categ, count(1) from (select *, (case when 0=1 then 0 else
> cast((a + 25) / 50 + 0.5 as int) end) as categ from test) a group by 
> categ;
>
> 1|10
> 2|25
> 3|26
>
>
> Here is the bugdata:
>
> 120.0
> 35.5
> 95.0
> 41.0
> 51.5
> 64.5
> 140.0
> 64.5
> 108.5
> 138.5
> 138.5
> 94.5
> 130.5
> 119.5
> 148.5
> 75.5
> 94.0
> 144.0
> 78.5
> 86.0
> 112.0
> 132.0
> 51.0
> 42.0
> 44.0
> 48.0
> 12.0
> 35.5
> 35.5
> 75.5
> 77.5
> 130.5
> 103.0
> 110.5
> 53.5
> 86.5
> 122.0
> 146.0
> 129.0
> 91.5
> 141.0
> 76.5
> 66.5
> 35.5
> 126.0
> 90.0
> 96.0
> 134.0
> 63.0
> 106.5
> 77.5
> 35.5
> 64.0
> 121.0
> 119.5
> 126.0
> 58.0
> 123.0
> 133.0
> 77.0
> 56.0
>
> Regards,
> -- Pedro Pedruzzi
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Retrieving column names

2010-02-27 Thread ve3meo

"P Kishor"  wrote in 
message news:cdf6db501002270448s2d28ef4fm9a831fdfa7bf3...@mail.gmail.com...
> On Sat, Feb 27, 2010 at 6:43 AM, Peter Rodwell 
>  wrote:
>> I'm sure this question has been asked and answered a million times, but 
>> I've not been able
>> to find the answer. Googling for it has turned up lots of answers, none 
>> of which seem to
>> work with SQLite:
>
>
> http://www.google.com/search?q=sqlite+column+names
>
> The top four answers returned all explain how to do the above correctly.
>
> Use PRAGMA, or SELECT from sqlite_master
>

Is there a way to use SELECT against the PRAGMA result-set?

This returns an error:
SELECT * FROM (PRAGMA table_info(tablename)); 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] unexpected behavior with CROSS and LEFT JOIN together

2010-02-24 Thread ve3meo
I posted the following and it didn't appear - I probably hit Reply to Sender 
instead of Reply Group:

> Does INDEX sqlite_autoindex_currency_1 contain only pointers to the
> identical currencies?
>
> Tom

This little change also works:

ON +c1.cur=c2.cur;
or
ON c1.cur=+c2.cur;

Along with LIKE, that seems to hide the condition from being subject to the
query optimiser and its use of the autoindex with just the cur field in it.
That ON phrase relates to the CROSS JOIN so that effectively you have this:

SELECT c1.cur cur1, c2.cur cur2
FROM currency c1
CROSS JOIN currency c2
ON c1.cur = c2.cur;

 results:
cur1 cur2
EUR EUR
GBP GBP
USD USD

Tom

Reading Richard's post below, I'm confused. Is the bug that:
a) that ON condition should not be picked up by the query optimiser as 
relating to the CROSS JOIN but to the LEFT JOIN immediately ahead of it?
b) should the query optimiser use the autoindex anyway? (because of the 
CROSS JOIN between c1 and c2, regardless of condition)


"D. Richard Hipp" <d...@hwaci.com> wrote in 
message news:d097a35f-b1ca-4131-b11d-03c11afb1...@hwaci.com...
>
> On Feb 24, 2010, at 5:20 PM, Pavel Ivanov wrote:
>
>>> Does INDEX sqlite_autoindex_currency_1 contain only pointers to the
>>> identical currencies?
>>
>> First of all AFAIK 'PRIMARY KEY' implies uniqueness of the column. And
>> second: I bet if you execute 'PRAGMA case_sensitive_like = true' then
>> plans will be the same.
>> See http://www.sqlite.org/optoverview.html#like_opt for details.
>>
>
> Guys:  Thanks for all the discussion.  But I know what is causing the
> problem.  I would have already posted the ticket, but I'm having a
> little problem with Fossil right this minute and I need to debug that
> first...
>
> The problem is that the query optimizer is using the c1.cur=c2.cur
> term together with indices to limit the search to only those rows that
> satisfy the condition.  That's the right thing to do in most cases,
> but not when the expression is on a LEFT JOIN but refers only to
> tables to the right of the LEFT JOIN.
>
> This problem has existed in SQLite forever and has never been seen
> before.  So it is obscure.  The simple fix is to put the ON clause on
> the CROSS JOIN where it belongs.  Yes, SQLite should still do the
> right thing even if the ON is in the wrong place, and I'll fix that
> directly.  Let me get Fossil running again first, though, please.
>
> Thanks for reporting the problem and for the analysis.
>
>>
>> Pavel
>>
>> On Wed, Feb 24, 2010 at 5:15 PM, ve3meo 
>> <holden_fam...@sympatico.ca>
>> wrote:
>>>
>>> "ve3meo" <holden_fam...@sympatico.ca> wrote 
>>> in
>>> message news:hm47t5$5l...@dough.gmane.org...
>>>>
>>>> "Igor Tandetnik" <itandet...@mvps.org> wrote in
>>>> message news:hm45gu$s5...@dough.gmane.org...
>>>>> Mark Brand <mabr...@mabrand.nl> wrote:
>>>>>> --Gives unexpected results
>>>>>> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate, x.rate) rate
>>>>>> FROM currency c1
>>>>>> CROSS JOIN currency c2
>>>>>> LEFT JOIN exchange x
>>>>>>ON x.cur1=c1.cur
>>>>>>AND x.cur2=c2.cur
>>>>>> LEFT JOIN (SELECT 1 rate) self
>>>>>>ON c1.cur=c2.cur;
>>>>>>
>>>>>> /* results
>>>>>>
>>>>>> EUR|EUR|1
>>>>>> GBP|GBP|1
>>>>>> USD|USD|1
>>>>>>
>>>>>> */
>>>>>
>>>>> Yes, looks like a bug to me. I see no reason why it shouldn't
>>>>> work. LEFT
>>>>> JOIN should never produce fewer rows than what left-hand-side table
>>>>> contains.
>>>>>
>>>>> As a workaround, try this instead:
>>>>>
>>>>> SELECT c1.cur cur1, c2.cur cur2,
>>>>>(case when c1.cur=c2.cur then 1 else x.rate) rate
>>>>> FROM currency c1
>>>>> CROSS JOIN currency c2
>>>>> LEFT JOIN exchange x
>>>>>ON x.cur1=c1.cur
>>>>>AND x.cur2=c2.cur;
>>>>>
>>>>> Igor Tandetnik
>>>>
>>>> One little change makes it work:
>>>>
>>>> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate,x.rate) rate
>>>> FROM currency c1
>>>> CROSS JOIN currency c2
>>>> LEFT JOIN exchange x
>>>>ON x.cur1=c1.cur
>>>>AND x.cur2=c2.cur
>>>> LEFT JOIN (SELECT 1 ra

Re: [sqlite] unexpected behavior with CROSS and LEFT JOIN together

2010-02-24 Thread ve3meo

"ve3meo" <holden_fam...@sympatico.ca> wrote in 
message news:hm47t5$5l...@dough.gmane.org...
>
> "Igor Tandetnik" <itandet...@mvps.org> wrote in 
> message news:hm45gu$s5...@dough.gmane.org...
>> Mark Brand <mabr...@mabrand.nl> wrote:
>>> --Gives unexpected results
>>> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate, x.rate) rate
>>> FROM currency c1
>>> CROSS JOIN currency c2
>>> LEFT JOIN exchange x
>>>ON x.cur1=c1.cur
>>>AND x.cur2=c2.cur
>>> LEFT JOIN (SELECT 1 rate) self
>>>ON c1.cur=c2.cur;
>>>
>>> /* results
>>>
>>> EUR|EUR|1
>>> GBP|GBP|1
>>> USD|USD|1
>>>
>>> */
>>
>> Yes, looks like a bug to me. I see no reason why it shouldn't work. LEFT 
>> JOIN should never produce fewer rows than what left-hand-side table 
>> contains.
>>
>> As a workaround, try this instead:
>>
>> SELECT c1.cur cur1, c2.cur cur2,
>>(case when c1.cur=c2.cur then 1 else x.rate) rate
>> FROM currency c1
>> CROSS JOIN currency c2
>> LEFT JOIN exchange x
>>ON x.cur1=c1.cur
>>AND x.cur2=c2.cur;
>>
>> Igor Tandetnik
>
> One little change makes it work:
>
> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate,x.rate) rate
> FROM currency c1
> CROSS JOIN currency c2
> LEFT JOIN exchange x
>ON x.cur1=c1.cur
>AND x.cur2=c2.cur
> LEFT JOIN (SELECT 1 rate) self
>ON c1.cur LIKE c2.cur;
>
> Don't ask me why '=' and 'LIKE' should behave differently in this example.
>
> Tom

They do affect the EXPLAIN QUERY PLAN results:

"="
order from detail
0 0 TABLE currency AS c1
1 1 TABLE currency AS c2 WITH INDEX sqlite_autoindex_currency_1
2 2 TABLE exchange AS x WITH INDEX sqlite_autoindex_exchange_1
3 3 TABLE  AS self

"LIKE"
order from detail
0 0 TABLE currency AS c1
1 1 TABLE currency AS c2
2 2 TABLE exchange AS x WITH INDEX sqlite_autoindex_exchange_1
3 3 TABLE  AS self

Does INDEX sqlite_autoindex_currency_1 contain only pointers to the 
identical currencies?

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] unexpected behavior with CROSS and LEFT JOIN together

2010-02-24 Thread ve3meo

"Igor Tandetnik"  wrote in message 
news:hm45gu$s5...@dough.gmane.org...
> Mark Brand  wrote:
>> --Gives unexpected results
>> SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate, x.rate) rate
>> FROM currency c1
>> CROSS JOIN currency c2
>> LEFT JOIN exchange x
>>ON x.cur1=c1.cur
>>AND x.cur2=c2.cur
>> LEFT JOIN (SELECT 1 rate) self
>>ON c1.cur=c2.cur;
>>
>> /* results
>>
>> EUR|EUR|1
>> GBP|GBP|1
>> USD|USD|1
>>
>> */
>
> Yes, looks like a bug to me. I see no reason why it shouldn't work. LEFT 
> JOIN should never produce fewer rows than what left-hand-side table 
> contains.
>
> As a workaround, try this instead:
>
> SELECT c1.cur cur1, c2.cur cur2,
>(case when c1.cur=c2.cur then 1 else x.rate) rate
> FROM currency c1
> CROSS JOIN currency c2
> LEFT JOIN exchange x
>ON x.cur1=c1.cur
>AND x.cur2=c2.cur;
>
> Igor Tandetnik

One little change makes it work:

SELECT c1.cur cur1, c2.cur cur2, COALESCE(self.rate,x.rate) rate
FROM currency c1
CROSS JOIN currency c2
LEFT JOIN exchange x
ON x.cur1=c1.cur
AND x.cur2=c2.cur
LEFT JOIN (SELECT 1 rate) self
ON c1.cur LIKE c2.cur;

Don't ask me why '=' and 'LIKE' should behave differently in this example.

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Restricting fast no-result query yields slowno-resultquery

2010-02-05 Thread ve3meo
Thank you, thank you, Igor! This relates to the earlier thread I had started 
titled "Speed regression after 3.6.17" where I was wrestling with a very 
slow query because the query optimiser was selecting an inappropriate index, 
one that was very chunky because the field was logical (0,1). This led to 
officially inappropriate use of the INDEXED BY clause to override the query 
optimiser or the creation of a temporary table from a query without that 
field as a constraint so it that would run quickly and the temp table could 
be queried quickly with the otherwise offending complaint. Your idea of 
masking the field from the query optimiser with a simple operator works 
magic. On one test of a query that links the table to itself took 1130 
seconds using the bare field as the constraint, 0.291 seconds with the + 
operator incorporating the field into an expression, an improvement by a 
factor of 3883!! EXPLAIN QUERY PLAN clearly showed the switch from the 
inappropriate index to the right one.

Great tip!

Tom

"Igor Tandetnik"  wrote in message 
news:hkh4a0$mk...@ger.gmane.org...
> Kelly Jones wrote:
>> I have a query that runs very quickly and returns no results:
>>
>> SELECT * FROM filebackup WHERE sha1='x';
>>
>> However, the more restrictive query below runs very slowly, although
>> it obviously can't have any results either:
>>
>> SELECT * FROM filebackup WHERE sha1='x' AND refid=0;
>>
>> I have indexes on both sha1 (string) and refid (int). Only issue I
>> see: this is a large table and refid=0 for 90%+ of the rows.
>
> For whatever reason, SQLite decides to use an index on refid for this 
> query. You can suppress it like this:
>
> SELECT * FROM filebackup WHERE sha1='x' AND +refid=0;
>
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] update on Ticket 3437

2010-01-24 Thread ve3meo
You're right! I arrived at a failure of a different kind because I did not 
drop the Races table as per his script between adding the first record with 
ID='20' and the second with ID=20. If Races table contains two records, the 
first with ID='20' and the second with ID=20, his query fails, even though 
the second record satisfies the condition. Adding DISTINCT somehow resolves 
this failure; a JOIN ON RaceID=ID assuredly does. If the order is reversed, 
then his query works. So the WHERE clause looks at the first result from the 
SELECT; DISTINCT probably re-ordered the results with the Integer value 
first, the Text value second. I got caught up in this behaviour without 
noting that his query would have worked had I not skipped dropping the 
table. (I'm a novice)

Tom

"Igor Tandetnik"  wrote in message 
news:hjhqq1$ut...@ger.gmane.org...
> Tom Holden wrote:
>> Nick, I think your query fails because there is a potential for multiple
>> values on the right-hand side of WHERE RaceID=.
>
> No, it fails because 20 != '20'



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQL Crash with sqlite 3.6.22 commandline

2010-01-22 Thread ve3meo
That's great! I'm an end-user and cannot compile it to check myself. Thanks 
to shane, drh, Hub Dog for the rapid fix. I will have to dig around to 
figure out how/when that shows up in a release.

That leaves the question about the change of syntax between 3.5.4 and 
3.6.17. In the earlier it was sufficient to override the unavailable 
collation with a COLLATE clause at the first instance of the field if it is 
to be later compared in a WHERE clause, e.g.:

SELECT  Name COLLATE NOCASE
FROM AddressTable
WHERE  Name LIKE '%_';

By release 3.6.17, it was necessary to put the COLLATE clause where the 
comparison is made, e.g.:

SELECT  Name
FROM AddressTable
WHERE  Name COLLATE NOCASE LIKE '%_';

3.5.4 was happy either way.

However, if this latter SELECT is a virtual table the subject of another 
SELECT stmt, then both 3.5.4 and 3.6.17+ throw up an error: "No such 
collation sequence: x" whereas 3.5.4 is happy if the first statement is 
the virtual table.

For both 3.5.4 and 3.6.17 to be happy, COLLATE NOCASE must be used twice:

SELECT * FROM
 (SELECT  Name COLLATE NOCASE
  FROM AddressTable
  WHERE  Name COLLATE NOCASE LIKE '%_');

OR the collated field must be renamed in the form of the first statement:

SELECT * FROM
 (SELECT  Name COLLATE NOCASE AS Nuts
  FROM AddressTable
  WHERE  Nuts LIKE '%_');

The ORDER BY clause operated differently in 3.5.4, requiring the COLLATE 
clause within, despite its earlier use in the SELECT expr. Leading to

SELECT  Name COLLATE NOCASE
FROM AddressTable
WHERE  Name LIKE '%_'
ORDER BY Name COLLATE NOCASE;
as sufficient for 3.5.4

SELECT  Name
FROM AddressTable
WHERE  Name COLLATE NOCASE LIKE '%_'
ORDER BY Name COLLATE NOCASE;
as necessary for 3.6.17+ and acceptable to 3.5.4

and these, as necessary for both:

SELECT * FROM
 (SELECT  Name COLLATE NOCASE
  FROM AddressTable
  WHERE  Name COLLATE NOCASE LIKE '%_'
  ORDER BY Name COLLATE NOCASE);

-- OR --

SELECT * FROM
 (SELECT   Name COLLATE NOCASE AS Nuts
   FROM  AddressTable
   WHERE   Nuts LIKE '%_'
   ORDER BY Nuts);

So it appears that the later versions are more rigorous than 3.5.4, having a 
consistent requirement for the collation override in any clause where 
comparison is made on a field with a missing, application dependent 
collation sequence, rather than an inferred override from a previous 
declaration.

The trap then is that 'lazy' overrides allowed in 3.5.4 will trigger a "No 
such collation sequence" in later versions. I think I prefer the 'lazy' 
override for both WHERE and ORDER BY but perhaps there is good reason to 
require more explicit COLLATION declarations.

Tom

"Hub Dog"  wrote in message 
news:4de0b9581001220719u72a3857eq88b674a14d265...@mail.gmail.com...
>I can confirm that the crash problem has been fixed by
> http://www.sqlite.org/src/info/1258875e07 checked-in .
> Now executing following sql will report no such collation sequence: 
> RMNOCASE
> instead of crash.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQL Crash with sqlite 3.6.22 commandline

2010-01-22 Thread ve3meo
I just discovered that attachments can be sent through this newsgroup so I 
have attached a small database with which you should be able to reproduce 
the problem. The one table in it has a field collated RMNOCASE. The 
following query produces these results in three different versions of 
sqlite:

3.5.4 works perfectly
3.6.17 gracefully reports an error - missing RMNOCASE collation
3.6.21 crash

SELECT
  Name COLLATE NOCASE
FROM
  AddressTable
WHERE
  Name LIKE '%_';

3.5.4 carried the COLLATE NOCASE override at the beginning of the SELECT 
through to the comparison in the WHEN. The later ones do not, and they 
'progress' from reporting an error to a crash.

This regression renders queries, on databases having collations unavailable 
to the sqlite in use, that were developed on older versions of sqlite 
problematic when run from newer versions.

Regards,
Tom

"D. Richard Hipp"  wrote in 
message news:41371dfd-279f-429d-9186-476efb63e...@hwaci.com...
>I am unable to reproduce this problem.  Using the script below, with
> RMNOCASE changed to just NOCASE, everything works fine on the SQLite
> command-line shell on the website on Linux.  I also tried various
> other versions of SQLite with the same result.
>
>
> On Jan 21, 2010, at 8:00 AM, Hub Dog wrote:
>
>> I hava a table. The table schema is
>>
>> CREATE TABLE AddressTable
>> (
>>  AddressID INTEGER PRIMARY KEY ,
>>  AddressType INTEGER ,
>>  Name TEXT COLLATE RMNOCASE ,
>>  Street1 TEXT ,
>>  Street2 TEXT ,
>>  City TEXT ,
>>  State TEXT ,
>>  Zip TEXT ,
>>  Country TEXT ,
>>  Phone1 TEXT ,
>>  Phone2 TEXT ,
>>  Fax TEXT ,
>>  Email TEXT ,
>>  URL TEXT ,
>>  Latitude INTEGER ,
>>  Longitude INTEGER ,
>>  Note BLOB
>> ) ;.
>>
>> if I execute following sql to query data , the sqlite 3.6.22 command
>> line
>> downloaded from www.sqlite.org will crash.
>>
>> SELECT
>>  Adr.Name COLLATE NOCASE AS AddressName
>> FROM
>>  AddressTable AS Adr
>> WHERE
>>  Adr.Name LIKE '%_'.
>>
>> if I change the Adr.Name to AddressName  , the sql execute result is
>> ok.
>>
>> SELECT
>>  Adr.Name COLLATE NOCASE AS AddressName
>> FROM
>>  AddressTable AS Adr
>> WHERE
>>  AddressName LIKE '%_' ;
>>
>> it seems the crash was related with the collate RMNOCASE of
>> AddressTable
>> table's field Name.
>> in default sqlite command line, there is no rmnocase collation. so I
>> mapped
>> it to the default  nocase collation.
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> D. Richard Hipp
> d...@hwaci.com
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 


begin 666 Address.db3

Re: [sqlite] Sqlite3 Optimization Question

2010-01-22 Thread ve3meo
Uhh, sounds like a search function of a text editor or a simple routine you 
could write (or find) in your development language. I can't see the magic of 
using sqlite.

Tom

"Michael Thomason"  wrote 
in message 
news:3cbb39411001220619j1c6cc8f5x46e0aeb4d0b91...@mail.gmail.com...
Thank you.  I like that answer and I'll give it a try.

The application is a word game, and I have a dictionary of words.  The
user enters a word, and the application checks against that word to
see if it exist.  That's the only query ever used.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQL Crash with sqlite 3.6.22 commandline

2010-01-21 Thread ve3meo
I originally experienced the problem using a couple of Windows sqlite 
managers. It seems that ones using the latest few versions of sqlite have 
the problem, variously reported as:
"Access violation at address x in module . Read of address 
0005", where x is dependent on the application. Of the three that I 
tested that had this crash, two reported it thusly, one threw up an 
unhandled Win32 exception and croaked - they are all, I believe using 3.6.21 
or 22.

Running the sqlite.exe command line version 3.6.22 throws up the unhandled 
Win32 exception.
Running the 3.6.17 release gracefully reports "SQL error near line 1: no 
such collation sequence: RMNOCASE".

The errors and crashes are from the following query:
SELECT
  Name COLLATE NOCASE
FROM
  AddressTable
WHERE
  Name LIKE '%_';

This revised query works on all versions listed above:
SELECT
  Name COLLATE NOCASE AS NewName
FROM
  AddressTable
WHERE
  NewName LIKE '%_' ;

as does:
SELECT
  Name
FROM
  AddressTable
WHERE
  Name COLLATE NOCASE LIKE '%_';

I can forward you a sample database but I do not have access to the RMNOCASE 
collation.

Tom

"D. Richard Hipp"  wrote in 
message news:41371dfd-279f-429d-9186-476efb63e...@hwaci.com...
>I am unable to reproduce this problem.  Using the script below, with
> RMNOCASE changed to just NOCASE, everything works fine on the SQLite
> command-line shell on the website on Linux.  I also tried various
> other versions of SQLite with the same result.
>
>
> On Jan 21, 2010, at 8:00 AM, Hub Dog wrote:
>
>> I hava a table. The table schema is
>>
>> CREATE TABLE AddressTable
>> (
>>  AddressID INTEGER PRIMARY KEY ,
>>  AddressType INTEGER ,
>>  Name TEXT COLLATE RMNOCASE ,
>>  Street1 TEXT ,
>>  Street2 TEXT ,
>>  City TEXT ,
>>  State TEXT ,
>>  Zip TEXT ,
>>  Country TEXT ,
>>  Phone1 TEXT ,
>>  Phone2 TEXT ,
>>  Fax TEXT ,
>>  Email TEXT ,
>>  URL TEXT ,
>>  Latitude INTEGER ,
>>  Longitude INTEGER ,
>>  Note BLOB
>> ) ;.
>>
>> if I execute following sql to query data , the sqlite 3.6.22 command
>> line
>> downloaded from www.sqlite.org will crash.
>>
>> SELECT
>>  Adr.Name COLLATE NOCASE AS AddressName
>> FROM
>>  AddressTable AS Adr
>> WHERE
>>  Adr.Name LIKE '%_'.
>>
>> if I change the Adr.Name to AddressName  , the sql execute result is
>> ok.
>>
>> SELECT
>>  Adr.Name COLLATE NOCASE AS AddressName
>> FROM
>>  AddressTable AS Adr
>> WHERE
>>  AddressName LIKE '%_' ;
>>
>> it seems the crash was related with the collate RMNOCASE of
>> AddressTable
>> table's field Name.
>> in default sqlite command line, there is no rmnocase collation. so I
>> mapped
>> it to the default  nocase collation.
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> D. Richard Hipp
> d...@hwaci.com
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed regression after 3.6.17

2010-01-13 Thread ve3meo

"Max Vlasov"  wrote in 
message news:7cb963ca1001130315o69235717n92393be027eef...@mail.gmail.com...
> >
>> After much playing about, I have determined that it is necessary to 
>> violate
>> the guidelines on not using INDEXED BY and NOT INDEXED in an attempt to
>> optimise performance.
>
>
> Although you're very determined about your conclusions, I saw a
> misunderstanding about INDEXED BY in your statements.
> The docs say "The INDEXED BY clause is *not* intended for use in tuning 
> the
> preformance of a query". From my point of you one should think about these
> extensions as sqlite's sql equivalent of assertions from high-level
> languages.

Max, thanks for your follow up. I think we understand the guideline the same 
way. It says that one should not use INDEXED BY for tuning a query. I had to 
use INDEXED BY and NOT INDEXED to tune my query because the query optimiser 
mal-tuned it by ~70:1. After 3.6.17, the query optimiser tuned even worse - 
to ~1000:1. Operating on ANALYZE stats took what was the fastest result with 
a couple of INDEXED BY clauses down to ~70:1. The only way I could get the 
fastest result across different sqlite versions and with/without ANALYZE 
stats was to do the very opposite of what the docs say - i.e., use INDEXED 
clauses liberally to restrict what the query optimiser can do to screw up my 
query.

I don't understand your last sentence as I am not a programmer and have no 
experience of working with sql assertions.

> For example, a quick test.
> I have a table Contacts with an index ids_Contacts using field CenterId.
> If I try to execute
>
> SELECT * FROM Contacts INDEXED BY idx_Contacts
>
> sqlite says: "cannot use index: idx_Contacts".
>
> Only when a change made adding explicit ORDER BY clause applied
>
> SELECT * FROM Contacts INDEXED BY idx_Contacts ORDER BY CenterId
>
> ... the query executes successfully.

That's odd. Using a SQLite manager with sqlite 3.6.21, both lines give the 
same error on one of my tables with 5 indexes.
SELECT Surname FROM NameTable INDEXED BY idxSurname
SELECT Surname FROM NameTable INDEXED BY idxSurname ORDER BY Surname COLLATE 
NOCASE

And EXPLAIN QUERY PLAN indicates that the query optimiser used no index in 
both cases when the INDEXED BY clause was deleted.

> If I remove now INDEXED BY from the latter statement nothing will change 
> in
> how the query is executed, _only_ if someone excplicitely deletes the 
> index
> from the database.

Assuming that this was indeed a multi-table query that worked, the query can 
only execute identically with/without the INDEXED clauses provided the query 
optimiser consistently used the same indexes. And, yes, it would throw up an 
error if the INDEXED BY index had been dropped.

> So any of your statements that argues that adding or removing INDEXED BY
> affects the way your queries is executed is have to be double checked.

Isn't that what I demonstrated with the comparative results of EXPLAIN QUERY 
PLAN? The optimiser chose inappropriate indexes when allowed to do so. Your 
statement assumes that the query optimiser chose the same indexes as the 
ones I chose in iteratively tuning for fastest speed.

> If you're still sure that you found a major flaw in the most deployed sql
> database in the world, please narrow your tests and conclusions to 
> something
> easier reproducible. Everyone understands that you're solving one 
> particular
> problem related to you, but if you find time to make things more simple, 
> it
> will be to everyone's benefit.

Max, I don't know that I have found a major flaw in the database engine or 
that I have thrown up a very exceptional database structure that its 
optimiser can't properly deal with. I most certainly had to do the opposite 
of what the online docs said about the use of INDEXED clauses. My tests are 
easily reproducible if you care to let me send you the database and queries. 
The SQLite manager developer who led me to investigate the change in 
performance around 3.6.18 of my early query that lacked INDEXED clauses has 
also offered the database in the thread he started titled "Performance 
regression".

I don't know how I can make things more simple - I'm not a programmer and my 
interest in sqlite is pretty narrow. I am hoping that bringing my problem 
and observations to this forum will elicit the necessary interest from the 
sqlite developer community as to why it was necessary for me to violate 
their very clear instruction that one should not use INDEXED clauses to tune 
a query. That instruction can only be correct if, and only if, the query 
optimiser can be relied on to tune the query optimally. In this case, it 
clearly does not.

Thanks to the feedback so far, I have learned how to use EXPLAIN and that I 
was mistaken to initially report this as a speed regression. Rather it turns 
out to be that query optimisation has changed after 3.6.17, that there is at 
least one database structure for which the query 

Re: [sqlite] Speed regression after 3.6.17

2010-01-12 Thread ve3meo

"ve3meo" <holden_fam...@sympatico.ca> wrote in 
message news:hiivpn$7f...@ger.gmane.org...
>
> "Simon Slavin" <slav...@bigfraud.org> wrote in 
> message 
> news:ad11c649-d23a-46ff-8545-85ffb5219...@bigfraud.org...
>>
>> The database structure has not changed.  It is optimisation of a 
>> particular kind of query which seems to be working differently -- and not 
>> as well.  Unfortunately, the example supplied is extremely complicated 
>> and it's not easy to tell which aspect of the query is at fault.
>
> Sorry. What I meant: is it possible that it is just the particular 
> structure of my database that is susceptible to this performance 
> regression?
>
> I agree that the optimisation strategy has changed for the better between 
> 3.6.17 and 3.6.20 for the unANALYZED unINDEXED query.
>
> I postulate that the optimisation strategy is defective for the ANALYZED, 
> INDEXED query for both 3.6.17 and 3.6.20 while 3.5.4 throws up a false 
> syntax error.
> [snip]
> Thanks for any follow-up. Is this the right way to flag a problem for the 
> SQlite developers?
>
After much playing about, I have determined that it is necessary to violate 
the guidelines on not using INDEXED BY and NOT INDEXED in an attempt to 
optimise performance. This would seem to be the case in a database where 
there are tables being queried that have one or more CREATEd INDEXes that 
could lead the SQLite query optimiser to choose an inappropriate index. The 
optimiser could thus direct the query to be processed extremely 
sub-optimally, as has been demonstrated by my database and queries where a 
1000:1 ratio in execution time has been experienced. The longest times were 
for those queries in which the optimiser had total freedom to pick the order 
that tables were processed and which indexes were used. The shortest were 
those in which:

a) at least some explicit INDEXED BY clauses were added and the optimiser 
lucked out on the rest BEFORE ANALYZE was run, or
b) AFTER ANALYZE was run, NOT INDEXED clauses were added to prevent the 
optimiser from using anything other than the PRIMARY KEY, and more INDEXED 
BY clauses were added to override the optimiser's now inappropriate 
selections.

Post 3.6.17, it is clear that the query optimiser degraded performance 
extremely sub-optimally for this database in its ante-ANALYZED state. For 
the ANALYZED state, the query optimiser degraded the performance of queries 
with partial INDEX control for 3.6.17 and post 3.6.17. That says to me, "THE 
QUERY OPTIMISER IS NOT RELIABLE".

3.5.4 returned errors near "INDEXED" because INDEXED BY and NOT INDEXED 
extensions were only added at 3.6.3 in late 2008.

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed regression after 3.6.17

2010-01-12 Thread ve3meo

"Simon Slavin"  wrote in 
message news:ad11c649-d23a-46ff-8545-85ffb5219...@bigfraud.org...
>
> The database structure has not changed.  It is optimisation of a 
> particular kind of query which seems to be working differently -- and not 
> as well.  Unfortunately, the example supplied is extremely complicated and 
> it's not easy to tell which aspect of the query is at fault.

Sorry. What I meant: is it possible that it is just the particular structure 
of my database that is susceptible to this performance regression?

I agree that the optimisation strategy has changed for the better between 
3.6.17 and 3.6.20 for the unANALYZED unINDEXED query.

I postulate that the optimisation strategy is defective for the ANALYZED, 
INDEXED query for both 3.6.17 and 3.6.20 while 3.5.4 throws up a false 
syntax error.

The example I've given is a UNION of five SELECTs but it is the first two 
that have been the most problematic. As they are virtually identical, let me 
repeat the EXPLAIN QUERY PLAN results, first for the always slow, not 
deliberately indexed  query and then for the once fast, explicitly indexed 
query. All the index files are predefined by the genealogy database 
application software.

The full query returns ~52,000 rows in the approx. time shown for each 
SQLite release and condition. Some variation is to be expected from varying 
concurrent processes but, generally, sqlite consumes ~98% of CPU, most of 
the time without disk thrashing.

--
THE UNINDEXED QUERY

SELECT FactTypeTable.Name COLLATE NOCASE AS Fact, 'Principal' AS 'Role 
Type', NameTable1.OwnerID AS RIN, NameTable1.Surname COLLATE NOCASE AS 
Surname, NameTable1.Suffix COLLATE NOCASE AS Suffix, NameTable1.Prefix 
COLLATE NOCASE AS Prefix, NameTable1.Given COLLATE NOCASE AS 'Given Name', 
NameTable2.OwnerID AS 'Sharer RIN', NameTable2.Surname COLLATE NOCASE AS 
'Sharer Surname', NameTable2.Suffix COLLATE NOCASE AS 'Sharer Suffix', 
NameTable2.Prefix COLLATE NOCASE AS 'Sharer Prefix', NameTable2.Given 
COLLATE NOCASE AS 'Sharer Given Name', COUNT(1) AS Count
FROM EventTable
INNER JOIN FactTypeTable ON EventTable.EventType = FactTypeTable.FactTypeID
INNER JOIN FamilyTable ON EventTable.OwnerID = FamilyTable.FamilyID
INNER JOIN NameTable AS NameTable1 ON FamilyTable.FatherID = 
NameTable1.OwnerID
INNER JOIN NameTable AS NameTable2 ON FamilyTable.MotherID = 
NameTable2.OwnerID
WHERE EventTable.OwnerType = 1 AND NameTable1.IsPrimary = 1 AND 
NameTable2.IsPrimary = 1
GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
.
.
ORDER BY RIN

RESULTS BEFORE ANALYZE


SQLite 3.5.4: ~250s
EXPLAIN QUERY PLAN results BEFORE ANALYZE
"order", "from", "detail"
"0", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary"
"1", "2", "TABLE FamilyTable WITH INDEX idxFamilyFatherID"
"2", "0", "TABLE EventTable WITH INDEX idxOwnerDate"
"3", "1", "TABLE FactTypeTable USING PRIMARY KEY"
"4", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary"

SQLite 3.6.17: ~250s cf 190s after ANALYZE
EXPLAIN QUERY PLAN results BEFORE ANALYZE
order,from,detail
0,3,TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary,
1,2,TABLE FamilyTable WITH INDEX idxFamilyFatherID,
2,0,TABLE EventTable WITH INDEX idxOwnerDate,
3,1,TABLE FactTypeTable USING PRIMARY KEY,
4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary,

SQLite 3.6.20: ~2500s cf ~165s after ANALYZE
EXPLAIN QUERY PLAN results BEFORE ANALYZE
"order","from","detail"
"0","3","TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary"
"1","4","TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary"
"2","2","TABLE FamilyTable WITH INDEX idxFamilyMotherID"
"3","0","TABLE EventTable WITH INDEX idxOwnerDate"
"4","1","TABLE FactTypeTable USING PRIMARY KEY"

RESULTS AFTER ANALYZE

SQLite 3.5.4:  ~198s cf ~250s before ANALYZE
EXPLAIN QUERY PLAN results AFTER ANALYZE
"order", "from", "detail"
"0", "1", "TABLE FactTypeTable"
"1", "2", "TABLE FamilyTable"
"2", "0", "TABLE EventTable WITH INDEX idxOwnerEvent"
"3", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID"
"4", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID"

SQLite 3.6.17: ~190s cf ~250s before ANALYZE
EXPLAIN QUERY PLAN results AFTER ANALYZE
order,from,detail
0,1,TABLE FactTypeTable,
1,2,TABLE FamilyTable,
2,0,TABLE EventTable WITH INDEX idxOwnerEvent,
3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID,
4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID,

SQLite 3.6.20: ~165s cf ~2500s before ANALYZE
EXPLAIN QUERY PLAN results AFTER ANALYZE
"order","from","detail"
"0","1","TABLE FactTypeTable"
"1","2","TABLE FamilyTable"
"2","0","TABLE EventTable WITH INDEX idxOwnerEvent"
"3","3","TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID"
"4","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID"

-
THE INDEXED QUERY

SELECT FactTypeTable.Name COLLATE NOCASE AS Fact, 'Principal' AS 'Role 
Type', NameTable1.OwnerID AS RIN, 

Re: [sqlite] Speed regression after 3.6.17

2010-01-12 Thread ve3meo

"Max Vlasov"  wrote in 
message news:7cb963ca1001120149u550da7fr56cfc0ed261a9...@mail.gmail.com...
> ... Are you sure you have
> identical tests for your comparison? I mean these are the same data sets
> with the same scheme?

Absolutely, errr, to the best of my knowledge. Same database, same query, 
but different SQLite managers having different versions of SQLite. But the 
fact that the SQLite manager is different is immaterial - the developer of 
one of them was the one to identify that it is the SQLite version that is 
the controlling variable: 3.6.17 and somewhat earlier - better; 3.6.18 and 
later - worse.

The volatility of SQLite performance over these version changes and with the 
use of ANALYZE is very disquieting for me. The database structure is that of 
a popular genealogy software. Having run ANALYZE, now not only has my 
fastest query been degraded by 70:1, so has the speed of the 'owner' 
software in generating its reports.

Attention needs to be paid to what changed between 3.6.17 and 3.6.18 that 
has adversely affected performance and why is it that ANALYZE is 
deleterious. Is it just this database structure that has been affected?

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed regression after 3.6.17

2010-01-11 Thread ve3meo

"D. Richard Hipp" <d...@hwaci.com> wrote in 
message news:4c2248d2-898d-41ec-81fc-bafdb726c...@hwaci.com...
>
> On Jan 11, 2010, at 9:28 PM, ve3meo wrote:
>> Oddly enough, by revising the query to explicitly
>> use an index, the two later releases are much faster...
>
> Did you run ANALYZE before you tried using explicit indices?
>
No. You have introduced me to it. So I tried using a SQLite manager with the 
3.6.17 release. Shaved 25% off the unindexed (SQLite optimised only) query 
time to ~190s. Multiplied the previously fast explicitly indexed query from 
3s to 210s. I was happy with the 3s.

Here are the EXPLAIN QUERY PLAN results before and after ANALYZE:

Before:
order,from,detail
0,0,TABLE EventTable,
1,1,TABLE FactTypeTable USING PRIMARY KEY,
2,2,TABLE FamilyTable USING PRIMARY KEY,
3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID,
4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID,
0,0,TABLE EventTable,
1,1,TABLE FactTypeTable USING PRIMARY KEY,
2,2,TABLE FamilyTable USING PRIMARY KEY,
3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID,
4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID,
0,2,TABLE NameTable WITH INDEX idxNamePrimary,
1,0,TABLE EventTable WITH INDEX idxOwnerDate,
2,1,TABLE FactTypeTable USING PRIMARY KEY,
0,0,TABLE NameTable WITH INDEX idxNamePrimary,
0,0,TABLE WitnessTable,
1,1,TABLE roletable USING PRIMARY KEY,
2,2,TABLE EventTable USING PRIMARY KEY,
3,5,TABLE FactTypeTable USING PRIMARY KEY,
4,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID,
5,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID,

After:
order,from,detail
0,1,TABLE FactTypeTable,
1,2,TABLE FamilyTable,
2,0,TABLE EventTable WITH INDEX idxOwnerEvent,
3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID,
4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID,
0,1,TABLE FactTypeTable,
1,2,TABLE FamilyTable,
2,0,TABLE EventTable WITH INDEX idxOwnerEvent,
3,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID,
4,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID,
0,1,TABLE FactTypeTable,
1,2,TABLE NameTable WITH INDEX idxNamePrimary,
2,0,TABLE EventTable WITH INDEX idxOwnerEvent,
0,0,TABLE NameTable WITH INDEX idxNamePrimary,
0,1,TABLE roletable,
1,5,TABLE FactTypeTable,
2,2,TABLE EventTable,
3,4,TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID,
4,0,TABLE WitnessTable WITH INDEX idxWitnessEventID,
5,3,TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID,

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed regression after 3.6.17

2010-01-11 Thread ve3meo
Max Vlasov  writes:

> 
> On Mon, Jan 11, 2010 at 2:33 AM, Tom Holden wrote:
> 
> > I have been trying out a number of SQLite managers, one test being the
> > execution time for the same query on the same database on the same 
computer.
> > The scattering of results was very surprising with a spread on the order of
> > 10:1.
> >
> 
> Is the message posted 11 hours ago about the same issue? (
> http://www.mail-archive.com/sqlite-users-CzDROfG0BjIdnm+yROfE0A  
public.gmane.org/msg49650.html)
> Anyway I still suggest the same (see in the thread) - compare VDBE code
> sequences
> 
> Max
Yes, that is the developer of one of the SQLite managers who posted the 
earlier message re "Performance regression...". I am not a developer, merely a 
new user of applications that use SQLite who is delving into the application 
database. I compared the EXPLAIN QUERY PLAN results for SQLite 3.5.4, 3.6.17 
and 3.6.20. The first two are identical; the third is quite different. I don't 
quite know what that means; maybe it indicates that the later version joins 
the tables less efficiently. Oddly enough, by revising the query to explicitly 
use an index, the two later releases are much faster than for the unindexed 
query while the first returns an error near 'INDEXED'. 

Here are the results of the EXPLAIN QUERY PLAN for the original, unindexed 
query:
 
For 3.5.4 and 3.6.17 which executed in ~240s:
"order", "from", "detail"
"0", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary"
"1", "2", "TABLE FamilyTable WITH INDEX idxFamilyFatherID"
"2", "0", "TABLE EventTable WITH INDEX idxOwnerDate"
"3", "1", "TABLE FactTypeTable USING PRIMARY KEY"
"4", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary"
"0", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary"
"1", "2", "TABLE FamilyTable WITH INDEX idxFamilyFatherID"
"2", "0", "TABLE EventTable WITH INDEX idxOwnerDate"
"3", "1", "TABLE FactTypeTable USING PRIMARY KEY"
"4", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary"
"0", "2", "TABLE NameTable WITH INDEX idxNamePrimary"
"1", "0", "TABLE EventTable WITH INDEX idxOwnerDate"
"2", "1", "TABLE FactTypeTable USING PRIMARY KEY"
"0", "0", "TABLE NameTable WITH INDEX idxNamePrimary"
"0", "3", "TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary"
"1", "0", "TABLE WitnessTable WITH INDEX idxWitnessPersonID"
"2", "1", "TABLE roletable USING PRIMARY KEY"
"3", "2", "TABLE EventTable USING PRIMARY KEY"
"4", "5", "TABLE FactTypeTable USING PRIMARY KEY"
"5", "4", "TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID"

And for 3.6.20 which executed the same query in ~2500s:
"order","from","detail"
"0","3","TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary"
"1","4","TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary"
"2","2","TABLE FamilyTable WITH INDEX idxFamilyMotherID"
"3","0","TABLE EventTable WITH INDEX idxOwnerDate"
"4","1","TABLE FactTypeTable USING PRIMARY KEY"
"0","3","TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary"
"1","4","TABLE NameTable AS NameTable2 WITH INDEX idxNamePrimary"
"2","2","TABLE FamilyTable WITH INDEX idxFamilyMotherID"
"3","0","TABLE EventTable WITH INDEX idxOwnerDate"
"4","1","TABLE FactTypeTable USING PRIMARY KEY"
"0","2","TABLE NameTable WITH INDEX idxNamePrimary"
"1","0","TABLE EventTable WITH INDEX idxOwnerDate"
"2","1","TABLE FactTypeTable USING PRIMARY KEY"
"0","0","TABLE NameTable WITH INDEX idxNamePrimary"
"0","3","TABLE NameTable AS NameTable1 WITH INDEX idxNamePrimary"
"1","0","TABLE WitnessTable WITH INDEX idxWitnessPersonID"
"2","1","TABLE roletable USING PRIMARY KEY"
"3","2","TABLE EventTable USING PRIMARY KEY"
"4","5","TABLE FactTypeTable USING PRIMARY KEY"
"5","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID"

Here are the results for the indexed query, identical for 3.6.17 and 3.6.20 
but an error with 3.5.4 so no result to compare, executes in ~2s: 
"order","from","detail"
"0","0","TABLE EventTable"
"1","1","TABLE FactTypeTable USING PRIMARY KEY"
"2","2","TABLE FamilyTable USING PRIMARY KEY"
"3","3","TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID"
"4","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID"
"0","0","TABLE EventTable"
"1","1","TABLE FactTypeTable USING PRIMARY KEY"
"2","2","TABLE FamilyTable USING PRIMARY KEY"
"3","3","TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID"
"4","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID"
"0","2","TABLE NameTable WITH INDEX idxNamePrimary"
"1","0","TABLE EventTable WITH INDEX idxOwnerDate"
"2","1","TABLE FactTypeTable USING PRIMARY KEY"
"0","0","TABLE NameTable WITH INDEX idxNamePrimary"
"0","0","TABLE WitnessTable"
"1","1","TABLE roletable USING PRIMARY KEY"
"2","2","TABLE EventTable USING PRIMARY KEY"
"3","5","TABLE FactTypeTable USING PRIMARY KEY"
"4","3","TABLE NameTable AS NameTable1 WITH INDEX idxNameOwnerID"
"5","4","TABLE NameTable AS NameTable2 WITH INDEX idxNameOwnerID"

The results for these