Re: [sqlite] [EXTERNAL] slow join, fast subselect

2019-04-18 Thread Poor Yorick
On Wed, Apr 17, 2019 at 11:43:13AM -0600, Keith Medcalf wrote:
> 
> Your made up plans are intriguing.  The plan you show for the latter query 
> omit to join a and b.  Are you just making things up?

The query plans were cut and pasted from the terminal.  It's easy enough to
deduce where these plans came from:  As someone else pointed out, the ",b" in
the second query shouldn't be there, so I removed it before generating the
query plans.  That step of the query plan is irrelevant anyway.  The point is
that in the subselect variant the query the planner chooses this 


  
  7 0 0 {USING ROWID SEARCH ON TABLE b FOR IN-OPERATOR} 

  

which, given the conditions, is a far better choise than what the planner
chooses in the "join" variant: 


  
  8 0 0 {SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?)}

  

It would be easy enough again for the planner to deduce this, but as Ryan
Smith described, may not be worth doing in the general case.  I don't know.
I'm just reporting in from the field.


-- 
Poor Yorick

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] slow join, fast subselect

2019-04-17 Thread Poor Yorick
On Wed, Apr 17, 2019 at 01:24:11PM +, David Raymond wrote:
> Would you post what those explain query plans results are? All the other 
> replies not withstanding I'm still curious as to why #2 would be faster 
> (assuming "rowid" is indeed the actual rowid anyway)
> 
> Also, is that a typo in #2, if you're not using b, why would you include it 
> in the from clause? Wouldn't that introduce a whole bunch of duplicates? As 
> in a copy of a.rowid for every single record in b? (Maybe my brain just 
> hasn't finished warming up this morning)
> 
> #1
> select a.rowid
> from a join b on a.rowid = b.rowid
> where a.ref = $x 
> 
> #2
> select a.rowid
> from a,b
> where a.ref = $x and a.rowid in (select rowid from b)
> 
> 

3 0 0 {SEARCH TABLE a USING COVERING INDEX idx_ref (ref=?)}
8 0 0 {SEARCH TABLE b USING INTEGER PRIMARY KEY (rowid=?)}

2 0 0 {SEARCH TABLE a USING COVERING INDEX idx_ref (ref=? AND rowid=?)}
7 0 0 {USING ROWID SEARCH ON TABLE b FOR IN-OPERATOR}

-- 
Poor Yorick
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] slow join, fast subselect

2019-04-17 Thread Poor Yorick
On Wed, Apr 17, 2019 at 11:36:18AM +0200, R Smith wrote:
> On 2019/04/17 10:55 AM, Poor Yorick wrote:
[SNIP]
> 
> In your above example you really wish to know all the a's which have an
> entry in b. The first query asks to join and list all b's found for every a
> (which works mathematically in this case by virtue of rowid uniqueness, but
> isn't the real question and forces a lot of "join" algorithm checking on the
> QP), the correct question is the second query: Show every a which can also
> be found in b. It releases the QP of a lot of responsibility and let's it
> follow a plan that is much faster.
> 
> 
> Hope that makes sense :)
> 
> Ryan
> 

That's an apt and accessible description of the issue, but at the denotational
level the meanings of the queries are in fact identical under the conditions
you enumerated.  Ideally sqlite would notice and adjust its query plan
accordingly.  If the cost of doing so doesn't justify the effort, that could be
documented.  As good as the sqlite documentation is, it currently lacks this
sort of higher-level guidance.

-- 
Poor Yorick
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] slow join, fast subselect

2019-04-17 Thread Poor Yorick

On Wed, Apr 17, 2019 at 10:15:31AM +, Hick Gunter wrote:
> Try EXPLAIN QUERY PLAN  or even EXPLAIN  to see what is going on 
> in each case.


I already have, of course.  The question is, how much effort would it be to get
sqlite choose the better query plan in the "join" case as well?

-- 
Poor Yorick
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] slow join, fast subselect

2019-04-17 Thread Poor Yorick
I've used the following two test queries in a version of sqlite built against a
recent checkout of trunk, and also another recent version of sqlite.  a.ref is
indexed.  The subselect query is faster than the join query -- two orders of
magnitude faster on a larger dataset.  Is sqlite missing some easy optimisation
opportunity here?


select a.rowid
from a join b on a.rowid = b.rowid
where a.ref = $x 


select a.rowid
from a,b
where a.ref = $x and a.rowid in (select rowid from b)


-- 
Poor Yorick

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Tcl [db eval varname ...] non-array varname silently fails

2016-08-04 Thread Poor Yorick
If the array-name in a [db eval arrayname ...] command isn't actually an 
array,

sqlite silently fails.  The script is evaluated but the variable doesn't
contain values from the query:


package require sqlite3
sqlite3 db :memory:
db eval {
create table t (f)
; insert into t values ("h") , ("he")
}
set record 2
db eval {select * from t} record {
puts [list row [array get record]]
}

Of course, one reasonable answer is, "Then don't do that!", but it would 
be

less surprising if the underlying error propagated in this case.


--
Poor Yorick


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] union, order by, and a phantom record

2016-02-08 Thread Poor Yorick
On 2016-02-08 19:15, Richard Hipp wrote:
> On 2/8/16, Richard Hipp  wrote:
>> On 2/8/16, Poor Yorick  wrote:
>>> The following query produces a third phantom record on my system:
>> 
>> Running "PRAGMA automatic_index=OFF;" might alleviate the symptoms
>> your are experiencing, until we can get a proper fix published.
>> 
> 
> The bug appears to be quite a bit more serious.  A proposed fix has
> been checked into trunk.  Please try out the trunk to see if that
> works better for you.  We will continue to analyze the problem and
> write tests in the meantime.
> 
> The latest "snapshot" at https://www.sqlite.org/download.html contains
> the proposed fix.

My working code has already morphed into something that isn't triggering 
the
bug, but it's gratifying to see a proposed fix so quickly, as I'm likely 
to hit
it again during this project.

-- 
Yorick



[sqlite] union, order by, and a phantom record

2016-02-08 Thread Poor Yorick
The following query produces a third phantom record on my system:


= start script =
package require sqlite3

sqlite3 [namespace current]::db :memory:

db eval {
 create table if not exists eav (
 id integer primary key autoincrement
 ,entity numeric
 ,attribute text
 ,value
 )
 ; insert into eav values
 (3 ,1 ,'name','Imogen')
 ,(5 ,1 ,'gender'  ,'female')
 ,(6,1 ,'son' ,'Guiderius')
 ,(7,1 ,'son' ,'Arvirargus')
}

set pattern0 1
set report2 name
set report3 gender

puts [db eval {
 select distinct eav.* from eav
 join eav as eav0 on eav.entity == eav0.entity
 and eav0.entity == :pattern0 where eav.attribute == :report2
 union
 select distinct eav.* from eav
 join eav as eav0 on eav.entity == eav0.entity
 and eav0.entity == :pattern0 where eav.attribute == :report3
 order by eav.id
}]
= end script =

The result is:

3 1 name Imogen 5 1 gender female 5 1 1 female

-- 
Poor Yorick