Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-16 Thread Steinar Midtskogen
To answer my own question, whether there is an efficient way to find
max() of an increasingly sorted column in a virtual array: What is
needed is to make sure that xBestIndex sets orderByConsumed, and that
the module takes care of all sorting.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Steinar Midtskogen
[Gerry Snyder]

> At worst you could use another table to keep track of the maximum and
> minimum, and update it with triggers when something is added to or deleted
> from the virtual table.

My module knows what the maximum and minimum values are at all times.
It also knows that the column is sorted.  The trouble is that it
doesn't know that the values it can produce for that column will be
fed to a max() and min() function.  If it did, it could simply just
return one value.

I might be missing something, though.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Steinar Midtskogen
[Kit]

> 2012/4/15 Steinar Midtskogen :
>> So, is there really a way to create an index in a virtual table, or a
>> way to emulate this?
>
> Why? You don't need this. Use index on base tables.

My base tables are indexed.  Let's say I want to make a very simple
virtual table this way:

 create virtual table vtab using copy(indexed_table);

which simply maps any query for vtab to indexed_table and returns that.
So let's say that indexed_table have an integer column "key" which
also a primary key.  So "select max(key) from indexed_table" will be
fast no matter how big it is and the module can find this value in a
blink.  What I would like to is to have "select max(key) from vtab"
run fast as well, without having to run through the billion rows in
index.

So what happens when I run "select max(key) from vtab"?  Well, all
xFilter will know is that it needs to produce the "key" column, and there
should be a "order by key" clause as well, but even if we can assume
that what we're dealing with is a sorted column, and xFilter could
look up the max in no time, xFilter doesn't know that the query is for
the max value.  Can my module do anything better than to produce all
the rows for sqlite to feed into the max aggregate function?

>> My xRowid function simply returns the value of the "unix_time" column,
>> but even "select max(rowid)" is equally slow.
>> Steinar
>
> Why you need "select max(rowid)"? Something is wrong in your data
> design. Use autoincrement.

I don't need it, but a virtual table must provide one.  I'm not sure
why.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Gerry Snyder
At worst you could use another table to keep track of the maximum and
minimum, and update it with triggers when something is added to or deleted
from the virtual table.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Kit
2012/4/15 Steinar Midtskogen :
> So, is there really a way to create an index in a virtual table, or a
> way to emulate this?

Why? You don't need this. Use index on base tables.

> My xRowid function simply returns the value of the "unix_time" column,
> but even "select max(rowid)" is equally slow.
> Steinar

Why you need "select max(rowid)"? Something is wrong in your data
design. Use autoincrement.
-- 
Kit
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Steinar Midtskogen
[Simon Slavin]

> On 15 Apr 2012, at 1:31pm, Steinar Midtskogen  wrote:
>
>> Another question about max()/min() optimisation.  Is there a way I can
>> implement a virtual table so that max()/min() of a sorted
>> (incrementing) column (which could be an integer primary key in a
>> regular table) gets fast?
>
> The max() and min() functions work instantly if the columns they're
> looking at are indexed.  They simply find the first or last entry in
> the index.  So when defining your virtual table routines, just make
> sure your key columns have an index, and that your xBestIndex method
> finds the right index.

According to the "Using SQLite" book, "you cannot create an index on
a view or on a virtual table".

Also, when declaring the virtual table using sqlite3_declare_vtab the
book says: "any constraints, default values, or key definitions within
the table definition are also ignord - this includes any definition of
INTEGER PRIMARY KEY as a ROWID alias".

So, is there really a way to create an index in a virtual table, or a
way to emulate this?

My xRowid function simply returns the value of the "unix_time" column,
but even "select max(rowid)" is equally slow.
-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Steinar Midtskogen
Hello again,

Another question about max()/min() optimisation.  Is there a way I can
implement a virtual table so that max()/min() of a sorted
(incrementing) column (which could be an integer primary key in a
regular table) gets fast?

For example,

sqlite> explain query plan select max(unix_time) from vtab;
0|0|0|SEARCH TABLE vtab VIRTUAL TABLE INDEX 0: (~1 rows)

Currently, "select max(unix_time) from vtab" causes SQLite to search
through millions of rows, which may take nearly half a minute for my
table, no faster than other non-sorted columns.

I've added special treatment of this sorted "unix_time" column in
xBestIndex, so that a query like:

 select max(unix_time) from vtab where unix_time > strftime("%s", "2012-04-14");

runs fast (i.e. then my table will only look through a few rows at the end).


Perhaps what I'm asking is whether it's possible to add a special
treatment for max() and min() in a virtual table.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Dan Kennedy

On 04/14/2012 03:14 AM, Steinar Midtskogen wrote:

Puneet Kishor  writes:


If you want the results in separate columns, you can do something like

SELECT Min(a) minimum, 'none' maximum FROM t UNION ALL SELECT 'none' minimum, 
Max(a) minimum FROM t;


Then it does a full scan again.

But Igor's suggestion "SELECT (SELECT min(unix_time) FROM table),
(SELECT max(unix_time) FROM table)" works fine, and means less code.


This:

  http://www.sqlite.org/optoverview.html#minmax

Both the subqueries qualify for the optimization, so the overall
query is fast. With the UNION ALL version, the second column in the
result set disqualifies both sides from using the optimization. So
it is slow.

I think if you were to change the UNION ALL version to the following
it would be just as fast as the sub-selects.

  SELECT Min(a) minimum FROM t
UNION ALL
  SELECT Max(a) minimum FROM t;




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Puneet Kishor

On Apr 13, 2012, at 3:14 PM, Steinar Midtskogen wrote:

> Puneet Kishor  writes:
> 
>> If you want the results in separate columns, you can do something like
>> 
>> SELECT Min(a) minimum, 'none' maximum FROM t UNION ALL SELECT 'none' 
>> minimum, Max(a) minimum FROM t;
> 
> Then it does a full scan again.
> 
> But Igor's suggestion "SELECT (SELECT min(unix_time) FROM table),
> (SELECT max(unix_time) FROM table)" works fine, and means less code.
> 


Yes, Igor's suggestion is definitely better, but where is the full table scan?

sqlite> EXPLAIN QUERY PLAN SELECT Min(a) FROM t UNION ALL SELECT Max(a) 
FROM t;
selectid|order|from|detail
1|0|0|SEARCH TABLE t USING INTEGER PRIMARY KEY (~1 rows)
2|0|0|SEARCH TABLE t USING INTEGER PRIMARY KEY (~1 rows)
0|0|0|COMPOUND SUBQUERIES 1 AND 2 (UNION ALL)

Am I missing something?


--
Puneet Kishor

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Steinar Midtskogen
Puneet Kishor  writes:

> If you want the results in separate columns, you can do something like
>
> SELECT Min(a) minimum, 'none' maximum FROM t UNION ALL SELECT 'none' minimum, 
> Max(a) minimum FROM t;

Then it does a full scan again.

But Igor's suggestion "SELECT (SELECT min(unix_time) FROM table),
(SELECT max(unix_time) FROM table)" works fine, and means less code.

Thanks!

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Puneet Kishor
Try the following

sqlite> EXPLAIN QUERY PLAN SELECT Min(a) FROM t UNION ALL SELECT Max(a) FROM t;
selectid|order|from|detail
1|0|0|SEARCH TABLE t USING INTEGER PRIMARY KEY (~1 rows)
2|0|0|SEARCH TABLE t USING INTEGER PRIMARY KEY (~1 rows)
0|0|0|COMPOUND SUBQUERIES 1 AND 2 (UNION ALL)

Should be a lot faster than a single query without UNION.

If you want the results in separate columns, you can do something like

SELECT Min(a) minimum, 'none' maximum FROM t UNION ALL SELECT 'none' minimum, 
Max(a) minimum FROM t;

On Apr 13, 2012, at 2:44 PM, Steinar Midtskogen wrote:

> Alessandro Marzocchi  writes:
> 
>> What does EXPLAIN QUERY PLAN says?
> 
> sqlite> EXPLAIN QUERY PLAN SELECT min(unix_time) FROM table;
> 0|0|0|SEARCH TABLE table USING INTEGER PRIMARY KEY (~1 rows)
> 
> sqlite> EXPLAIN QUERY PLAN SELECT max(unix_time) FROM table;
> 0|0|0|SEARCH TABLE table USING INTEGER PRIMARY KEY (~1 rows)
> 
> sqlite> EXPLAIN QUERY PLAN SELECT min(unix_time), max(unix_time) FROM table;
> 0|0|0|SCAN TABLE table (~100 rows)
> 
> I suppose a query for a single min/max gets optimised, while a query
> involving multiple columns doesn't.
> 
> I have a much bigger table as well, and on that one the speedup is in
> the millions to run two SELECTs.  It's hard to guess that there will
> be such a difference, but I suppose I should be happy that there is at
> least an optimised way to get min and max for the integer primary key.
> 
> -- 
> Steinar
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Igor Tandetnik

On 4/13/2012 2:59 PM, Steinar Midtskogen wrote:

I have a table with "unix_time" as primary key and I want to get the
minimum and maximum values of "unix_time".  When I do:

   SELECT min(unix_time), max(unix_time) from table;

it is very slow.  It takes about 250ms, nearly everything in the
step() call.

However, if I do:

   SELECT min(unix_time) FROM table; SELECT max(unix_time) FROM table;

to get the same values, it takes a fraction of the time.  The speedup
is more than 2000x.


If you want to do it with a single query (say, to minimize disturbance 
to existing code), you could make it


select (SELECT min(unix_time) FROM table), (SELECT max(unix_time) FROM 
table);


I'm pretty sure this will get executed the fast way.
--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Steinar Midtskogen
Alessandro Marzocchi  writes:

> What does EXPLAIN QUERY PLAN says?

sqlite> EXPLAIN QUERY PLAN SELECT min(unix_time) FROM table;
0|0|0|SEARCH TABLE table USING INTEGER PRIMARY KEY (~1 rows)

sqlite> EXPLAIN QUERY PLAN SELECT max(unix_time) FROM table;
0|0|0|SEARCH TABLE table USING INTEGER PRIMARY KEY (~1 rows)

sqlite> EXPLAIN QUERY PLAN SELECT min(unix_time), max(unix_time) FROM table;
0|0|0|SCAN TABLE table (~100 rows)

I suppose a query for a single min/max gets optimised, while a query
involving multiple columns doesn't.

I have a much bigger table as well, and on that one the speedup is in
the millions to run two SELECTs.  It's hard to guess that there will
be such a difference, but I suppose I should be happy that there is at
least an optimised way to get min and max for the integer primary key.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Alessandro Marzocchi
What does EXPLAIN QUERY PLAN says?


Il giorno 13 aprile 2012 21:04, Marc L. Allen
<mlal...@outsitenetworks.com>ha scritto:

> Maybe the query analyzer isn't smart enough to do two seeks in this case,
> so it does a scan?
>
> > -Original Message-
> > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> > boun...@sqlite.org] On Behalf Of Steinar Midtskogen
> > Sent: Friday, April 13, 2012 3:00 PM
> > To: sqlite-users@sqlite.org
> > Subject: [sqlite] Why are two select statements 2000 times faster than
> > one?
> >
> > Hello,
> >
> > I have a table with "unix_time" as primary key and I want to get the
> > minimum and maximum values of "unix_time".  When I do:
> >
> >   SELECT min(unix_time), max(unix_time) from table;
> >
> > it is very slow.  It takes about 250ms, nearly everything in the
> > step() call.
> >
> > However, if I do:
> >
> >   SELECT min(unix_time) FROM table; SELECT max(unix_time) FROM table;
> >
> > to get the same values, it takes a fraction of the time.  The speedup
> > is more than 2000x.
> >
> > Why?
> >
> > --
> > Steinar
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Marc L. Allen
Maybe the query analyzer isn't smart enough to do two seeks in this case, so it 
does a scan?

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Steinar Midtskogen
> Sent: Friday, April 13, 2012 3:00 PM
> To: sqlite-users@sqlite.org
> Subject: [sqlite] Why are two select statements 2000 times faster than
> one?
> 
> Hello,
> 
> I have a table with "unix_time" as primary key and I want to get the
> minimum and maximum values of "unix_time".  When I do:
> 
>   SELECT min(unix_time), max(unix_time) from table;
> 
> it is very slow.  It takes about 250ms, nearly everything in the
> step() call.
> 
> However, if I do:
> 
>   SELECT min(unix_time) FROM table; SELECT max(unix_time) FROM table;
> 
> to get the same values, it takes a fraction of the time.  The speedup
> is more than 2000x.
> 
> Why?
> 
> --
> Steinar
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Steinar Midtskogen
Hello,

I have a table with "unix_time" as primary key and I want to get the
minimum and maximum values of "unix_time".  When I do:

  SELECT min(unix_time), max(unix_time) from table;

it is very slow.  It takes about 250ms, nearly everything in the
step() call.

However, if I do:

  SELECT min(unix_time) FROM table; SELECT max(unix_time) FROM table;

to get the same values, it takes a fraction of the time.  The speedup
is more than 2000x.

Why?

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users