Re: [sqlite] Why the high cost of a double sort

2017-08-20 Thread Clemens Ladisch
Simon Slavin wrote:
> On 19 Aug 2017, at 10:48pm, Cecil Westerhof wrote:
>> I was also told that you never should put a sort on a view. Is that true,
>> or a bit to strong?
>
> Generally, you put the ORDER BY on the SELECT you’re using the consult
> the VIEW.  Technically speaking a VIEW is just a set of records and the
> ORDER BY should be a last-minute thing just before presentation of the
> results.

The SQL standard and  say:
| If a SELECT statement that returns more than one row does not have an
| ORDER BY clause, the order in which the rows are returned is undefined.

ORDER BY clauses in any subqueries/views/CTEs are not guaranteed to keep
the order in the outermost query.  It's possible for the database to
optimize the query by using some index or cache, and that can result in
a different order.

> But it can be useful for someone submitting commands manually to create
> a VIEW with ORDER BY.  For instance, using the command-line shell to
> investigate weird results, or for those cases where you’re using SQLite
> like a spreadsheet to arrive at a one-off result.  So I think an outright
> ban is a little strong.

In practice, when the outer query is implemented by simply scanning over
the inner query's rows, the order will not change.  But there is no
guarantee that the implementation will always stay the same.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why the high cost of a double sort

2017-08-19 Thread Simon Slavin


On 19 Aug 2017, at 10:48pm, Cecil Westerhof  wrote:

> I was also told that you never should put a sort on a view. Is that true,
> or a bit to strong?

Generally, you put the ORDER BY on the SELECT you’re using the consult the 
VIEW.  Technically speaking a VIEW is just a set of records and the ORDER BY 
should be a last-minute thing just before presentation of the results.

But it can be useful for someone submitting commands manually to create a VIEW 
with ORDER BY.  For instance, using the command-line shell to investigate weird 
results, or for those cases where you’re using SQLite like a spreadsheet to 
arrive at a one-off result.  So I think an outright ban is a little strong.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why the high cost of a double sort

2017-08-19 Thread Cecil Westerhof
2017-08-19 22:03 GMT+02:00 Keith Medcalf :

>
> When you run the second query, the optimizer does not know that the data
> returned from the view is already sorted into the "correct" order and the
> inclusion of the "order by" in the view prevents the query flattener from
> optimizing out (re-writing) the query into a single query (eliminating the
> view).  Without flattening, the view is simply a "black box" that returns
> some rows in an arbitrary (completely unsorted) order.
>
> If you remove the "order by" from the view, then the query flattener will
> be able to flatten the query and re-write and execute as if you had entered:
>
>   SELECT used
> FROM proverbs
>WHERE CAST(used AS INT) <> 0
> ORDER BY used
>LIMIT 1
>

​Thanks. I learned something.

By the way, I should have used:
SELECT MIN(used)
FROM   usedProverbs

-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why the high cost of a double sort

2017-08-19 Thread Cecil Westerhof
2017-08-19 20:37 GMT+02:00 Clemens Ladisch :

> Cecil Westerhof wrote:
> > I have the following query:
> > SELECT used
> > FROM   usedProverbs
> > LIMIT  1
> >
> > The view useProverbs is defined as:
> > CREATE VIEW usedProverbs AS
> > SELECT   *
> > FROM proverbs
> > WHERECAST(used AS INT) <> 0
> > ORDER BY used ASC
> >
> > But I am told this is not clear and that I should use:
> > SELECT   used
> > FROM usedProverbs
> > ORDER BY used
> > LIMIT1
> >
> > But when I use those in DBBrowser, the first takes almost always 0 ms,
> > while the second takes between 13 and 16 ms. Why does the second one take
> > so much more time?
>
> Because only the first sort can be optimized away with an index
> (which you did not mention).
>

​I see that there is an index on used yes:
CREATE INDEX `proverbs_used_idx` ON `proverbs` (`used` )

That explains it. It has been some time I defined the table and forgot it.
​



> If you believe what you're told (that ORDER BY must be in the
> outermost query), then remove the ORDER BY from the view.
>

​I did and that worked.

I was also told that you never should put a sort on a view. Is that true,
or a bit to strong?
​

-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why the high cost of a double sort

2017-08-19 Thread Keith Medcalf

When you run the second query, the optimizer does not know that the data 
returned from the view is already sorted into the "correct" order and the 
inclusion of the "order by" in the view prevents the query flattener from 
optimizing out (re-writing) the query into a single query (eliminating the 
view).  Without flattening, the view is simply a "black box" that returns some 
rows in an arbitrary (completely unsorted) order.

If you remove the "order by" from the view, then the query flattener will be 
able to flatten the query and re-write and execute as if you had entered:

  SELECT used
FROM proverbs
   WHERE CAST(used AS INT) <> 0
ORDER BY used
   LIMIT 1

This of course depends on the version of SQLite because the capabilities of 
flattening views varied a bit from version to version.

See section 10 of https://sqlite.org/optoverview.html
for a long list of things that affect the query flattening/re-write capability.

Your query with a view is equivalent to 

SELECT used FROM (SELECT * FROM proverbs WHERE CAST(used AS INT) <> 0 ORDER BY 
used ASC) ORDER BY used LIMIT 1

As you can see from the list, the subquery ORDER BY precludes flattening the 
query.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Cecil Westerhof
>Sent: Saturday, 19 August, 2017 05:37
>To: SQLite mailing list
>Subject: [sqlite] Why the high cost of a double sort
>
>I have the following query:
>SELECT used
>FROM   usedProverbs
>LIMIT  1
>
>The view useProverbs is defined as:
>CREATE VIEW usedProverbs AS
>SELECT   *
>FROM proverbs
>WHERECAST(used AS INT) <> 0
>ORDER BY used ASC
>
>But I am told this is not clear and that I should use:
>SELECT   used
>FROM usedProverbs
>ORDER BY used
>LIMIT1
>
>But when I use those in DBBrowser, the first takes almost always 0
>ms,
>while the second takes between 13 and 16 ms. Why does the second one
>take
>so much more time?
>
>The definition of proverbs:
>CREATE TABLE proverbs (
>proverbID   INTEGER PRIMARY KEY AUTOINCREMENT,
>proverb TEXTNOT NULL UNIQUE,
>usedTEXTDEFAULT 'notUsed'
>);
>
>​To be sure I also executed it in sqlite3 after .timer on. Here the
>first
>one almost always gives​:
>Run Time: real 0.000 user 0.00 sys 0.00
>and sometimes:
>Run Time: real 0.001 user 0.00 sys 0.00
>
>The second one gives mostly:
>Run Time: real 0.006 user 0.004000 sys 0.00
>but sometimes it get as low as:
>Run Time: real 0.003 user 0.004000 sys 0.00
>
>​Both are using 3.16.2​. Would 3.20.0 make a difference? Then I could
>install that one and see the results from that. (The JDBC uses
>3.20.2.)
>
>--
>Cecil Westerhof
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why the high cost of a double sort

2017-08-19 Thread Clemens Ladisch
Cecil Westerhof wrote:
> I have the following query:
> SELECT used
> FROM   usedProverbs
> LIMIT  1
>
> The view useProverbs is defined as:
> CREATE VIEW usedProverbs AS
> SELECT   *
> FROM proverbs
> WHERECAST(used AS INT) <> 0
> ORDER BY used ASC
>
> But I am told this is not clear and that I should use:
> SELECT   used
> FROM usedProverbs
> ORDER BY used
> LIMIT1
>
> But when I use those in DBBrowser, the first takes almost always 0 ms,
> while the second takes between 13 and 16 ms. Why does the second one take
> so much more time?

Because only the first sort can be optimized away with an index
(which you did not mention).


If you believe what you're told (that ORDER BY must be in the
outermost query), then remove the ORDER BY from the view.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Why the high cost of a double sort

2017-08-19 Thread Cecil Westerhof
I have the following query:
SELECT used
FROM   usedProverbs
LIMIT  1

The view useProverbs is defined as:
CREATE VIEW usedProverbs AS
SELECT   *
FROM proverbs
WHERECAST(used AS INT) <> 0
ORDER BY used ASC

But I am told this is not clear and that I should use:
SELECT   used
FROM usedProverbs
ORDER BY used
LIMIT1

But when I use those in DBBrowser, the first takes almost always 0 ms,
while the second takes between 13 and 16 ms. Why does the second one take
so much more time?

The definition of proverbs:
CREATE TABLE proverbs (
proverbID   INTEGER PRIMARY KEY AUTOINCREMENT,
proverb TEXTNOT NULL UNIQUE,
usedTEXTDEFAULT 'notUsed'
);

​To be sure I also executed it in sqlite3 after .timer on. Here the first
one almost always gives​:
Run Time: real 0.000 user 0.00 sys 0.00
and sometimes:
Run Time: real 0.001 user 0.00 sys 0.00

The second one gives mostly:
Run Time: real 0.006 user 0.004000 sys 0.00
but sometimes it get as low as:
Run Time: real 0.003 user 0.004000 sys 0.00

​Both are using 3.16.2​. Would 3.20.0 make a difference? Then I could
install that one and see the results from that. (The JDBC uses 3.20.2.)

-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users