RE: [sqlite] CROSS keyword disables certain join optimizations

2005-09-12 Thread Joe Wilson
--- "Mrs. Brisby" <[EMAIL PROTECTED]> wrote:
> That all said, it seems like this problem is already solved- SQLite does
> the right thing after ANALYZE is called. Perhaps it wouldn't be too
> difficult to update the statistics that ANALYZE collects in after
> COMMIT, or perhaps after a COMMIT that runs longer than a particular
> amount of time.
> 
> This would obsolete the need for a control, and as I've said, controls
> are bad.

I do agree that controls generally suck.
But we live in an imperfect world, and sometimes we
know better than the tools we are using and have to 
intervene and steer them in a better direction.

The auto ANALYZE might work fine on small databases,
but not all of us would want to incur the cost of using 
ANALYZE when it is not required. (I could tell you a related
story about Sun Java engineers who thought mutex locking
all API accesses to collections classes in Java 1.0 was a
brilliant idea. This mis-feature turned out to be brilliantly
stupid when people realized that the locking had to be
done at a higer level - in the application level - in order 
for it to be useful.  All the automatic locking served to 
do was slow down the general use of the class, but provided 
little or no benefit).

The Sqlite user now has the choice to go with the Cadillac 
approach (ANALYZE) - a smooth ride, not always so good on gas, 
but very comfortable. Or the user may choose to drive with 
a manual transmission (CROSS JOIN) and obtain better and more
predictable control of the vehicle. Choice is a good thing.

On a related topic, it seems that Postgres 8.x has added
a pragma "join_collapse_limit" to the CROSS JOIN mix to deal 
with this manual join order thing:

 http://www.postgresql.org/docs/current/interactive/explicit-joins.html

 SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;

 With join_collapse_limit = 1, this forces the planner to 
 join A to B before joining them to other tables...

Even more controls...


__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


Re: [sqlite] CROSS keyword disables certain join optimizations

2005-09-12 Thread Cam Crews
> People who visit a website that looks like junk _NEVER_ say "oh my web
> browser is being a piece of shit."

...unless they're the same people who designed it ;)


RE: [sqlite] CROSS keyword disables certain join optimizations

2005-09-12 Thread Mrs. Brisby
On Mon, 2005-09-12 at 07:15 -0400, Ned Batchelder wrote:
> I'm not sure if we agree or disagree.  You say:
> 
> > I don't think the number of optimizer controls should grow. 
> > It should be kept as close to zero as possible.
> 
> But then you say:
> 
> > If [the optimizer] can make a mistake, I want a simple
> > and _unobtrusive_ way to correct _just_that_mistake_ for
> > _just_that_query_.
> 
> As the optimizer grows, the number of mistakes it could make will grow. You
> seem to want a control for each mistake.  The number of controls will grow.

I don't want a control. A control is a conscious device that I want no
part of. Perhaps swapping the order of entries, or performing some bogus
event, etc. These are things that the SQLite list can mention, that can
always be considered SQLite oddities, but that users shouldn't be
concerned with trying "on their own".

These controls _are_bugs_. Period. Comments and pragmas have this nasty
tendency to stick around longer than necessary.

I realize that if the Pg people have decided _this_particular_ control
is necessary- that is, the bug cannot be fixed, then pragmas are not the
way to go, but instead SQL itself needs to be fixed.

I like, however, to follow the conventions set up by others, so I think
CROSS isn't such a bad idea as a result.


That all said, it seems like this problem is already solved- SQLite does
the right thing after ANALYZE is called. Perhaps it wouldn't be too
difficult to update the statistics that ANALYZE collects in after
COMMIT, or perhaps after a COMMIT that runs longer than a particular
amount of time.

This would obsolete the need for a control, and as I've said, controls
are bad.


> As a few of us have mentioned, other database engines have used inline
> pragmas in the form of specially formatted comments:

Other database engines are wrong so...?

Other database engines support [] for attribute boundaries or backslash
as an escaping character. I suppose these should be used too? After all,
they solve a very specific non-problem at a horrible expense to the
user, so they must be good right?

Note what I am saying: _optimizer_ controls are bad. They make the user
miserable because they lead the user to believe that their SQL is
invalid or incorrect instead of Pg or DB2 or MSSQL or MySQL or SQLite
being incapable.

gcc -O2 makes faster code as long as it's not smaller. Why isn't it
default?

find -noleaf on the converse works around a bug in the kernel. Why not
fix the kernel? -noleaf should do NOTHING on current systems.

Why do I need to intermix -O0 on some non-speed sensitive code because
of gcc bugs, and why should I the programmer bother?

I am much more willing to accept and workaround the failures of gcc,
some unixes, and SQLite than I am to have by own intelligence insulted
by a program I know very little about.

... after all, if I try a new program and it tells me I performed some
kind of syntax invocation error, I tend to believe it. I think _I_ must
have done something wrong. Controls simply reinforce this mind-think.


> select /* sqlite_stupid */ A.parent, A.child from /* sqlite_stupid */
> revision_ancestry ...
> 
> New keywords will also work.  The comments have the advantage that they are
> "portable": you can pass this exact query to any database.  Only SQLite will
> interpret the comments.  If SQLite invents a new STUPID keyword (or
> SQLITE_STUPID), the query will only be parsable by SQLite.

Agreed. That's exactly why I recommend them over code-hiding.

Code-hiding has the distinct advantage of keeping bugs around. People
love the fact that they can hide CSS from MSIE so that they can pretend
to have fully conforming XHTML1.1+CSS web sites that look absolutely
stunning in some latest version of Firefox,

but they don't change the fact that MSIE had a bug in it. People who
visit a website that looks like junk _NEVER_ say "oh my web browser is
being a piece of shit."

Instead, by forcing these bugs to be deliberate and visible, we prevent
people from deploying broken code to broken systems, but STILL allow
people who "have no other option" to proceed with the full knowledge
that a future SQLite release will require they change their code.


> Granted, using pragma comments in queries is unlikely in a system where
> queries are fed to multiple DBMS's, but at least it would be possible.

Which brings me back to PostgreSQL - which was commented as not having a
special pragma for this but instead a sly little hack identical to the
one SQLite is using. People who target PostgreSQL will already have had
the fix.


> -Original Message-
> From: Mrs. Brisby [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, 11 September, 2005 10:07 PM
> To: sqlite-users@sqlite.org
> Subject: RE: [sqlite] CROSS keyword disables certa

RE: [sqlite] CROSS keyword disables certain join optimizations

2005-09-12 Thread Ned Batchelder
I'm not sure if we agree or disagree.  You say:

> I don't think the number of optimizer controls should grow. 
> It should be kept as close to zero as possible.

But then you say:

> If [the optimizer] can make a mistake, I want a simple
> and _unobtrusive_ way to correct _just_that_mistake_ for
> _just_that_query_.

As the optimizer grows, the number of mistakes it could make will grow. You
seem to want a control for each mistake.  The number of controls will grow.

As a few of us have mentioned, other database engines have used inline
pragmas in the form of specially formatted comments:

select /* sqlite_stupid */ A.parent, A.child from /* sqlite_stupid */
revision_ancestry ...

New keywords will also work.  The comments have the advantage that they are
"portable": you can pass this exact query to any database.  Only SQLite will
interpret the comments.  If SQLite invents a new STUPID keyword (or
SQLITE_STUPID), the query will only be parsable by SQLite.

Granted, using pragma comments in queries is unlikely in a system where
queries are fed to multiple DBMS's, but at least it would be possible.

--Ned.
http://nedbatchelder.com
 

-Original Message-
From: Mrs. Brisby [mailto:[EMAIL PROTECTED] 
Sent: Sunday, 11 September, 2005 10:07 PM
To: sqlite-users@sqlite.org
Subject: RE: [sqlite] CROSS keyword disables certain join optimizations

On Sat, 2005-09-10 at 21:38 -0400, Ned Batchelder wrote:
> Rather than overload an existing SQL keyword, would it be possible to
> provide pragmas to control the optimizer?  Assigning meanings to
particular
> combinations of SQL queries won't scale as the number of optimizer
controls
> grows.

I don't think the number of optimizer controls should grow. It should be
kept as close to zero as possible.

The query optimizer isn't (and shouldn't be) designed to help out people
who don't know SQL, but should be for people who know SQL but don't know
SQLite.

When SQLite does this optimization of table reorganization, it's doing
so because it doesn't have a real optimal way to answer the
query-as-stated but if the query is transformed into something seemingly
(and sometimes algebraically) equivalent, then it does so it can.

Consider a mythical SQL engine that doesn't reorder where terms before
searching an index of the form "a,b".

It can answer queries like:
WHERE a='abc' AND b='def';

but give it:
WHERE b='def' AND a='abc';

and it could take forever. Anyone reasonably aware of the various SQL
engines out there would undoubtedly be aware of engines that DO treat
them as identical, but here is one that isn't.

Given that this is indeed a _query_optimization_ do you really want a
control for this?

While I'm sure many SQL engines don't put this nonsense in the block of
code that looks for query optimizations, I want the query optimizer to
do the best job that it can. If it can make a mistake, I want a simple
and _unobtrusive_ way to correct _just_that_mistake_ for
_just_that_query_.

I think anyone close to the various SQL working groups-that-be should
consider a "STUPID" operator that takes the next term (whatever that
might mean to the SQL engine) and stops any query optimization on that
term. The STUPID keyword would be allowed everywhere. It'll improve the
readability of SQL as it is:
SELECT STUPID A.parent, STUPID A.child STUPID FROM STUPID
revision_ancestry STUPID AS STUPID A ...


I don't like CROSS, but it is the closest thing to what I might find
acceptable.

:)




RE: [sqlite] CROSS keyword disables certain join optimizations

2005-09-11 Thread Mrs. Brisby
On Sat, 2005-09-10 at 21:38 -0400, Ned Batchelder wrote:
> Rather than overload an existing SQL keyword, would it be possible to
> provide pragmas to control the optimizer?  Assigning meanings to particular
> combinations of SQL queries won't scale as the number of optimizer controls
> grows.

I don't think the number of optimizer controls should grow. It should be
kept as close to zero as possible.

The query optimizer isn't (and shouldn't be) designed to help out people
who don't know SQL, but should be for people who know SQL but don't know
SQLite.

When SQLite does this optimization of table reorganization, it's doing
so because it doesn't have a real optimal way to answer the
query-as-stated but if the query is transformed into something seemingly
(and sometimes algebraically) equivalent, then it does so it can.

Consider a mythical SQL engine that doesn't reorder where terms before
searching an index of the form "a,b".

It can answer queries like:
WHERE a='abc' AND b='def';

but give it:
WHERE b='def' AND a='abc';

and it could take forever. Anyone reasonably aware of the various SQL
engines out there would undoubtedly be aware of engines that DO treat
them as identical, but here is one that isn't.

Given that this is indeed a _query_optimization_ do you really want a
control for this?

While I'm sure many SQL engines don't put this nonsense in the block of
code that looks for query optimizations, I want the query optimizer to
do the best job that it can. If it can make a mistake, I want a simple
and _unobtrusive_ way to correct _just_that_mistake_ for
_just_that_query_.

I think anyone close to the various SQL working groups-that-be should
consider a "STUPID" operator that takes the next term (whatever that
might mean to the SQL engine) and stops any query optimization on that
term. The STUPID keyword would be allowed everywhere. It'll improve the
readability of SQL as it is:
SELECT STUPID A.parent, STUPID A.child STUPID FROM STUPID
revision_ancestry STUPID AS STUPID A ...


I don't like CROSS, but it is the closest thing to what I might find
acceptable.

:)



Re: [sqlite] CROSS keyword disables certain join optimizations

2005-09-11 Thread Mrs. Brisby
On Sat, 2005-09-10 at 19:25 -0400, D. Richard Hipp wrote:
> Beginning with SQLite 3.2.3, the query optimizer has had
> the ability to reorder tables in the FROM clause if it thinks
> that doing so will make the query run faster.  This has caused
> a few problems for some folks.  To ameliorate those
> difficulties, I have now modified the query optimizer so
> that it will no longer reorder tables in a join if the join
> is constructed using the CROSS keyword.  For additional
> detail see
> 
>   http://www.sqlite.org/cvstrac/tktview?tn=1414
>   http://www.sqlite.org/cvstrac/wiki?p=QueryPlans
> 
> Thoughtful feedback on this design choice will be appreciated.
> I'd like to do a release of version 3.2.6 containing this and
> other changes on or before September 16.
> 
> The rational behind using the CROSS keyword to disable an
> optimization is that the CROSS keyword is perfectly
> valid SQL syntax but nobody ever uses it so I figured 
> we can put it to use to help control the optimizer without
> creating any incompatibilities.

The ticket mentions that postgres uses the same heuristic? Is there some
Pg documentation that demonstrates this?



Re: [sqlite] CROSS keyword disables certain join optimizations

2005-09-11 Thread Joe Wilson
I vote to not control manual table join order with PRAGMAs, as
PRAGMAs are too course-grained to be of any practical use in
complicated queries. For example, what happens if you want to
force a join on just a couple of tables in the FROM clause,
but not others? It is not possible without an a set of very
awkward query-specific PRAGMA commands. (For that matter, I'm
not a fan of Oracle optimization hints in comments either,
although it is a lesser evil.)

I think the CROSS JOIN syntax is an ideal way to manually control
the join order for a number of reasons:

* it's backwards compatible with older versions of Sqlite 
  (i.e., it will not degrade performance prior to 3.2.3
  - the queries would run exactly as they did before)
* the very simple CROSS JOIN syntax - merely replace commas
  in the from clause with " CROSS JOIN "
* CROSS JOIN is ANSI SQL compatible
* CROSS JOIN is almost never used in practise, so its 
  explicit use suggests the intention to manually force a 
  join order.
* Very elegant - just a few lines of code to implement it in 
  Sqlite.
* Postgres already uses this exact same CROSS JOIN heuristic 
  to manually control the join order of tables, so there is 
  a precedent: 

http://www.postgresql.org/docs/7.1/interactive/explicit-joins.html 

SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
 
forces the planner to join A to B before joining them to other tables...


__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


RE: [sqlite] CROSS keyword disables certain join optimizations

2005-09-11 Thread Mark de Vries
On Sat, 10 Sep 2005, Ned Batchelder wrote:
> Rather than overload an existing SQL keyword, would it be possible to
> provide pragmas to control the optimizer?  Assigning meanings to particular
> combinations of SQL queries won't scale as the number of optimizer controls
> grows.

I would like to second this opinion. I think a pragma to tell SQLite
wether or not 'I realy know what I'm dooing' and want help or not is the
preferred method.

I don't like the idea of overloading a syntax to have add-on non-obvious
implications/meaning.

Regards,
Mark.

> For example, some databases use specially-formatted comments within the SQL
> query to control the internals of the system.  It isn't portable, but
> neither is this new meaning of the CROSS keyword.
>
> --Ned.
> http://nedbatchelder.com
>
>
> -Original Message-
> From: Kervin L. Pierre [mailto:[EMAIL PROTECTED]
> Sent: Saturday, 10 September, 2005 9:08 PM
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] CROSS keyword disables certain join optimizations
>
> Darren Duncan wrote:
> > At 7:25 PM -0400 9/10/05, D. Richard Hipp wrote:
> >
> > Well, that's fine as long as CROSS still continues to mean and do what
> > it has always meant, which is that you explicitly want the result set of
>
> If I understand the issue correctly, it does.
>
> "FROM a, b" is usually equivalent to
> "FROM a CROSS JOIN b" in most databases.  With
> the new fix, the first form gives you the
> optimized query, whilst the second form turns
> it off.  But you should get the same results.
>
> ...I think :)
>
> Regards,
> Kervin
>
>
>
>


Regards,
Mark



Re: [sqlite] CROSS keyword disables certain join optimizations

2005-09-10 Thread Andrew Piskorski
On Sat, Sep 10, 2005 at 07:25:48PM -0400, D. Richard Hipp wrote:

> difficulties, I have now modified the query optimizer so
> that it will no longer reorder tables in a join if the join
> is constructed using the CROSS keyword.  For additional

This is a one-off to control one particular feature (table reordering)
of the optimizer.  Is it likely that in the future, perhaps as the
optimizer grows more complex and featurful, that programmers will want
to be able to give further such directives or hints to the optimizer?

If the answer is yes, then maybe it would make more sense to provide
an actual syntax or language for giving SQLite such hints, probably by
embedding them into specially formatted SQL comments (which is
Oracle's approach).

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/


RE: [sqlite] CROSS keyword disables certain join optimizations

2005-09-10 Thread Ned Batchelder
Rather than overload an existing SQL keyword, would it be possible to
provide pragmas to control the optimizer?  Assigning meanings to particular
combinations of SQL queries won't scale as the number of optimizer controls
grows.

For example, some databases use specially-formatted comments within the SQL
query to control the internals of the system.  It isn't portable, but
neither is this new meaning of the CROSS keyword. 

--Ned.
http://nedbatchelder.com
 

-Original Message-
From: Kervin L. Pierre [mailto:[EMAIL PROTECTED] 
Sent: Saturday, 10 September, 2005 9:08 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] CROSS keyword disables certain join optimizations

Darren Duncan wrote:
> At 7:25 PM -0400 9/10/05, D. Richard Hipp wrote:
> 
> Well, that's fine as long as CROSS still continues to mean and do what 
> it has always meant, which is that you explicitly want the result set of 

If I understand the issue correctly, it does.

"FROM a, b" is usually equivalent to
"FROM a CROSS JOIN b" in most databases.  With
the new fix, the first form gives you the
optimized query, whilst the second form turns
it off.  But you should get the same results.

...I think :)

Regards,
Kervin





Re: [sqlite] CROSS keyword disables certain join optimizations

2005-09-10 Thread Kervin L. Pierre

Darren Duncan wrote:

At 7:25 PM -0400 9/10/05, D. Richard Hipp wrote:

Well, that's fine as long as CROSS still continues to mean and do what 
it has always meant, which is that you explicitly want the result set of 


If I understand the issue correctly, it does.

"FROM a, b" is usually equivalent to
"FROM a CROSS JOIN b" in most databases.  With
the new fix, the first form gives you the
optimized query, whilst the second form turns
it off.  But you should get the same results.

...I think :)

Regards,
Kervin




Re: [sqlite] CROSS keyword disables certain join optimizations

2005-09-10 Thread Darren Duncan

At 7:25 PM -0400 9/10/05, D. Richard Hipp wrote:

The rational behind using the CROSS keyword to disable an
optimization is that the CROSS keyword is perfectly
valid SQL syntax but nobody ever uses it so I figured
we can put it to use to help control the optimizer without
creating any incompatibilities.


Well, that's fine as long as CROSS still continues to mean and do 
what it has always meant, which is that you explicitly want the 
result set of "SELECT * FROM foo CROSS JOIN bar" to be every foo row 
crossed with every bar row.  This actually is used in real 
situations, even though it is less common than an INNER or OUTER 
join.  See SQL:2003, 7.7 "". -- Darren Duncan