[ 
https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Knut Anders Hatlen updated DERBY-4416:
--------------------------------------

    Attachment: d4416-1a.stat
                d4416-1a.diff

Here's an updated patch with the visitor renamed to ConstantExpressionVisitor, 
the method in ValueNode renamed to evaluateConstantExpression(), and with 
updated comments.

Description of the other changes from the previous patch:

1) The invocation of the visitor is moved to 
DMLStatementNode.optimizeStatement(). The previous patch called it from 
SelectNode and JoinNode on WHERE/HAVING/ON clauses. As mentioned by Bryan, this 
transformation can be useful in other parts of the tree as well (as seen in 
VALUES 1+3, although evaluation of arithmetic operations hasn't implemented). I 
think it's cleaner to have the invocation at one single location. Also, since 
there might be sub-queries inside WHERE clauses, the old patch might traverse 
some parts of the tree many times.

The visitor now walks the entire query tree between the preprocessing phase and 
the optimization phase. Doing it as late as possible is advantageous because 
then other simplifications (like the NOT elimination performed in preprocess()) 
increase the chances of finding expressions with a know value, but it should be 
performed before optimization so that the optimizer can take advantage of the 
more accurate selectivity predictions.

2) JoinNode: Made acceptChildren() call accept() on joinPredicates. The 
previous patch visited the JoinNode a little earlier, when the ON clause was 
still represented by joinClause, whereas this patch visits the JoinNode after 
the predicates have been moved to joinPredicates. There are also other children 
of JoinNode that are not visited, but I didn't do anything with them for now.

3) BinaryRelationalOperatorNode: Added rewriting for less-than and less-equals, 
which were forgotten in the previous patch.

4) logop.out: Accept that a statement that used to fail with arithmetic 
overflow now succeeds because the arithmetic isn't performed at runtime. The 
statement that fails is
> select x from s where 2147483647 + 10 = 2 and (1=2);
whereas this almost identical one is expected to succeed
> select x from s where (1=2) and 2147483647 + 10 = 2;
I'm not aware of anything in the SQL standard that requires us to evaluate the 
arithmetic expression first, so I think it's equally OK to pass the first 
statement as it is to pass the second statement. The comments in the test also 
indicate this.

4) specjPlans.out: Removed unneeded ProjectRestrictResultSet from two of the 
query plans (they were there to enforce the restriction 1=1).

5) outerjoin.out: Updated query plan from nested loop join to hash join in a 
join with 1=1 in the ON clause. New plan looks OK and the results are still 
correct. The plan probably changed because 1=1 and TRUE have different 
(estimated) selectivity.

I haven't run all the regression tests on the latest revision of the patch. 
Will report back when I have the results.

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: d4416-1a.diff, d4416-1a.stat, replaceExpressions.diff
>
>
> In the lack of the boolean data type, Derby forces you to use expressions 
> like 1=1 and 1<>1 to express true and false. Generated SQL statements also 
> tend to use such expressions, and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, 
> ProjectRestrictNode and PredicateList are able to eliminate predicates, and 
> in some cases the entire ProjectRestrictNode, if the predicate contains 
> constant true or false values. However, during parsing and compilation, 
> expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit 
> from the boolean optimization code. This leads to more complex, and possibly 
> less efficient, byte code being generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 
> (true), since they will always match no rows when false and all rows when 
> true. The expression 1=1 does however get it's selectivity from the = 
> operator, which means that it'll be 0.1. The same selectivity is assigned to 
> 1=0. Other operators have different selectivity, so 2<3 has the selectivity 
> 0.33, even though the actual selectivity of the expression is the same as 1=1 
> and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you 
> change 2<3 to 1=1 in a WHERE clause. See 
> http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%[email protected]%3e
>  for an example of that.
> If we could go through the query tree and replace occurrences of comparisons 
> between constant values with a boolean constant at bind time, such queries 
> would end up with simpler byte code, and the selectivity passed to the 
> optimizer would be more accurate, possibly resulting in a better plan being 
> chosen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to