[jira] Commented: (DERBY-3926) Incorrect ORDER BY caused by index

Mamta A. Satoor (JIRA) Mon, 13 Apr 2009 12:32:17 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698508#action_12698508
 ]


Mamta A. Satoor commented on DERBY-3926:
----------------------------------------

I have done more debugging on the simpler query with optimizer overrides as 
shown below
SELECT table1.id, m0.value, m1.value FROM  --DERBY-PROPERTIES joinOrder=FIXED 
table2  m1 -- DERBY-PROPERTIES index=key3
,  table2 m0 -- DERBY-PROPERTIES index=key3
, table1 
WHERE table1.id=m0.id AND m0.name='PageSequenceId' AND table1.id=m1.id AND 
m1.name='PostComponentId' AND m1.value='21857' ORDER BY m0.value;

For the query above, when the optimizer will start considering the join orders, 
it is going to assume table2 m1 is 0, table2 m0 is 1 and table1 is 3. So, when 
I say in my analysis below, that optimizer is working with say join order 
0,1,-1, what I mean is optimizer is considering the join order 
m1,m0,-1. -1 means we are not taking into consideration any table for that 
position at this point. With that in mind, let me share what is happening and 
what part of it I don't understand.

For the query above, optimizer first considers 0,-1,-1 and in 
OptimizerImpl.costBasedCostOptimizable line 2248 which is shown as below
if (requiredRowOrdering.sortRequired(currentRowOrdering,assignedTableMap) == 
RequiredRowOrdering.NOTHING_REQUIRED)
The 2 params to the call above are set as follows
At this point, currentRowOrdering       RowOrderingImpl  (id=62)        
        alwaysOrderedOptimizables       Vector<E>  (id=92)      
                elementCount    0       
        columnsAlwaysOrdered    ColumnOrdering  (id=94) 
                columns Vector<E>  (id=158)     
                myDirection     3       
                tables  Vector<E>  (id=159)     
        currentColumnOrdering   null    
        ordering        Vector<E>  (id=96)      
                elementCount    0       
        unorderedOptimizables   Vector<E>  (id=97)      
                elementCount    0       
(ColumnsAlwaysOrdered in currentRowOrdering is set to Direction: 3 Table 0, 
Column 3 Table 0, Column 2)
The 2nd param assignedTableMap just has {0} because we are only considering 0 
in the join order.

We return for the call above with true and that qualifies us to consider sort 
avoidance for m1. One of my question is should m1 get quailified to have sort 
avoidance set to true when we are not really ordering on any column that comes 
from m1?

Moving on, once we finish with 0,-1,-1 where we decided that sorting can be 
avoided for m1, we move to the join order 0,1,-1. Again, we come to the code 
mentioned above for 0,1,-1
if (requiredRowOrdering.sortRequired(currentRowOrdering,assignedTableMap) == 
RequiredRowOrdering.NOTHING_REQUIRED)
If the earlier decision for considering sort avoidance true for m1 was 
correct(for join order 0,-1,-1), we should definitely return sort avoidance 
false for join order 0,1,-1 because with m1 as the outermost table, we can't 
avoid sorting on m0. The 2 params at this point in code are as follows for the 
call above
The first param is currentRowOrdering   RowOrderingImpl  (id=62)        
        alwaysOrderedOptimizables       Vector<E>  (id=99)      
                elementCount    0       
        columnsAlwaysOrdered    ColumnOrdering  (id=156)        
                columns Vector<E>  (id=245)     
                myDirection     3       
                tables  Vector<E>  (id=246)     
        currentColumnOrdering   ColumnOrdering  (id=157)        
                columns Vector<E>  (id=252)     
                myDirection     1       
                tables  Vector<E>  (id=255)     
        ordering        Vector<E>  (id=103)     
                elementCount    1       
                elementData     Object[10]  (id=259)    
                modCount        3       
        unorderedOptimizables   Vector<E>  (id=104)     
                elementCount    0       
(ColumnsAlwaysOrdered in currentRowOrdering is set to Direction: 3 Table 0, 
Column 3 Table 0, Column 2 Table 1, Column 2)
(currentColumnOrdering in currentRowOrdering is set to Direction: 1 Table 1, 
Column 3)
(ordering in currentRowOrdering is set to [Direction: 1 Table 1, Column 3])
The 2nd param assignedTableMap just has {0,1} because we are only considering 0 
and 1 in the join order.

Going further deep in 0,1,-1 join order, the call above leads to following line 
in impl.sql.compile.OrderByList.sortRequired(RowOrdering, JBitSet) line: 535    
if ( ! rowOrdering.alwaysOrdered(cr.getTableNumber())) 

In this if block, we say that sorting is not required. In my mind, we should 
required sorting because the outermost table m1 is not going to be able to 
satisfy the ordering requirement for m0.value and hence we can't consider sort 
avoidance when m1 is outermost. But of course, code is deciding that sort 
avoidance is ok in this case. I will highly appreciate if someone can help me 
understand this piece of code which ends up making the decision for sorting 
required or not required. I think the code is mostly centered in 
org.apache.derby.impl.sql.compile.RowOrderingImple (protocol class 
org.apache.derby.iapi.sql.compile.RowOrdering). There is not much of javadoc at 
the class level which is making it hard me to grasp what this class is supposed 
to do and what are the different memebers in it. I am going to debug further on 
my own to try to understand it but any help from community will be appreciated.

> Incorrect ORDER BY caused by index
> ----------------------------------
>
>                 Key: DERBY-3926
>                 URL: https://issues.apache.org/jira/browse/DERBY-3926
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.1.3.3, 10.2.3.0, 10.3.3.1, 10.4.2.0
>            Reporter: Tars Joris
>         Attachments: derby-reproduce.zip
>
>
> I think I found a bug in Derby that is triggered by an index on a large 
> column: VARCHAR(1024). I know it  is generally not a good idea to have an 
> index on such a large column.
> I have a table (table2) with a column "value", my query orders on this column 
> but the result is not sorted. It is sorted if I remove the index on that 
> column.
> The output of the attached script is as follows (results should be ordered on 
> the middle column):
> ID                  |VALUE        |VALUE
> ----------------------------------------------
> 2147483653          |000002       |21857
> 2147483654          |000003       |21857
> 4294967297          |000001       |21857
> While I would expect:
> ID                  |VALUE        |VALUE
> ----------------------------------------------
> 4294967297          |000001       |21857
> 2147483653          |000002       |21857
> 2147483654          |000003       |21857
> This is the definition:
> CREATE TABLE table1 (id BIGINT NOT NULL, PRIMARY KEY(id));
> CREATE INDEX key1 ON table1(id);
> CREATE TABLE table2 (id BIGINT NOT NULL, name VARCHAR(40) NOT NULL, value 
> VARCHAR(1024), PRIMARY KEY(id, name));
> CREATE UNIQUE INDEX key2 ON table2(id, name);
> CREATE INDEX key3 ON table2(value);
> This is the query:
> SELECT table1.id, m0.value, m1.value
> FROM table1, table2 m0, table2 m1
> WHERE table1.id=m0.id
> AND m0.name='PageSequenceId'
> AND table1.id=m1.id
> AND m1.name='PostComponentId'
> AND m1.value='21857'
> ORDER BY m0.value;
> The bug can be reproduced by just executing the attached script with the 
> ij-tool.
> Note that the result of the query becomes correct when enough data is 
> changed. This prevented me from creating a smaller example.
> See the attached file "derby-reproduce.zip" for sysinfo, derby.log and 
> script.sql.
> Michael Segel pointed out:
> "It looks like its hitting the index ordering on id,name from table 2 and is 
> ignoring the order by clause."

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DERBY-3926) Incorrect ORDER BY caused by index

Reply via email to