[jira] Updated: (DERBY-3926) Incorrect ORDER BY caused by index

Mamta A. Satoor (JIRA) Mon, 11 May 2009 10:26:13 -0700

     [ 
https://issues.apache.org/jira/browse/DERBY-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mamta A. Satoor updated DERBY-3926:
-----------------------------------

    Attachment: DERBY3926_notforcheckin_patch1_051109_stat.txt
                DERBY3926_notforcheckin_patch1_051109_diff.txt

I have attached a patch (not intended for checkin), 
DERBY3926_notforcheckin_patch1_051109_diff.txt, based on the pseudocode that I 
posted last week. It fixes the problem query in question, but when I run the 
wisconsin test, I see that we are now adding sort nodes on top of a few queries. 
I am trying to understand, case by case, whether the additional sort node makes 
sense. The first case I am looking at seems like it should not get a sort node, 
yet the patch chooses to add one. The query for that case is as follows:
        select * from TENKTUP1, TENKTUP2
         where TENKTUP1.unique1 = TENKTUP2.unique1
         order by TENKTUP1.unique1;

The query plan for the above query shows TENKTUP1 to be the outermost table in 
the join order, so I am not sure why we need the sort node on top. I will look 
further into it. If anyone has time to look at the patch, I would greatly 
appreciate it. The new code has only minimal comments. I will work on adding 
some comments and repost the patch to make it easier to read, but the code 
should correspond fairly directly to the pseudocode posted last week.
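
As a rough illustration of the reasoning about the outermost table (this is not 
Derby code; the function and variable names below are made up for the example): 
a nested loop join emits rows in the order of its outer input, so the result is 
already ordered on the join column only if the outer table is itself read in 
that order, for example through an index scan rather than a heap scan.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def nested_loop_join(
    outer_keys: Iterable[int], inner_by_key: Dict[int, str]
) -> Iterator[Tuple[int, str]]:
    """Emit one joined row per matching outer key, in outer-input order."""
    for key in outer_keys:
        if key in inner_by_key:  # stand-in for an index probe on the inner table
            yield key, inner_by_key[key]


def is_sorted_on_key(rows: List[Tuple[int, str]]) -> bool:
    """Return True if the rows are in ascending order of the join key."""
    keys = [key for key, _ in rows]
    return keys == sorted(keys)


# Hypothetical data: the same outer rows read in two different orders.
inner = {k: f"row{k}" for k in range(5)}
heap_order = [3, 0, 4, 1, 2]       # arbitrary physical order (heap scan)
index_order = sorted(heap_order)   # order delivered by an index on the key

via_index = list(nested_loop_join(index_order, inner))
via_heap = list(nested_loop_join(heap_order, inner))

print(is_sorted_on_key(via_index))  # no extra sort needed in this case
print(is_sorted_on_key(via_heap))   # a sort on top would be required here
```

Under these assumptions, whether a sort node can be skipped depends on how the 
outermost table is accessed, not only on which table is outermost.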

The old query plan for the query above, from the wisconsin test, is as follows:
ij> values SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
1
--------------------------------------------------------------------------------
Statement Name: 
        C
Statement Text: 
        select * from TENKTUP1, TENKTUP2
         where TENKTUP1.unique1 = TENKTUP2.unique1
         order by TENKTUP1.unique1
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text: 
Nested Loop Exists Join ResultSet:
<filtered number of opens>
<filtered rows seen from the left>
<filtered rows seen from the right>
Rows filtered = 0
<filtered rows returned>
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
Left result set:
        Index Row to Base Row ResultSet for TENKTUP1:
        <filtered number of opens>
        <filtered rows seen>
        Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                Index Scan ResultSet for TENKTUP1 using index TK1UNIQUE1 at serializable isolation level using share table locking chosen by the optimizer
                <filtered number of opens>
                <filtered rows seen>
                Rows filtered = 0
                Fetch Size = 1
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                scan information: 
                        Bit set of columns fetched={1}
                        Number of columns fetched=1
                        Number of deleted rows visited=0
                        <filtered number of pages visited>
                        <filtered number of rows qualified>
                        <filtered number of rows visited>
                        Scan type=btree
                        Tree height=2
                        start position:         None
                        stop position:  None
                        qualifiers:None
Right result set:
        Index Row to Base Row ResultSet for TENKTUP2:
        <filtered number of opens>
        <filtered rows seen>
        Columns accessed from heap = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                Index Scan ResultSet for TENKTUP2 using index TK2UNIQUE1 at serializable isolation level using share row locking chosen by the optimizer
                <filtered number of opens>
                <filtered rows seen>
                Rows filtered = 0
                Fetch Size = 1
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                scan information: 
                        Bit set of columns fetched=All
                        Number of columns fetched=2
                        Number of deleted rows visited=0
                        <filtered number of pages visited>
                        <filtered number of rows qualified>
                        <filtered number of rows visited>
                        Scan type=btree
                        Tree height=2
                        start position: 
        >= on first 1 column(s).
        Ordered null semantics on the following columns: 0 
                        stop position: 
        > on first 1 column(s).
        Ordered null semantics on the following columns: 0 
                        qualifiers:None



The new query plan after my changes is as follows:
ij> values SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
1
--------------------------------------------------------------------------------
Statement Name:
        null
Statement Text:
        select * from TENKTUP1, TENKTUP2
         where TENKTUP1.unique1 = TENKTUP2.unique1
         order by TENKTUP1.unique1

Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text:
Sort ResultSet:
Number of opens = 1
Rows input = 10000
Rows returned = 10000
Eliminate duplicates = false
In sorted order = false
Sort information:
        Number of merge runs=3
        Number of rows input=10000
        Number of rows output=10000
        Size of merge runs=[3215, 3215, 3215]
        Sort type=external
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        optimizer estimated row count:        10005.00
        optimizer estimated cost:        73930.40

Source result set:
        Nested Loop Exists Join ResultSet:
        Number of opens = 1
        Rows seen from the left = 10000
        Rows seen from the right = 10000
        Rows filtered = 0
        Rows returned = 10000
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                optimizer estimated row count:        10005.00
                optimizer estimated cost:        73930.40

        Left result set:
                Table Scan ResultSet for TENKTUP1 at read committed isolation level using instantaneous share row locking chosen by the optimizer
                Number of opens = 1
                Rows seen = 10000
                Rows filtered = 0
                Fetch Size = 16
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                        next time in milliseconds/row = 0

                scan information:
                        Bit set of columns fetched=All
                        Number of columns fetched=16
                        Number of pages visited=771
                        Number of rows qualified=10000
                        Number of rows visited=10000
                        Scan type=heap
                        start position:null                    
                        stop position:null                    
                        qualifiers:None
                        optimizer estimated row count:        10005.00
                        optimizer estimated cost:        14870.88

        Right result set:
                Index Row to Base Row ResultSet for TENKTUP2:
                Number of opens = 10000
                Rows seen = 10000
                Columns accessed from heap = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                        optimizer estimated row count:        10005.00
                        optimizer estimated cost:        59059.52

                        Index Scan ResultSet for TENKTUP2 using index TK2UNIQUE1 at read committed isolation level using share row locking chosen by the optimizer
                        Number of opens = 10000
                        Rows seen = 10000
                        Rows filtered = 0
                        Fetch Size = 1
                                constructor time (milliseconds) = 0
                                open time (milliseconds) = 0
                                next time (milliseconds) = 0
                                close time (milliseconds) = 0
                                next time in milliseconds/row = 0

                        scan information:
                                Bit set of columns fetched=All
                                Number of columns fetched=2
                                Number of deleted rows visited=0
                                Number of pages visited=20000
                                Number of rows qualified=10000
                                Number of rows visited=10000
                                Scan type=btree
                                Tree height=2
                                start position:
        >= on first 1 column(s).
        Ordered null semantics on the following columns:0
                                stop position:
        > on first 1 column(s).
        Ordered null semantics on the following columns:0
                                qualifiers:None
                                optimizer estimated row count:        10005.00
                                optimizer estimated cost:        59059.52


> Incorrect ORDER BY caused by index
> ----------------------------------
>
>                 Key: DERBY-3926
>                 URL: https://issues.apache.org/jira/browse/DERBY-3926
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.1.3.3, 10.2.3.0, 10.3.3.1, 10.4.2.0
>            Reporter: Tars Joris
>            Assignee: Mamta A. Satoor
>         Attachments: d3926_repro.sql, derby-reproduce.zip, 
> DERBY3926_notforcheckin_patch1_051109_diff.txt, 
> DERBY3926_notforcheckin_patch1_051109_stat.txt, script3.sql, 
> script3WithUserFriendlyIndexNames.sql, test-script.zip
>
>
> I think I found a bug in Derby that is triggered by an index on a large 
> column: VARCHAR(1024). I know it is generally not a good idea to have an 
> index on such a large column.
> I have a table (table2) with a column "value", my query orders on this column 
> but the result is not sorted. It is sorted if I remove the index on that 
> column.
> The output of the attached script is as follows (results should be ordered on 
> the middle column):
> ID                  |VALUE        |VALUE
> ----------------------------------------------
> 2147483653          |000002       |21857
> 2147483654          |000003       |21857
> 4294967297          |000001       |21857
> While I would expect:
> ID                  |VALUE        |VALUE
> ----------------------------------------------
> 4294967297          |000001       |21857
> 2147483653          |000002       |21857
> 2147483654          |000003       |21857
> This is the definition:
> CREATE TABLE table1 (id BIGINT NOT NULL, PRIMARY KEY(id));
> CREATE INDEX key1 ON table1(id);
> CREATE TABLE table2 (id BIGINT NOT NULL, name VARCHAR(40) NOT NULL, value 
> VARCHAR(1024), PRIMARY KEY(id, name));
> CREATE UNIQUE INDEX key2 ON table2(id, name);
> CREATE INDEX key3 ON table2(value);
> This is the query:
> SELECT table1.id, m0.value, m1.value
> FROM table1, table2 m0, table2 m1
> WHERE table1.id=m0.id
> AND m0.name='PageSequenceId'
> AND table1.id=m1.id
> AND m1.name='PostComponentId'
> AND m1.value='21857'
> ORDER BY m0.value;
> The bug can be reproduced by just executing the attached script with the 
> ij-tool.
> Note that the result of the query becomes correct when enough data is 
> changed. This prevented me from creating a smaller example.
> See the attached file "derby-reproduce.zip" for sysinfo, derby.log and 
> script.sql.
> Michael Segel pointed out:
> "It looks like its hitting the index ordering on id,name from table 2 and is 
> ignoring the order by clause."
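
For reference, the expected ordering in the report can be checked with a small 
script. This is only a sketch: it uses Python's built-in sqlite3 module as a 
stand-in engine, so it does not reproduce the Derby bug, and the three rows are 
reconstructed from the output shown in the report (the attached script loads 
more data than this).

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for Derby.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id BIGINT NOT NULL, PRIMARY KEY(id));
CREATE INDEX key1 ON table1(id);
CREATE TABLE table2 (id BIGINT NOT NULL, name VARCHAR(40) NOT NULL,
                     value VARCHAR(1024), PRIMARY KEY(id, name));
CREATE UNIQUE INDEX key2 ON table2(id, name);
CREATE INDEX key3 ON table2(value);
""")

# Assumption: rows inferred from the three-row result quoted in the report.
page_sequence = {2147483653: "000002", 2147483654: "000003", 4294967297: "000001"}
for row_id, seq in page_sequence.items():
    conn.execute("INSERT INTO table1 VALUES (?)", (row_id,))
    conn.execute("INSERT INTO table2 VALUES (?, 'PageSequenceId', ?)", (row_id, seq))
    conn.execute("INSERT INTO table2 VALUES (?, 'PostComponentId', '21857')", (row_id,))

# The query from the report, unchanged apart from whitespace.
rows = conn.execute("""
SELECT table1.id, m0.value, m1.value
FROM table1, table2 m0, table2 m1
WHERE table1.id = m0.id
  AND m0.name = 'PageSequenceId'
  AND table1.id = m1.id
  AND m1.name = 'PostComponentId'
  AND m1.value = '21857'
ORDER BY m0.value
""").fetchall()

for row in rows:
    print(row)
```

Any engine that honors the ORDER BY clause should return the rows sorted on the 
middle column, which is the "While I would expect" result quoted above.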

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
