[
https://issues.apache.org/jira/browse/DERBY-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mamta A. Satoor updated DERBY-3926:
-----------------------------------
Attachment: DERBY3926_notforcheckin_patch1_051109_stat.txt
DERBY3926_notforcheckin_patch1_051109_diff.txt
I have attached a patch (not intended for checkin),
DERBY3926_notforcheckin_patch1_051109_diff.txt, based on the pseudocode that I
posted last week. It fixes the problem query in question, but when I run the
wisconsin test, I see that we are now adding sort nodes on top of a few queries.
I am trying to understand, case by case, whether the additional sort node makes
sense. The first case I am looking at seems like it should not get a sort node,
yet the patch chooses to add one. The query for that case is as
follows:
select * from TENKTUP1, TENKTUP2
where TENKTUP1.unique1 = TENKTUP2.unique1
order by TENKTUP1.unique1;
The query plan for the above query shows TENKTUP1 to be the outermost table, so I
am not sure why we need the sort node on top. I will look further into it.
If anyone has time to look at the patch, I would greatly appreciate it. The new
code has only minimal comments. I will work on adding some comments and repost
the patch to make it easier to read, but the code should correspond fairly
directly to the pseudocode posted last week.
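For anyone who wants to regenerate the plans that follow, here is a minimal ij sketch for capturing runtime statistics. It assumes the wisconsin tables TENKTUP1 and TENKTUP2 already exist in the connected database. The timing call is optional and is included only for illustration; it is not necessarily what the test itself does.

```sql
-- Turn on runtime statistics collection for this connection.
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);
-- Optional (assumption: not necessarily used by the test): collect timings too.
CALL SYSCS_UTIL.SYSCS_SET_STATISTICS_TIMING(1);

-- The wisconsin query under discussion.
select * from TENKTUP1, TENKTUP2
where TENKTUP1.unique1 = TENKTUP2.unique1
order by TENKTUP1.unique1;

-- Print the plan of the statement that just completed.
VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();

-- Turn collection back off when done.
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(0);
```

Running this once without the patch and once with it should make it easy to compare whether a Sort ResultSet appears at the top of the plan.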
The old query plan for the above query from the wisconsin test is as follows:
ij> values SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Statement Name:
C
Statement Text:
select * from TENKTUP1, TENKTUP2
where TENKTUP1.unique1 = TENKTUP2.unique1
order by TENKTUP1.unique1
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text:
Nested Loop Exists Join ResultSet:
<filtered number of opens>
<filtered rows seen from the left>
<filtered rows seen from the right>
Rows filtered = 0
<filtered rows returned>
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
Left result set:
Index Row to Base Row ResultSet for TENKTUP1:
<filtered number of opens>
<filtered rows seen>
Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
Index Scan ResultSet for TENKTUP1 using index TK1UNIQUE1 at serializable isolation level using share table locking chosen by the optimizer
<filtered number of opens>
<filtered rows seen>
Rows filtered = 0
Fetch Size = 1
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
scan information:
Bit set of columns fetched={1}
Number of columns fetched=1
Number of deleted rows visited=0
<filtered number of pages visited>
<filtered number of rows qualified>
<filtered number of rows visited>
Scan type=btree
Tree height=2
start position: None
stop position: None
qualifiers:None
Right result set:
Index Row to Base Row ResultSet for TENKTUP2:
<filtered number of opens>
<filtered rows seen>
Columns accessed from heap = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
Index Scan ResultSet for TENKTUP2 using index TK2UNIQUE1 at serializable isolation level using share row locking chosen by the optimizer
<filtered number of opens>
<filtered rows seen>
Rows filtered = 0
Fetch Size = 1
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
scan information:
Bit set of columns fetched=All
Number of columns fetched=2
Number of deleted rows visited=0
<filtered number of pages visited>
<filtered number of rows qualified>
<filtered number of rows visited>
Scan type=btree
Tree height=2
start position:
>= on first 1 column(s).
Ordered null semantics on the following columns: 0
stop position:
> on first 1 column(s).
Ordered null semantics on the following columns: 0
qualifiers:None
The new query plan after my changes is as follows:
ij> values SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Statement Name:
null
Statement Text:
select * from TENKTUP1, TENKTUP2
where TENKTUP1.unique1 = TENKTUP2.unique1
order by TENKTUP1.unique1
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text:
Sort ResultSet:
Number of opens = 1
Rows input = 10000
Rows returned = 10000
Eliminate duplicates = false
In sorted order = false
Sort information:
Number of merge runs=3
Number of rows input=10000
Number of rows output=10000
Size of merge runs=[3215, 3215, 3215]
Sort type=external
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
optimizer estimated row count: 10005.00
optimizer estimated cost: 73930.40
Source result set:
Nested Loop Exists Join ResultSet:
Number of opens = 1
Rows seen from the left = 10000
Rows seen from the right = 10000
Rows filtered = 0
Rows returned = 10000
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
optimizer estimated row count: 10005.00
optimizer estimated cost: 73930.40
Left result set:
Table Scan ResultSet for TENKTUP1 at read committed isolation level using instantaneous share row locking chosen by the optimizer
Number of opens = 1
Rows seen = 10000
Rows filtered = 0
Fetch Size = 16
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
next time in milliseconds/row = 0
scan information:
Bit set of columns fetched=All
Number of columns fetched=16
Number of pages visited=771
Number of rows qualified=10000
Number of rows visited=10000
Scan type=heap
start position:null
stop position:null
qualifiers:None
optimizer estimated row count: 10005.00
optimizer estimated cost: 14870.88
Right result set:
Index Row to Base Row ResultSet for TENKTUP2:
Number of opens = 10000
Rows seen = 10000
Columns accessed from heap = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
optimizer estimated row count: 10005.00
optimizer estimated cost: 59059.52
Index Scan ResultSet for TENKTUP2 using index TK2UNIQUE1 at read committed isolation level using share row locking chosen by the optimizer
Number of opens = 10000
Rows seen = 10000
Rows filtered = 0
Fetch Size = 1
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
next time in milliseconds/row = 0
scan information:
Bit set of columns fetched=All
Number of columns fetched=2
Number of deleted rows visited=0
Number of pages visited=20000
Number of rows qualified=10000
Number of rows visited=10000
Scan type=btree
Tree height=2
start position:
>= on first 1 column(s).
Ordered null semantics on the following columns:0
stop position:
> on first 1 column(s).
Ordered null semantics on the following columns:0
qualifiers:None
optimizer estimated row count: 10005.00
optimizer estimated cost: 59059.52
> Incorrect ORDER BY caused by index
> ----------------------------------
>
> Key: DERBY-3926
> URL: https://issues.apache.org/jira/browse/DERBY-3926
> Project: Derby
> Issue Type: Bug
> Components: SQL
> Affects Versions: 10.1.3.3, 10.2.3.0, 10.3.3.1, 10.4.2.0
> Reporter: Tars Joris
> Assignee: Mamta A. Satoor
> Attachments: d3926_repro.sql, derby-reproduce.zip,
> DERBY3926_notforcheckin_patch1_051109_diff.txt,
> DERBY3926_notforcheckin_patch1_051109_stat.txt, script3.sql,
> script3WithUserFriendlyIndexNames.sql, test-script.zip
>
>
> I think I found a bug in Derby that is triggered by an index on a large
> column: VARCHAR(1024). I know it is generally not a good idea to have an
> index on such a large column.
> I have a table (table2) with a column "value", my query orders on this column
> but the result is not sorted. It is sorted if I remove the index on that
> column.
> The output of the attached script is as follows (results should be ordered on
> the middle column):
> ID |VALUE |VALUE
> ----------------------------------------------
> 2147483653 |000002 |21857
> 2147483654 |000003 |21857
> 4294967297 |000001 |21857
> While I would expect:
> ID |VALUE |VALUE
> ----------------------------------------------
> 4294967297 |000001 |21857
> 2147483653 |000002 |21857
> 2147483654 |000003 |21857
> This is the definition:
> CREATE TABLE table1 (id BIGINT NOT NULL, PRIMARY KEY(id));
> CREATE INDEX key1 ON table1(id);
> CREATE TABLE table2 (id BIGINT NOT NULL, name VARCHAR(40) NOT NULL, value
> VARCHAR(1024), PRIMARY KEY(id, name));
> CREATE UNIQUE INDEX key2 ON table2(id, name);
> CREATE INDEX key3 ON table2(value);
> This is the query:
> SELECT table1.id, m0.value, m1.value
> FROM table1, table2 m0, table2 m1
> WHERE table1.id=m0.id
> AND m0.name='PageSequenceId'
> AND table1.id=m1.id
> AND m1.name='PostComponentId'
> AND m1.value='21857'
> ORDER BY m0.value;
> The bug can be reproduced by just executing the attached script with the
> ij-tool.
> Note that the result of the query becomes correct when enough data is
> changed. This prevented me from creating a smaller example.
> See the attached file "derby-reproduce.zip" for sysinfo, derby.log and
> script.sql.
> Michael Segel pointed out:
> "It looks like its hitting the index ordering on id,name from table 2 and is
> ignoring the order by clause."