[
https://issues.apache.org/jira/browse/DERBY-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12474895
]
A B commented on DERBY-47:
--------------------------
In one of my previous comments I mentioned that d47_engine_doNOTCommit_v1.patch
has a problem where the addition of an InListResultSet to the FromList causes
all of the SUR (Scrollable Updatable Result set) tests that use an IN-clause to
fail. More specifically, the presence of the InListResultSet causes Derby to
downgrade the cursor scrollability to "CONCUR_READ_ONLY".
I was eventually able to track down the cause of this behavior: whether or not
Derby downgrades a result set to CONCUR_READ_ONLY depends on the value returned
by the execution time result set in the "isForUpdate()" method. The default
(as defined in NoPutResultSetImpl) is to return false.
In the case of the SUR tests (prior to my changes) we were getting a
ScrollInsensitiveResultSet on top of a TableScanResultSet. The former gets its
"isForUpdate()" value from the latter, and the latter correctly returns "true"
to indicate that the result set is for update and thus no downgrade is needed.
With d47_engine_doNotCommit_v1.patch applied, though, we add an InListResultSet
to the FromList, which ultimately gives us a JoinResultSet at execution time.
The JoinResultSet class does not define an "isForUpdate()" method, so we just
return the default--which is "false". That causes the cursor concurrency to be
downgraded to CONCUR_READ_ONLY.
To see what would happen I forced JoinResultSet to return "true" and then there
was an ASSERT failure because JoinResultSets are not expected to be used for
update/delete. The relevant code is in execute/JoinResultSet.java; in the
"getRowLocation()" method we see the following comment:
* A join is combining rows from two sources, so it has no
* single row location to return; just return a null.
My guess is that, since the decision to disallow update/delete on a
JoinResultSet was intentional, trying to code around that restriction is a bad
idea. Or at the very least, it would require a lot more investigation and/or
work.
Instead of pursuing that potentially treacherous path, I was able to come up
with some logic that checks to see if a result set is updatable at compile time
and, if so, to skip the IN-list rewrite. Early testing suggests that this is
viable solution.
HOWEVER, as the list of "exceptions" to the IN-list (_v1.patch) rewrite grew, I
started to wonder if there wasn't some other solution to DERBY-47 that would
accomplish the same thing, minus all of the exception cases.
A few days later I came across some queries involving multiple IN-list
predicates for a single SELECT statement. It turns out that many of those
queries return incorrect results (duplicate rows) and/or run much more slowly
with the _v1 patch than without.
The fact that there are so many "exceptions to the rule" combined with the
undesirable behavior in the face of multiple IN-lists prompted me to abandon my
initial idea of internally adding an InListResultSetNode to the user's FROM
list (d47_engine_doNOTCommit_v1.patch).
Instead I have been working on an alternate approach to the problem. This
second approach, like the first, is based on a comment made by someone else on
this issue. This time, though, the comment in question is from James Synge and
is as follows:
> The goal of the re-write is to avoid the source of DERBY-47,
> which is that we don't have a multi-probe index scan, only
> range index scans, and worse still, when there are parameters
> involved, the range is the entire index.
In short, I decided to work with the execution-time result sets to see if it is
possible to enforce some kind of "multi-probing" for indexes. That is to say,
instead of scanning a range of values on the index, we want to make it so that
Derby probes the index N times, where N is the size of the IN-list. For each
probe, then, we'll get all rows for which the target column's value equals the
N-th value in the IN-list.
Effectively, this comes down to the "Marker" strategy in
Derby47Performance.java, where we evaluate an equality predicate, "column = ?",
N times. Except of course that, unlike with the Marker strategy, we do the N
evaluations internally instead of making the user do it.
A high-level description of how this approach will work is as follows:
- If any value in the IN-list is not a constant AND is not a parameter,
then do processing as usual (no optimizations, no rewrite). Notice
that with this approach we *do* still perform the optimization if
IN-list values are parameters. That is not the case for the current
Derby rewrite (i.e. without any changes for DERBY-47).
Otherwise:
- During preprocessing, replace the IN-list with an equality predicate of
the form "column = ?". I.e. the right operand will be a parameter node,
the left operand will be whatever column reference belongs to the IN-list
(hereafter referred to as the "target column"). We call this new, internal
predicate an IN-list "probe predicate" because it will be the basis of
the multi-probing logic at execution time.
- During costing, the equality predicate "column = ?" will be treated as
a start/stop key by normal optimizer processing (no changes necessary).
If the predicate is deemed a start/stop key for the *first* column in
an index, then we'll multiply the estimated cost of "column = ?" by
the size of the IN-list, to account for the fact that we're actually
evaluating the predicate N times (not just once).
If the predicate is not a start/stop key for the first column in
the index, then we do not adjust the cost. See below for why.
- After we have found the best plan (including a best conglomerate choice),
check to see if the IN-list probe predicate is a start/stop key on the
first column in the chosen conglomerate. In order for that to be the
case the conglomerate must be an index (because we don't have start/stop
keys on table conglomerates). If the probe predicate is not a
start/stop key for the *first* column in the index, then we "revert" the
probe
predicate back to its original IN-list form and evaluate it as a "normal"
InListOperatorNode. In this way we are effectively "giving up" on the
multi-
probing approach. This is why we don't change the cost for such probe
predicates (mentioned above).
If the probe predicate *is* a start/stop key on the first column in
the index conglomerate, then leave it as it is.
- When it comes time to generate the byte code, look to see if we have
a probe predicate that is a start/stop key on the first column in the
chosen conglomerate. If so, generate an array of DataValueDescriptors
to hold the values from the corresponding IN-list and pass that array
to the underlying execution-time result set (i.e. to TableScanResultSet).
Then generate the probe predicate as a "normal" start/stop key for the
scan. This will serve as a "place-holder" for the IN-list values
at execution time.
- Finally, at execution time, instead of using a single start key and a
single stop key to define a scan range, we iterate through the IN-list
values and for each non-duplicate value V[i] (0 <= i < N) we treat V[i]
as both a start key *and* a stop key. Or put another way, we plug
V[i] into the "column = ?" predicate and retrieve all matching rows.
In this way we are "probing" the index for all rows having value V[i]
in the target column. Once all rows matching V[i] have been returned,
we then grab the next IN-list value, V[i+1], and we reposition the
scan (by calling "reopenCore()"), this time using V[i+1] as the start
and stop key (i.e. plugging V[i+1] into "column = ?"). This will
return all rows having value V[1+1] in the target column.
Continue this process until all values in the IN-list have been
exhausted. When we reach that point, we are done.
As a simple example, assume our query is of the form:
select ... from admin.changes where id in (1, 20000)
During preprocessing we will effectively change this to be:
select ... from admin.changes where id = ?
where "id = ?" is our IN-list "probe predicate". Note that we must make sure
the optimizer recognizes "id = ?" as a disguised IN-list operator, as opposed
to a true relational operator. The reason is that, while we do treat the probe
predicate as a "fake" relational operator so that it can be seen as a
start/stop key for the relevant indexes, there are many operations (ex.
transitive closure) that should *not* be done on a probe predicate. So we have
to be able to distinguish a probe predicate from other relational predicates.
With the probe predicate in place the optimizer will determine that it is a
start/stop key for any index having column "ID" as the first column, and thus
the optimizer is more likely to choose one of those indexes over a table scan.
If we assume the optimizer decides to use an index on ID for the query, then
we'll generate the IN-list values (1, 20000) and we will pass them to the
underlying index scan. We'll then generate the probe predicate "column = ?" as
a start and stop key for the scan.
At execution, then, the scan will first open up the index by using "1" as the
start and stop key for the scan (or put another way, by plugging "1" into the
probe predicate, "column = ?"). That will return all rows having ID equal to
"1". Then when that scan ends, we'll reopen the scan a second time, this time
using "20000" as the start and stop key, therefore returning all the 20000's.
Meanwhile, all result sets higher up in the result set tree will just see a
stream of rows from the index where ID only equals 1 or 20000.
This works for IN-lists with parameters, as well, because by the time execution
starts we know what values the parameters hold.
The first draft of code for implementing all of this is pretty much done, but I
plan to post it in increments for ease of review. If all goes well, nothing
should change functionally until the final patch, which will be the
preprocessing patch that does the actual creation of the "probe predicate". At
that point all of the machinery should be in place to recognize the probe
predicate and function accordingly.
I ran Derby47PerformanceTest with this "multi-probing" approach and the numbers
were similar to what I saw with the _v1 approach (more details coming later).
Unlike the _v1 approach, though, the multi-probing approach works with
subqueries, left outer joins, and scrollable updatable result sets. In
addition, all of the test queries that I came up with involving multiple
IN-lists run correctly with the multi-probing approach, and in many cases run
quite a bit more quickly--neither of which was true for the _v1 approach. And
yes, I do hope to add those test cases to the nightlies as part of my work on
this issue.
Incremental patches are forthcoming. In the meantime, if anyone has any
comments/suggestions on the approach outlined above, I would appreciated the
feedback...
> Some possible improvements to IN optimization
> ---------------------------------------------
>
> Key: DERBY-47
> URL: https://issues.apache.org/jira/browse/DERBY-47
> Project: Derby
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 10.0.2.0
> Environment: all
> Reporter: Sunitha Kambhampati
> Assigned To: A B
> Attachments: d47_engine_doNotCommit_v1.patch,
> d47_engine_doNotCommit_v1.stat, derby-47-performance-data.txt,
> derby-47-performance-data.txt, Derby47PerformanceTest.java,
> Derby47PerformanceTest.java, InListOperatorNode.java,
> QueryPlanUniqueIndexAndWordIndexOneTerm.txt,
> QueryPlanUniqueIndexAndWordIndexTwoTerms.txt,
> QueryPlanUniqueIndexOnlyOneTerm.txt, QueryPlanUniqueIndexOnlyTwoTerms.txt,
> readlocks.diff, readlocks_withContext.diff
>
>
> Consider a simple case of -
> A table tbl has 10000 rows, there is a primary key index on i1
> and the query in question is
> select * from tbl where i1 in (-1,100000)
> derby does a table scan of the entire table even though the "IN" list has
> only two values and the comparison is on a field that has an index.
> Briefly looking at the code, it seems like we insert a between and use the IN
> list to get the start and stop values for the scan. Thus the range of the
> values in the "IN" list here plays an important role.
> Thus if the query was changed to select * from tbl where i1 in (-1, 1), an
> index scan would be chosen.
> It would be nice if we could do something clever in this case where there is
> clearly an index on the field and the number of values in the IN list is
> known. Maybe use the rowcount estimate and the IN list size to do some
> optimizations.
> - consider the length of the "IN" list to do searches on the table. ie use
> the IN list values to do index key searches on the table,
> -or try to convert it to a join. Use the "IN" list values to create a
> temporary table and do a join. It is most likely that the optimizer will
> choose the table with "IN" list here as the outer table in the join and thus
> will do key searches on the larger table.
> -------------------------------------------------------------------
> some query plans that I logged using derby.language.logQueryPlan=true for
> some similar queries:
> Table has ascending values from 0 - 9999 for i1. primary key index on i1.
> GMT Thread[UT0,5,main] (XID = 19941), (SESSIONID = 0), select * from
> scanfixed where i1 in (-1,9999,9998,9997,9996,9995,9994,9993,9992,9991,9990)
> ******* Project-Restrict ResultSet (2):
> Number of opens = 1
> Rows seen = 10000
> Rows filtered = 9990
> restriction = true
> projection = false
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> restriction time (milliseconds) = 0
> projection time (milliseconds) = 0
> optimizer estimated row count: 750.38
> optimizer estimated cost: 8579.46
> Source result set:
> Table Scan ResultSet for SCANFIXED at read committed isolation level
> using instantaneous share row locking chosen by the optimizer
> Number of opens = 1
> Rows seen = 10000
> Rows filtered = 0
> Fetch Size = 16
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> next time in milliseconds/row = 0
> scan information:
> Bit set of columns fetched=All
> Number of columns fetched=9
> Number of pages visited=417
> Number of rows qualified=10000
> Number of rows visited=10000
> Scan type=heap
> start position:
> null stop position:
> null qualifiers:
> Column[0][0] Id: 0
> Operator: <=
> Ordered nulls: false
> Unknown return value: false
> Negate comparison result: false
> Column[0][1] Id: 0
> Operator: <
> Ordered nulls: false
> Unknown return value: true
> Negate comparison result: true
> optimizer estimated row count: 750.38
> optimizer estimated cost: 8579.46
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> l
> 2004-10-14 18:59:47.577 GMT Thread[UT0,5,main] (XID = 19216), (SESSIONID =
> 0), select * from scanfixed where i1 in
> (9999,9998,9997,9996,9995,9994,9993,9992,9991,9990) ******* Project-Restrict
> ResultSet (3):
> Number of opens = 1
> Rows seen = 10
> Rows filtered = 0
> restriction = true
> projection = true
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> restriction time (milliseconds) = 0
> projection time (milliseconds) = 0
> optimizer estimated row count: 4.80
> optimizer estimated cost: 39.53
> Source result set:
> Index Row to Base Row ResultSet for SCANFIXED:
> Number of opens = 1
> Rows seen = 10
> Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6, 7, 8}
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> optimizer estimated row count: 4.80
> optimizer estimated cost: 39.53
> Index Scan ResultSet for SCANFIXED using index SCANFIXEDX at
> read committed isolation level using instantaneous share row locking chosen
> by the optimizer
> Number of opens = 1
> Rows seen = 10
> Rows filtered = 0
> Fetch Size = 16
> constructor time (milliseconds) = 0
> open time (milliseconds) = 0
> next time (milliseconds) = 0
> close time (milliseconds) = 0
> next time in milliseconds/row = 0
> scan information:
> Bit set of columns fetched=All
> Number of columns fetched=2
> Number of deleted rows visited=0
> Number of pages visited=2
> Number of rows qualified=10
> Number of rows visited=10
> Scan type=btree
> Tree height=2
> start position:
> >= on first 1 column(s).
> Ordered null semantics on the following columns:
> stop position:
> > on first 1 column(s).
> Ordered null semantics on the following columns:
> qualifiers:
> None
> optimizer estimated row count: 4.80
> optimizer estimated cost: 39.53
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.