[jira] Updated: (DERBY-642) SELECT MAX doesn't use indices optimally

Knut Anders Hatlen (JIRA) Fri, 25 Feb 2011 04:50:06 -0800

     [ 
https://issues.apache.org/jira/browse/DERBY-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Knut Anders Hatlen updated DERBY-642:
-------------------------------------

    Attachment: derby-642-3a-waitBeforeRetry.diff

Attaching derby-642-3a-waitBeforeRetry.diff which implements an optimization 
mentioned in an earlier comment, and also discussed in DERBY-884.

When the left sibling cannot be latched immediately, and we get a WaitError, we 
now try to get the latch on the left sibling again, possibly waiting for it, 
before we reposition in the B-tree. We save the position and release the latch 
on the current page before we do this, so waiting for the latch shouldn't be 
causing any deadlocks. We also release the latch we've just obtained on the 
left sibling before we do the repositioning, since we don't know if that's 
still the page we want to move to (anything may have happened while we didn't 
hold any latches).

This change avoids the case where a scan gets a WaitError, then immediately 
retries just to get a new WaitError. It may need to try many times before it's 
actually able to reposition and move to the left sibling. The changes in the 
patch makes the scan wait until the other thread has released the latch, and 
then it's much more likely that it'll be able to retry the operation without 
getting a WaitError.

Running BTreeMaxScanTest with the patch and tracing turned on, I see that the 
amount of log produced by the tracing is reduced by about 80%, which indicates 
that the scans spend less time on unsuccessful retries.

> SELECT MAX doesn't use indices optimally
> ----------------------------------------
>
>                 Key: DERBY-642
>                 URL: https://issues.apache.org/jira/browse/DERBY-642
>             Project: Derby
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 10.2.1.6
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: derby-642-1a.diff, derby-642-1b-withTests.diff, 
> derby-642-2a-test-serializable.diff, derby-642-3a-waitBeforeRetry.diff
>
>
> I tried running SELECT MAX on an indexed column in a big (8 GB)
> table. It took 12 minutes, which is about 12 minutes more than I
> expected.
> After a bit of investigation, I found out that a full index scan was
> performed because all the rows referenced from the rightmost B-tree
> node were actually deleted.
> Possible improvements:
>   1. Implement backwards scan in the B-trees (this is also suggested
>      in the comments in BTreeMaxScan).
>   2. Go up to the parent node and down to the next leaf node on the
>      left side, and continue until a valid max value is found. In
>      Derby, traversing up in a B-tree is not allowed, but would it be
>      ok to go up if the latches were kept on the higher-level nodes in
>      the tree? Of course, this would have negative impact on
>      concurrency.
>   3. Right-to-left traversal on the leaf level is possible (using
>      ControlRow.getLeftSibling()), but will throw an exception if the
>      page cannot be latched without waiting. It is therefore possible
>      to try a right-to-left search for the max value, and only fall
>      back to the left-to-right approach if a conflict arises.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Updated: (DERBY-642) SELECT MAX doesn't use indices optimally

Reply via email to