> Your suggestion was to reconstruct the index from original
> table data on a rollback.  For a large transaction that touches
> most pages of the index, this can give (at best) a 2:1 speedup.
> But on the other side, the potential slowdown is extreme.

Yeah, there is that drawback. Other DBMSs avoid this problem by some
(decidedly NOT simple) mechanisms such as keeping before images in
main memory, keeping semantic records of changes in main memory,
keeping page serial numbers (to aid recovery), etc.

A decision can be made dynamically to stop recording before-images
(say, because some threshold percentage of the index's pages are being
changed, or because your main memory buffer is full), and transition
to my suggested approach.
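
A sketch of that dynamic decision in Python — the class, the threshold
check, and the in-memory before-image log are all my illustrative
assumptions, not a description of any particular DBMS:

```python
class UndoStrategy:
    """Record before-images until a threshold fraction of the index's
    pages has been touched, then switch to rebuild-on-rollback."""

    def __init__(self, total_pages, threshold_pct=25):
        self.limit = total_pages * threshold_pct // 100
        self.before_images = {}   # page_no -> original page bytes
        self.logging = True       # False once the threshold is crossed

    def note_modification(self, page_no, original_bytes):
        if not self.logging or page_no in self.before_images:
            return
        self.before_images[page_no] = original_bytes
        if len(self.before_images) >= self.limit:
            # Threshold crossed: stop recording; a rollback will now
            # reconstruct the index from the original table data.
            self.logging = False
            self.before_images.clear()
```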

Presently, if P% of the pages are modified, 2P% of the pages are
written for a commit and 3P% for a rollback. With the dynamic scheme,
these become
  commit:    2P  if P < X, otherwise P + X
  rollback:  3P  if P < X, otherwise P + X + 100
percent, respectively, where X is the threshold. (For a commit,
crossing the threshold always helps, since P + X < 2P whenever P > X;
for a rollback, the choice is fixed once the threshold is crossed, so
the cost can exceed 3P.)

If the threshold were 25% then you'd never write more than 125% of the
size of the index for a committed transaction, and never more than
225% for a rolled-back transaction. The present scheme, by contrast,
writes up to 200% for a committed and 300% for a rolled-back
transaction. Here is a table comparing the approaches for various
percentages of index pages modified. The left column is the percentage
of index pages modified; the remaining columns are the percentages of
index pages written under each approach (present c and r, dynamic c'
and r' at thresholds of 25% and 50%, for committed and rolled-back
transactions respectively).

  %      c  c'(25) c'(50)     r  r'(25) r'(50)
 10     20    20     20      30    30     30
 20     40    40     40      60    60     60
 30     60    55     60      90   155     90
 40     80    65     80     120   165    120
 50    100    75    100     150   175    200
 60    120    85    110     180   185    210
 70    140    95    120     210   195    220
 80    160   105    130     240   205    230
 90    180   115    140     270   215    240
100    200   125    150     300   225    250
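
For what it's worth, the rows above follow mechanically from the cost
formulas; here is a small Python sketch of the model (the function
names are mine):

```python
def present(P):
    """Percent of index pages written: (commit, rollback)."""
    return 2 * P, 3 * P

def dynamic(P, X):
    """Same, for the dynamic scheme with threshold X."""
    if P < X:                      # threshold never crossed
        return 2 * P, 3 * P
    # Crossed: X before-images plus P new pages were written; a
    # rollback additionally rebuilds the whole index (100%).
    return P + X, P + X + 100

for P in range(10, 101, 10):
    c, r = present(P)
    print(P, c, dynamic(P, 25)[0], dynamic(P, 50)[0],
          r, dynamic(P, 25)[1], dynamic(P, 50)[1])
```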

Note that the dynamic approach is always better than or equivalent to
the present one for committed transactions, and only worse for
rolled-back transactions when the percentage of pages modified falls
between X and (X + 100)/2, where X is the threshold. Finding an
optimal value for X is, unfortunately, application dependent.

Another approach...

One database I implemented used newly allocated pages for all updates
to indexes, relying on the original versions remaining intact so that
rollback was "free" (not counting reclaiming the freed space). Commit
was implemented by rewriting the pointer to the root of the index
BTree so that it pointed to the new pages. The advantage of this
method is that you get journaling for free. The drawbacks are that
even small changes require rewriting the BTree along the entire path
from leaf to root (so that a single disk write can commit the index
changes), and that a free-space recovery mechanism is needed. Overall
it was very reliable and effective.
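
That scheme reads like copy-on-write (shadow paging). Here is a
minimal in-memory sketch, with a hypothetical node layout, of how a
commit reduces to a single root-pointer write:

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty list for a leaf

def insert_cow(root, key):
    """Return a NEW root; nodes on the leaf-to-root path are copied,
    untouched subtrees are shared with the old tree."""
    if not root.children:                # leaf: copy it with the key
        return Node(sorted(root.keys + [key]))
    i = 0                                # pick a child (simplistic)
    while i < len(root.children) - 1 and key >= root.keys[i]:
        i += 1
    new_children = root.children[:]      # shallow copy: share siblings
    new_children[i] = insert_cow(root.children[i], key)
    return Node(root.keys, new_children)

# Committing is then one pointer write: root = new_root.  The old tree
# is untouched, so rolling back means simply keeping old_root.
old_root = Node([10], [Node([1, 5]), Node([10, 20])])
new_root = insert_cow(old_root, 7)
```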

e

