> Your suggestion was to reconstruct the index from original
> table data on a rollback. For a large transaction that touches
> most pages of the index, this can give (at best) a 2:1 speedup.
> But the other side, the potential slowdown is extreme.
Yeah, there is that drawback. Other DBMSs avoid this problem by some (decidedly NOT simple) mechanisms such as keeping before-images in main memory, keeping semantic records of changes in main memory, keeping page serial numbers (to aid recovery), etc.

A decision can be made dynamically to stop recording before-images (say, because some threshold percentage of the index's pages has been changed, or because your main memory buffer is full) and transition to my suggested approach. Presently, if P% of the pages are modified, 2P% of the pages are written for a commit and 3P% for a rollback. With the dynamic scheme and a threshold of X%, a commit writes the smaller of 2P or P + X percent, and a rollback writes 3P percent below the threshold, or P + X + 100 percent once the threshold is reached (the extra 100 being the rebuild of the index from the table data). If the threshold were 25%, you'd never write more than 125% of the size of the index for a committed transaction, and never more than 225% for a rolled-back transaction. Note that the present scheme writes up to 200% for a committed and 300% for a rolled-back transaction.

Here is a table comparing the approaches for various percentages of index pages modified. The left column is the percentage of index pages modified; the remaining columns are the percentage of index pages written under the present scheme (c = committed, r = rolled back) and under the dynamic scheme with thresholds of 25% and 50% (c' and r'):

  %      c c'(25) c'(50)      r r'(25) r'(50)
 10     20     20     20     30     30     30
 20     40     40     40     60     60     60
 30     60     55     60     90    155     90
 40     80     65     80    120    165    120
 50    100     75    100    150    175    200
 60    120     85    110    180    185    210
 70    140     95    120    210    195    220
 80    160    105    130    240    205    230
 90    180    115    140    270    215    240
100    200    125    150    300    225    250

Note that the dynamic approach is always better than or equivalent to the present one for committed transactions, and only worse for rolled-back transactions when the percentage of pages modified is between X and (X + 100)/2, where X is the threshold. Finding an optimal value for X is application dependent, unfortunately.

Another approach... One database I implemented used newly allocated pages for all updates to indexes, relying on the original versions to remain intact, so rollback was "free" (not counting the reclaiming of the free space). Commit was implemented by rewriting the root pointer of the index BTree to point to the new pages. The advantage of this method is journaling for free. The drawbacks are that even small changes require writing the BTree along an entire path from leaf to root (so that a single disk write can commit the index changes), and that a free-space recovery mechanism is needed. Overall it was very reliable and effective.

e
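
P.S. In case it is useful, here is a minimal Python sketch of the cost model behind the table above. The function names and output format are purely illustrative (not from any actual code); running it reproduces the figures in the table:

# Back-of-envelope cost model for the two schemes discussed above.
# P = percentage of index pages modified; X = threshold percentage at
# which the dynamic scheme stops recording before-images. All results
# are pages written, as a percentage of the index size.

def present_commit(p):
    return 2 * p      # before-images journalled + modified pages written

def present_rollback(p):
    return 3 * p      # journal + modified pages + pages restored from it

def dynamic_commit(p, x):
    # Below the threshold the dynamic scheme behaves like the present one;
    # at or above it, only X% of before-images get journalled.
    return 2 * p if p < x else p + x

def dynamic_rollback(p, x):
    # Once the threshold is reached, rollback falls back to rebuilding the
    # whole index from the table data (an extra 100%).
    return 3 * p if p < x else p + x + 100

print("  %      c c'(25) c'(50)      r r'(25) r'(50)")
for p in range(10, 101, 10):
    print(f"{p:3}{present_commit(p):7}{dynamic_commit(p, 25):7}"
          f"{dynamic_commit(p, 50):7}{present_rollback(p):7}"
          f"{dynamic_rollback(p, 25):7}{dynamic_rollback(p, 50):7}")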
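
P.P.S. And a very rough Python sketch of the copy-on-write idea in the last paragraph, purely illustrative (no node splitting, page allocation, or free-space recovery; the class and function names are mine): an update copies the nodes along the path from leaf to root, commit is a single write of the root pointer, and rollback is just discarding the new copies:

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys          # sorted keys in this node
        self.children = children  # None for a leaf

def insert_cow(node, key):
    """Return a new root; the original tree is left untouched."""
    if node.children is None:
        # Leaf: make a copy with the new key added (no splitting here).
        return Node(sorted(node.keys + [key]))
    # Interior node: find the child to descend into, build a copy of it,
    # then copy this node so that it points at the new child.
    i = sum(1 for k in node.keys if key >= k)
    children = list(node.children)
    children[i] = insert_cow(node.children[i], key)
    return Node(list(node.keys), children)

class Index:
    """The root pointer is the only thing updated in place."""
    def __init__(self, root):
        self.root = root

    def commit(self, new_root):
        self.root = new_root      # the single write that commits the change

    # Rollback is free: never assign new_root and keep the old self.root.

# Example: build a new version of the tree, then switch to it atomically.
idx = Index(Node([10, 20], [Node([5]), Node([15]), Node([25])]))
new_root = insert_cow(idx.root, 17)   # old pages untouched
idx.commit(new_root)                  # one pointer write installs the new version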