On 9/11/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>
> You could mark such tuples with LP_DELETE. That would also let other
> transactions quickly tot up how much space would be available if they
> were to run PageRepairFragmentation.
>

IMHO we are going in circles here. We have already tried LP_DELETE
techniques and moved away from them to simplify things. We also tried
reusing dead space without running PageRepairFragmentation. Each of these
techniques worked just fine, with slightly different performance
characteristics. What we have now is a simplified algorithm that is much
easier to follow and safer, yet still gives us a very good performance
boost. I am not sure this is the right time to throw in new ideas, because
we can never be sure that what we are doing is the optimal solution.

Would it help if we go with some solution right now, get the rest of the
review process done, and then use the feedback during beta testing to tune
things? We may have far more data points at that time to choose one
technique over another, and we would also know which areas to focus on. I
am also worried that by focusing too much on this issue we may overlook
some other correctness issue in the patch.

From what we have discussed so far, IMHO we should do the following things
and let the rest of the review process proceed:

- Defragment a page only when the free space left in the page is not
  enough to accommodate even a single tuple (use the average tuple length
  for this decision). This means we might defragment even though there is
  no immediate UPDATE to the page, but we can treat this like fillfactor:
  it lets us provision for the next UPDATE coming to the page. Since we
  defragment when the page is almost full, hopefully we would reclaim a
  good amount of space and won't need to defragment for the next few
  UPDATEs. We already have a mechanism to track the average tuple request
  size in relcache. Maybe we can use relcache invalidation to keep the
  information in sync (send an invalidation when the average request size
  changes by, say, 5%).

- Avoid pruning chains in every index or seq lookup. But if the chain
  becomes longer than X tuples, mark the page to be pruned in the next
  lookup.
  We can choose to separate pruning and defragmentation and only do
  pruning in this case, but I would prefer to keep them together for now.

- Track the minimum xmin in the page header to avoid repeated (wasted)
  attempts to prune a Prunable page in the presence of long-running
  transactions.

We can save the rest of the techniques for the beta testing period or 8.4.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com
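
For reference, the three heuristics proposed above can be sketched roughly
as follows. This is only an illustration under stated assumptions, not code
from the patch: SketchPage, SketchXid, SKETCH_MAX_CHAIN_LEN and all the
sketch_* functions are hypothetical names, the chain-length limit of 4 is
an arbitrary placeholder for "X", and the xid comparison ignores
wraparound for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT PostgreSQL's real page header or relcache. */
typedef uint32_t SketchXid;
#define SKETCH_INVALID_XID   0u
#define SKETCH_MAX_CHAIN_LEN 4u   /* "X" in the mail; placeholder value */

typedef struct SketchPage
{
    size_t    free_space;  /* usable free bytes left in the page */
    bool      prunable;    /* set once a chain has grown too long */
    SketchXid min_xmin;    /* minimum xmin tracked in the page header */
} SketchPage;

/* Point 1: defragment only when not even one average-sized tuple fits. */
static bool
sketch_should_defragment(const SketchPage *page, size_t avg_tuple_len)
{
    assert(page != NULL);
    return page->free_space < avg_tuple_len;
}

/*
 * Point 1, relcache part: report whether the average request size has
 * moved by at least 5% since the last value that was broadcast.
 */
static bool
sketch_avg_changed_enough(size_t old_avg, size_t new_avg)
{
    size_t diff = (new_avg > old_avg) ? new_avg - old_avg : old_avg - new_avg;

    if (old_avg == 0)
        return new_avg != 0;
    return diff * 100 >= old_avg * 5;
}

/* Point 2: lookups do not prune; they only flag pages with long chains. */
static void
sketch_note_chain_length(SketchPage *page, size_t chain_len)
{
    assert(page != NULL);
    if (chain_len > SKETCH_MAX_CHAIN_LEN)
        page->prunable = true;
}

/*
 * Point 3: skip the prune attempt while the tracked xid is still visible
 * to some running transaction (simplified: plain integer comparison,
 * no wraparound handling).
 */
static bool
sketch_prune_worthwhile(const SketchPage *page, SketchXid oldest_running_xid)
{
    assert(page != NULL);
    return page->prunable
        && page->min_xmin != SKETCH_INVALID_XID
        && page->min_xmin < oldest_running_xid;
}
```

The point of the last check is that a page flagged as prunable costs only
one comparison per visit while a long-running transaction is holding the
horizon back, instead of a full (and fruitless) prune pass.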