Tom Lane wrote:
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>> Tom Lane wrote:
>>> I'd still like to think about whether we
>>> can be smarter about when to invoke pruning, but that's a small enough
>>> issue that the patch can go in without it.
>
>> Yeah. I'm doing some micro-benchmarking, and the attached test case is
>> much slower with HOT. It's spending a lot of time trying to prune, only
>> to find out that it can't.
>
> Not sure if that's an appropriate description or not. oprofile
> (on a dual Xeon running Fedora 6) shows me this:
>
> ...
> samples   %        symbol name
> 1070003   29.8708  LWLockAcquire
> 1015097   28.3380  LWLockRelease
> 283514     7.9147  heap_page_prune
> ...
> so basically it's all about the locking. Maybe the problem is that with
> HOT we lock the buffer too often? heap_page_prune_opt is designed to
> not take the buffer lock unless there's a good probability of needing
> to prune, but maybe that's not working as intended.

If you look at the callgraph, you'll see that those
LWLockAcquire/Release calls are coming from HeapTupleSatisfiesVacuum ->
TransactionIdIsInProgress, which keeps thrashing the ProcArrayLock. An
"if (TransactionIdIsCurrentTransactionId(xid)) return true;" check in
TransactionIdIsInProgress would speed that up, but I wonder if there's a
more general solution to make HeapTupleSatisfiesVacuum cheaper. For
example, we could cache the in-progress status of tuples.

> Shouldn't we be able to prune rows that have been inserted and deleted
> by the same transaction? I'd have hoped to see this example use only
> one heap page ...

Well maybe, but that's a separate issue. Wouldn't we need the "snapshot
bookkeeping" we've discussed in the past, to notice that there's no
snapshot in our own backend that needs to see the tuples?

Nevertheless, the fruitless pruning attempts would still hurt us in
slightly more complex scenarios, even if we fixed the above.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
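
The fast-path idea above (answer "in progress" for our own xid before
touching the shared lock) can be sketched in isolation. This is a toy
model, not PostgreSQL source: every toy_* name, the fixed array of
running xids, and the counter standing in for ProcArrayLock
acquisitions are assumptions made purely for illustration. It only
shows why checking for our own xid first avoids the lock traffic when a
transaction keeps re-examining tuples it wrote itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;

/* Toy state: all hypothetical, not PostgreSQL data structures. */
static const ToyXid toy_my_xid = 100;                     /* our own transaction */
static const ToyXid toy_running_xids[] = {100, 101, 105}; /* stand-in proc array */
static unsigned long toy_lock_takes = 0;                  /* simulated shared-lock acquisitions */

/* Slow path: "take the lock" (only counted here) and scan the running xids. */
static bool toy_scan_proc_array(ToyXid xid)
{
    size_t n = sizeof toy_running_xids / sizeof toy_running_xids[0];

    toy_lock_takes++;
    for (size_t i = 0; i < n; i++)
        if (toy_running_xids[i] == xid)
            return true;
    return false;
}

/* In-progress check; the fast path answers for our own xid without the lock. */
static bool toy_xid_in_progress(ToyXid xid, bool use_fast_path)
{
    if (use_fast_path && xid == toy_my_xid)
        return true;
    return toy_scan_proc_array(xid);
}

/* Check our own xid n times and report how often the lock was taken. */
static unsigned long toy_count_lock_takes(bool use_fast_path, int n)
{
    toy_lock_takes = 0;
    for (int i = 0; i < n; i++) {
        bool in_progress = toy_xid_in_progress(toy_my_xid, use_fast_path);

        assert(in_progress);
        (void) in_progress; /* keeps the variable used when NDEBUG is defined */
    }
    return toy_lock_takes;
}
```

In this model, toy_count_lock_takes(false, 1000) takes the simulated
lock on every call, while toy_count_lock_takes(true, 1000) never takes
it. Xids belonging to other transactions still go through the slow
path, so the shortcut does nothing for the more general cases.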