Simon Riggs <[EMAIL PROTECTED]> writes:
> My first proposal is to add an extra parameter onto the
> index_bulk_delete() call - ntuples. If ntuples == 0 then btbulkdelete()
> will avoid scanning and return immediately. If a scan occurs, then we
> keep track of how many tuples have been marked deleted and stop the scan
> when we have reached this number.
This seems reasonable.  I think you overstate the value of an early
stop, but it can't hurt given the minimal cost of making the check.

> 1) avoid the scan when there are no dead tuples, and ignore the
> possibility that a VACUUM might be doing a follow-up scan to remove
> previously deleted tuples.

I think this is a pretty bad idea.  Your analysis assumes that the two
scans are equally expensive, which is about as far from true as is
possible.  In the first place, the first scan is in index order while
the second is in physical order (hence can benefit from readahead).
In the second place, the first scan iterates over each item on each
page, where the second does not.  So what you're proposing is to
install an unavoidable disk space bloat mechanism in order to save
only a small part of the cost of VACUUM.

(What I'd really like to see is a way to do the first scan in physical
order, but we'd need a different locking approach AFAICS --- see the
notes in nbtree/README.)

Keeping track of whether there's any work pending for the second pass
doesn't seem too unreasonable to me.  It was left undone in the
original coding on the KISS principle, but it could certainly be done.
I'd still like to see some demonstration first that this part of index
vacuuming is a significant cost compared to the heap and first-pass
index costs.

			regards, tom lane