Zeugswetter Andreas SB <[EMAIL PROTECTED]> writes:
> It was my understanding, that the heap xtid is part of the key now,
It is not.
There was some discussion of doing that, but it fell down on the little
problem that in normal index-search cases you *don't* know the heap tid
you are looking for.
> And in above case, the keys (since identical except xtid) will stick close
> together, thus caching will be good.
Even without key-collision problems, deleting N tuples out of a total of
M index entries will require search costs like this:
bulk delete in linear scan way:
    O(M) I/O costs (read all the pages)
    O(M log N) CPU costs (lookup each TID in sorted list)

successive index probe way:
    O(N log M) I/O costs for probing index
    O(N log M) CPU costs for probing index (key comparisons)
For N << M, the latter looks like a win, but you have to keep in mind
that the constant factors hidden by the O() notation are a lot different
in the two cases. In particular, if there are T index entries per page,
the former I/O cost is really M/T * sequential read cost whereas the
latter is N log M * random read cost, yielding a difference in constant
factors of probably a thousand or two. You get some benefit in the
latter case from caching the upper btree levels, but that's by
definition not a large part of the index bulk. So where's the breakeven
point in reality? I don't know, but I suspect that it's at a pretty small
N. Certainly far less than one percent of the table, whereas I would
think that people would try to schedule VACUUMs at an interval where
they'd be reclaiming several percent of the table.
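
The I/O comparison above can be put into numbers with a minimal Python
sketch. Everything concrete in it is an assumption made for illustration,
not a measurement: the function names are hypothetical, T = 200 entries per
page, a random read costing 10 times a sequential read, M = 10 million index
entries, and a base-2 logarithm for the "log M" term. It also ignores the
caching of upper btree levels mentioned above, so it is biased against the
probe approach.

```python
import math


def linear_scan_io_cost(m, t, seq_read_cost):
    """I/O cost of reading every index page once: (M / T) sequential reads."""
    return math.ceil(m / t) * seq_read_cost


def index_probe_io_cost(n, m, random_read_cost, log_base=2):
    """I/O cost of N probes, each charged log(M) random reads.

    The logarithm base is an assumption; the O(N log M) estimate does not
    fix it.
    """
    return n * math.log(m, log_base) * random_read_cost


def breakeven_n(m, t, seq_read_cost, random_read_cost, log_base=2):
    """N at which both approaches have the same modeled I/O cost."""
    scan = linear_scan_io_cost(m, t, seq_read_cost)
    per_probe = math.log(m, log_base) * random_read_cost
    return scan / per_probe


if __name__ == "__main__":
    # All of these values are illustrative assumptions.
    M = 10_000_000          # total index entries
    T = 200                 # index entries per page
    SEQ, RANDOM = 1.0, 10.0  # relative cost of one page read

    n_star = breakeven_n(M, T, SEQ, RANDOM)
    print(f"constant-factor ratio T * random/seq: {T * RANDOM / SEQ:.0f}")
    print(f"breakeven N: {n_star:.0f} tuples")
    print(f"breakeven as a fraction of M: {n_star / M:.6%}")
```

Under these assumptions the ratio of constant factors, T * (random cost /
sequential cost), lands in the "thousand or two" range, and the breakeven N
comes out far below one percent of M. Different values of T, the cost ratio,
or the logarithm base will move the breakeven point, so the sketch is best
used to try out one's own numbers.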
So, as I said to Hiroshi, this alternative looks to me like a possible
future refinement, not something we need to do in the first version.
regards, tom lane