Alvaro Herrera <alvhe...@2ndquadrant.com> writes: > Tom Lane wrote: >> However, the patch misses an >> important part of such an abstraction layer by not also converting >> catalog-related simple_heap_delete() calls into some sort of >> CatalogTupleDelete() operation. It is certainly a peculiarity of >> PG heaps that deletions don't require any immediate index work --- most >> other storage engines would need that. >> I propose that we should finish the job by inventing CatalogTupleDelete(), >> which for the moment would be a trivial wrapper around >> simple_heap_delete(), maybe just a macro for it. >> >> If there's no objections I'll go make that happen in a day or two.
> Sounds good. So while I was working on this I got quite unhappy with the already-committed patch: it's a leaky abstraction in more ways than this, and it's created a possibly-serious performance regression for large objects (and maybe other places). The source of both of those problems is that in some places, we did CatalogOpenIndexes and then used the CatalogIndexState for multiple tuple inserts/updates before doing CatalogCloseIndexes. The patch dealt with these either by not touching them, just leaving the simple_heap_insert/update calls in place (thus failing to create any abstraction), or by blithely ignoring the optimization and doing s/simple_heap_insert/CatalogTupleInsert/ anyway. For example, in inv_api.c we are now doing a CatalogOpenIndexes/CatalogCloseIndexes cycle for each chunk of the large object ... and just to add insult to injury, the now-useless open/close calls outside the loop are still there. I think what we ought to do about this is invent additional API functions, say Oid CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, CatalogIndexState indstate); void CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup, CatalogIndexState indstate); and use these in place of simple_heap_foo plus CatalogIndexInsert in the places where this optimization had been applied. An alternative but much more complicated fix would be to get rid of the necessity for callers to worry about this at all, by caching a CatalogIndexState in the catalog's relcache entry. That might be worth doing eventually (because it would allow sharing index info collection across unrelated operations) but I don't want to do it today. Objections, better naming ideas? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers