On 2013-06-27 18:18:50 -0400, Tom Lane wrote:
> Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
> > I'm looking at the combined patches 0003-0005, which are essentially all
> > about adding a function to obtain relation OID from (tablespace,
> > filenode). It takes care to look through the relation mapper, and uses
> > a new syscache underneath for performance.
>
> > One question about this patch, originally, was about the usage of
> > that relfilenode syscache. It is questionable because it would be the
> > only syscache to apply on top of a non-unique index.
>
> ... which, I assume, is on top of a pg_class index that doesn't exist
> today. Exactly what is the argument that says performance of this
> function is sufficiently critical to justify adding both the maintenance
> overhead of a new pg_class index, *and* a broken-by-design syscache?
Ok, so this requires some context. When we do the changeset extraction we
build an MVCC snapshot that, for every heap WAL record, is consistent with
one made at the time the record was inserted. Once we've built that
snapshot, we can use it to turn heap WAL records into the representation
the user wants.

For that we first need to know which table a change comes from, since
otherwise we obviously cannot interpret the HeapTuple that's essentially
contained in the WAL record. Since we have a correct MVCC snapshot we can
query pg_class for (tablespace, relfilenode) to get back the relation.
When we know the relation, the user (i.e. the output plugin) can use
normal backend code to transform the HeapTuple into the target
representation, e.g. SQL, since we can build a TupleDesc. Since the
syscaches are synchronized with the built snapshot, normal output
functions can be used.

What that means is that for every heap record in the WAL that belongs to
the target database, we need to query pg_class to turn the relfilenode
into a pg_class.oid. So, we can easily replace syscache.c with some custom
caching code, but I don't think it's realistic to get rid of that index.
Otherwise we would need to cache the entire pg_class in memory, which
doesn't sound enticing.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
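
As a rough illustration of the "custom caching code" option mentioned above,
here is a minimal sketch, in Python for brevity rather than backend C. It is
not the code from the patch. The names RelfilenodeCache, index_lookup and
resolve_records are hypothetical, the pg_class contents in the usage note are
made-up sample values, and the relation mapper is not modelled at all. The
point is only that each distinct (tablespace, relfilenode) pair costs one
index lookup, after which later WAL records for the same relation are
answered from memory without caching all of pg_class.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Cache key: (tablespace OID, relfilenode), as described in the mail.
RelKey = Tuple[int, int]


class RelfilenodeCache:
    """Hypothetical on-demand cache: (tablespace, relfilenode) -> pg_class.oid.

    `index_lookup` stands in for a scan of the proposed pg_class index; it is
    an assumption of this sketch, not an existing backend function.
    """

    def __init__(self, index_lookup: Callable[[int, int], Optional[int]]):
        self._index_lookup = index_lookup
        self._entries: Dict[RelKey, Optional[int]] = {}
        self.misses = 0  # number of times the index had to be consulted

    def lookup(self, tablespace: int, relfilenode: int) -> Optional[int]:
        key = (tablespace, relfilenode)
        if key not in self._entries:
            # Cache miss: consult the index once and remember the answer,
            # including "no such relation" (None).
            self.misses += 1
            self._entries[key] = self._index_lookup(tablespace, relfilenode)
        return self._entries[key]

    def invalidate(self) -> None:
        # Assumption: called whenever the catalog contents visible to the
        # decoding snapshot may have changed (e.g. a table got a new
        # relfilenode), so stale mappings are never returned.
        self._entries.clear()


def resolve_records(cache: RelfilenodeCache,
                    records: Iterable[RelKey]) -> List[Optional[int]]:
    """Map each heap record's (tablespace, relfilenode) to a relation OID."""
    return [cache.lookup(ts, rfn) for ts, rfn in records]
```

A usage sketch with invented OIDs: given
pg_class = {(1663, 16385): 16384, (1663, 16390): 16388} and
cache = RelfilenodeCache(lambda ts, rfn: pg_class.get((ts, rfn))), calling
resolve_records(cache, [(1663, 16385), (1663, 16385), (1663, 16390)])
consults the index only twice for three records.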