On Sun, Feb 19, 2017 at 3:52 PM, Pavan Deolasee <pavan.deola...@gmail.com> wrote: > This particular case of corruption results in a heap tuple getting indexed > by a wrong key (or to be precise, indexed by its old value). So the only way > to detect the corruption is to look at each index key and check if it > matches with the corresponding heap tuple. We could write some kind of self > join that can use a sequential scan and an index-only scan (David Rowley had > suggested something of that sort internally here), but we can't guarantee > index-only scan on a table which is being concurrently updated. So not sure > even that will catch every possible case.
Oh, so the problem isn't index entries that are altogether missing? I guess I was confused. You can certainly guarantee an index-only scan if you write the validation code in C rather than using SQL. I think the issue is that if the table is large enough that keeping a TID -> index value mapping in a hash table is impractical, there's not going to be a real efficient strategy for this. Ignoring the question of whether you use the main executor for this or just roll your own code, your options for a large table are (1) a multi-batch hash join, (2) a nested loop, and (3) a merge join. (2) is easy to implement but will generate a ton of random I/O if the table is not resident in RAM. (3) is most suitable for very large tables but takes more work to code, and is also likely to be a lot slower for small tables than a hash or nestloop-based approach. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers