On Fri, Apr 6, 2018 at 8:55 AM, Pavan Deolasee <pavan.deola...@gmail.com> wrote:
> > > On Fri, Apr 6, 2018 at 2:34 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > >> a...@novozymes.com (Adam =?utf-8?Q?Sj=C3=B8gren?=) writes: >> >> [... still waiting for the result, I will return with what it said >> >> when the server does ...] >> >> > It did eventually finish, with the same result: >> >> Huh. So what we have here, apparently, is that regular MVCC snapshots >> think there is exactly one copy of the 1698936148/0 row, but TOAST fetches >> think there is more than one. This is darn odd, not least because we >> never do UPDATEs in toast tables, only inserts and deletes, so there >> certainly shouldn't be update chains there. >> >> It seems like you've got some corner case wherein SnapshotToast sees a row >> that isn't visible according to MVCC --- probably a row left over from >> some previous cycle of life. That is, I'm imagining the OID counter >> wrapped around and we've reused a toast OID, but for some reason there's >> still a row in the table with that OID. I'm not sure offhand how we could >> get into such a state. Alvaro, does this ring any bells (remembering that >> this is 9.3)? >> > > FWIW one of our support customers reported a very similar TOAST table > corruption issue last week which nearly caused an outage. After a lot of > analysis, I think I've now fully understood the reasons behind the > corruption, the underlying bug(s) and possible remedy. I am currently > working on writing a reproducible test case to demonstrate the problem and > writing the fix. More details on that soon. > > I've posted a reproducer and a proposed fix to -hackers [1] In the particular case that I investigated, a database crash/recovery was involved. But I think we should be able to create a scenario where OID wrap-around or a standby promotion triggers the problem. I don't know if any of that was involved in the cases reported on this thread, but I've a strong suspicion that the underlying bug is probably the same. Thanks, Pavan [1] https://www.postgresql.org/message-id/CABOikdOgWT2hHkYG3Wwo2cyZJq2zf s1FH0FgX-%3Dh4OLosXHf9w%40mail.gmail.com -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services