On Wed, May 13, 2020 at 5:33 PM Peter Geoghegan <p...@bowt.ie> wrote:
> Do you recall seeing corruption resulting in segfaults in production?

I have seen that, I believe. I think it's more common to fail with errors about not being able to palloc >1GB, not being able to look up an xid or mxid, etc., but I am pretty sure I've seen multiple cases involving seg faults, too. Unfortunately for my credibility, I can't remember the details right now.

> I personally don't recall seeing that. If it happened, the segfaults
> themselves probably wouldn't be the main concern.

I don't really agree. Hypothetically speaking, suppose you corrupt your only copy of a critical table in such a way that every time you select from it, the system seg faults. A user in this situation might ask questions like:

1. How did my table get corrupted?
2. Why do I only have one copy of it?
3. How do I retrieve the non-corrupted portion of my data from that table and get back up and running?

In the grand scheme of things, #1 and #2 are the most important questions, but when something like this actually happens, #3 tends to be the most urgent question, and it's a lot harder to get the uncorrupted data out if the system keeps crashing. Also, a seg fault tends to lead customers to think that the database has a bug, rather than that the database is corrupted.

Slightly off-topic here, but I think our error reporting in this area is pretty lame. I've learned over the years that when a customer reports that they get a complaint about a too-large memory allocation every time they access a table, they've probably got a corrupted varlena header. However, that's extremely non-obvious to a typical user. We should try to report errors indicative of corruption in a way that gives the user some clue that corruption has happened. Peter made a stab at improving things there by adding errcode(ERRCODE_DATA_CORRUPTED) in a bunch of places, but a lot of users will never see the error code, only the message, and a lot of corruption still produces errors that weren't changed by that commit.
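To make the varlena point concrete, here is a rough sketch in Python. It is a simplified model, not PostgreSQL code: it assumes a plain 4-byte little-endian length prefix rather than the real varlena bit layout, the 0x3FFFFFFF limit is only modelled on the 1GB allocation cap, and the names DataCorruptedError, encode and decode are made up for illustration. It shows how one flipped bit in a length header turns into an absurd size, and how the error message could say so directly instead of complaining about memory.

```python
import struct

# Assumed limit (1 GB - 1), modelled on the allocation cap discussed above.
MAX_ALLOC = 0x3FFFFFFF


class DataCorruptedError(Exception):
    """Hypothetical error type for length headers that cannot be valid."""


def encode(payload: bytes) -> bytes:
    # Simplified format: 4-byte little-endian total length
    # (header included), followed by the payload itself.
    return struct.pack("<I", len(payload) + 4) + payload


def decode(buf: bytes) -> bytes:
    (total,) = struct.unpack_from("<I", buf, 0)
    # Reject impossible lengths up front and say that corruption is
    # the likely cause, rather than failing later on a huge allocation.
    if total < 4 or total > MAX_ALLOC or total > len(buf):
        raise DataCorruptedError(
            f"length header claims {total} bytes but only {len(buf)} "
            "are available; the stored data is probably corrupted"
        )
    return buf[4:total]


good = encode(b"hello")

# Simulate corruption: flip the top bit of the 4-byte length word.
bad = bytearray(good)
bad[3] ^= 0x80

print(decode(good))
try:
    decode(bytes(bad))
except DataCorruptedError as exc:
    print("ERROR:", exc)
```

In this model the damaged header claims a length of just over 2GB for a 9-byte value, which is the same kind of nonsense size that would otherwise show up as a complaint about a too-large memory allocation.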
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company