On Thu, Aug 11, 2022 at 1:48 AM Matthias van de Meent <boekewurm+postg...@gmail.com> wrote:
> I think I understand your reasoning, but I don't agree with the
> conclusion. The attached patch 0002 does fix that skew too, at what I
> consider negligible cost. 0001 is your patch with a new version
> number.

Your patch added allowSystemTableMods to one of the tests. I guess
that this was an oversight?

> I'm fine with your patch as is, but would appreciate it if known
> estimate mistakes would also be fixed.

Why do you think that this particular scenario/example deserves
special attention? As I've acknowledged already, it is true that your
scenario is one in which we provably give a less accurate estimate,
based on already-available information. But other than that, I don't
see any underlying principle that would be violated by my original
patch (any kind of principle, held by anybody). reltuples is just an
estimate.

I was thinking of going your way on this, purely because it didn't
seem like there'd be much harm in it (why not just handle your case
and be done with it?). But I don't think that it's a good idea now.
reltuples is usually derived by ANALYZE using a random sample, so the
idea that tuple density can be derived accurately enough from a random
sample is pretty baked in. You're talking about a case where ignoring
just one page ("sampling" all but one of the pages) *isn't* good
enough. It just doesn't seem like something that needs to be addressed
-- it's quite awkward to do so.

Barring any further objections, I plan on committing the original
version tomorrow.

> An alternative solution could be doing double-vetting, where we ignore
> tuples_scanned if <2% of pages AND <2% of previous estimated tuples
> was scanned.

I'm not sure that I've understood you, but I think that you're talking
about remembering more information (in pg_class), which is surely out
of scope for a bug fix.

--
Peter Geoghegan