On Tue, Mar 1, 2022 at 1:46 PM Robert Haas <robertmh...@gmail.com> wrote:
> I think that this is not really a description of an algorithm -- and I
> think that it is far from clear that the third "in-between" category
> does not need to exist.

But I already described the algorithm. It is very simple
mechanistically -- though that in itself means very little. As I have
said multiple times now, the hard part is assessing what the
implications are. And the even harder part is making a judgement about
whether or not those implications are what we generally want.

> I think findings like this are very unconvincing.

TPC-C may be unrealistic in certain ways, but it is nevertheless vastly
more realistic than pgbench. pgbench is really more of a stress test
than a benchmark.

The main reasons why TPC-C is interesting here are *very* simple, and
would likely be equally true with TPC-E (just for example) -- even
though TPC-E is a very different kind of OLTP benchmark overall. TPC-C
(like TPC-E) features a diversity of transaction types, some of which
are more complicated than others -- which is strictly more realistic
than having only one highly synthetic OLTP transaction type. Each
transaction type doesn't necessarily modify the same tables in the same
way. This leads to natural diversity among tables and among
transactions, including:

* The typical or average number of distinct XIDs per heap page varies
significantly from table to table. There are way fewer distinct XIDs
per "order line" table heap page than there are per "order" table heap
page, for the obvious reason.

* Roughly speaking, there are various different ways that free space
management ought to work in a system like Postgres. For example it is
necessary to make a "fragmentation vs space utilization" trade-off with
the new orders table.

* There are joins in some of the transactions!

Maybe TPC-C is a crude approximation of reality, but it nevertheless
exercises relevant parts of the system to a significant degree. What
else would you expect me to use, for a project like this? To a
significant degree the relfrozenxid tracking stuff is interesting
because tables tend to have natural differences like the ones I have
highlighted on this thread.
How could that not be the case? Why wouldn't we want to take advantage
of that?

There might be some danger in over-optimizing for this particular
benchmark, but right now that is so far from being the main problem
that the idea seems strange to me. pgbench doesn't need the FSM, at
all. In fact pgbench doesn't even really need VACUUM (except for
antiwraparound), once heap fillfactor is lowered to 95 or so. pgbench
simply isn't relevant, *at all*, except perhaps as a way of measuring
regressions in certain synthetic cases that don't benefit.

> TPC-C (or any
> benchmark really) is so simple as to be a terrible proxy for what
> vacuuming is going to look like on real-world systems.

Doesn't that amount to "no amount of any kind of testing or
benchmarking will convince me of anything, ever"?

There is more than one type of real-world system. I think that TPC-C is
representative of some real world systems in some regards. But even
that's not the important point for me. I find TPC-C generally
interesting for one reason: I can clearly see that Postgres does things
in a way that just doesn't make much sense, which isn't particularly
fundamental to how VACUUM works.

My only long term goal is to teach Postgres to *avoid* various
pathological cases exhibited by TPC-C (e.g., the B-Tree "split after
new tuple" mechanism from commit f21668f328 *avoids* a pathological
case from TPC-C). We don't necessarily have to agree on how important
each individual case is "in the real world" (which is impossible to
know anyway). We only have to agree that what we see is a pathological
case (because some reasonable expectation is dramatically violated),
and then work out a fix.

I don't want to teach Postgres to be clever -- I want to teach it to
avoid being stupid in cases where it exhibits behavior that really
cannot be described any other way.

You seem to talk about some of this work as if it was just as likely to
have a detrimental effect elsewhere, for some equally plausible
workload, which will have a downside that is roughly as bad as the
advertised upside. I consider that very unlikely, though. Sure,
regressions are quite possible, and a real concern -- but regressions
*like that* are unlikely. Avoiding doing what is clearly the wrong
thing just seems to work out that way, in general.

--
Peter Geoghegan