On 9/17/13 6:10 PM, Andres Freund wrote:
What if we maintained XID stats for ranges of pages in a separate fork? Call it the XidStats fork. Presumably the interesting pieces would be min(xmin) and max(xmax) for pages that aren't all visible. If we did that at a granularity of, say, 1MB worth of pages[1] we're talking 8 bytes per MB, or 1 XidStats page per GB of heap. (Worst case alignment bumps that up to 2 XidStats pages per GB of heap.)
Yes, I have thought about similar ideas as well, but I came to the conclusion that it's not worth it. If you want to make the boundaries precise and the xidstats fork small, you're introducing new contention points because every DML will need to make sure it's correct.
Actually, that's not true... the XidStats only need to be "relatively" precise, i.e. within a few hundred or thousand XIDs. So, for example, you'd only need to attempt an update if the XID already stored was more than a few hundred/thousand/whatever XIDs away from your XID. If it's any closer, don't even bother to update. That still leaves potential for a thundering herd on the fork buffer lock if you've got a ton of DML on one table across a bunch of backends, but there might be other ways around that. For example, if you know you can update the XID with a CPU-atomic instruction, you don't need to lock the page.
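To make the "only update when you're far enough away" idea concrete, here is a rough sketch in C11 of what the max(xmax) side could look like. Everything in it is hypothetical: the names (Xid, XIDSTATS_SLOP, xidstats_note_xmax, xidstats_xmax_upper_bound), the tolerance of 1000, and the use of <stdatomic.h> are assumptions for illustration, not existing PostgreSQL code. It also ignores the special permanent XIDs and assumes a 32-bit XID slot can be updated with compare-and-swap on the target platform.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type and constant; not taken from PostgreSQL. */
typedef uint32_t Xid;

/* Assumed tolerance: "a few hundred or thousand" XIDs. */
#define XIDSTATS_SLOP 1000u

/* Wraparound-aware "a is older than b" (modulo-2^32 comparison). */
static inline bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Record that a tuple in this page range got xmax = xid.
 *
 * The slot is only touched when xid is more than XIDSTATS_SLOP newer
 * than the stored value; otherwise we return without writing anything.
 * The update itself is a compare-and-swap, so no page lock is taken.
 * Returns true if the slot was changed.
 */
static inline bool
xidstats_note_xmax(_Atomic Xid *slot, Xid xid)
{
    Xid cur = atomic_load_explicit(slot, memory_order_relaxed);

    while (xid_precedes(cur + XIDSTATS_SLOP, xid))
    {
        /* On failure, cur is reloaded with the value another backend stored. */
        if (atomic_compare_exchange_weak_explicit(slot, &cur, xid,
                                                  memory_order_release,
                                                  memory_order_relaxed))
            return true;
    }
    return false;               /* close enough already; skip the write */
}

/*
 * Readers must allow for the slop: the true max(xmax) may be up to
 * XIDSTATS_SLOP newer than what is stored.
 */
static inline Xid
xidstats_xmax_upper_bound(_Atomic Xid *slot)
{
    return atomic_load_explicit(slot, memory_order_acquire) + XIDSTATS_SLOP;
}
```

With a stored value of 5000, a backend running XID 5500 would leave the slot alone, while one running XID 7000 would swap it in. The cost of skipping nearby updates is that anything reading the stats has to treat the stored value as accurate only to within the tolerance.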
Also, the amount of code that would require seems to be bigger than is justified by the gain in precision about when to vacuum.
That's very possibly true. I haven't had a chance to see how much the VM bits help reduce vacuum overhead yet, so I don't have anything to add on this front. Perhaps others might.

--
Jim C. Nasby, Data Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net