Re: [PERFORM] Setting vacuum_freeze_min_age really low
HAndres, > Well. For one you haven't proven that the changed setting actually > improves performance. So the comparison isn't really valid. We will I agree that I haven't proven this yet, but that doesn't make it invalid. Just unproven. I agree that performance testing is necessary ... and the kind of performance testing which generated freeze activity, which makes it harder. > I think you're missing the fact that we don't neccessarily dirty pages, > just because vacuum visits them. In a mostly insert workload its not > uncommon that vacuum doesn't change anything. In many scenarios the Hmmm. But does vacuum visit the pages anyway, in that case? > b) freezing tuples requires a xlog_heap_freeze wal record to be >emitted. If we don't freeze, we don't need to emit it. Oh, that's annoying. > I think I have said that before, but anyway: I think as long as we need > to regularly walk the whole relation for correctness there isn't much > hope to get this into an acceptable state. If we would track the oldest > xid in a page in a 'freeze map' we could make much of this more > efficient and way more scalable to bigger data volumes. Yeah, or come up with some way to eliminate freezing entirely. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
On 2013-05-13 13:21:54 -0400, Robert Haas wrote: > On Sun, May 12, 2013 at 8:50 AM, Andres Freund wrote: > > [ a response that I entirely agree with ] > > +1 to all that. > It's maybe worth noting that it's probably fairly uncommon for vacuum > to read a page and not dirty it, because if the page is all-visible, > we won't read it. But only if 50(?)+ pages are marked all-visible in one go, otherwise we afair won't skip afair. And we don't skip them at all during full table vacuums. > And if it's not all-visible, and there's nothing > else interesting to do with it, we'll probably make it all-visible, > which will dirty it. It can happen, if for example we vacuum a page > with no dead tuples while the inserting transaction is still running, > or committed but not yet all-visible. Of course, in those cases we > won't be able to freeze, either. IIRC the actual values below which we freeze are always computed relative to GetOldestXmin() (and have to, otherwise rows will suddently appear visible). In many, many environment thats lagging behind quite a bit. Longrunning user transactions, pg_dump, hot_standby_feedback, vacuum_defer_cleanup_age... Also, even if the *whole* page isn't all visible because e.g. there just was another row inserted we still freeze individual rows. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
On Sun, May 12, 2013 at 8:50 AM, Andres Freund wrote: > [ a response that I entirely agree with ] +1 to all that. It's maybe worth noting that it's probably fairly uncommon for vacuum to read a page and not dirty it, because if the page is all-visible, we won't read it. And if it's not all-visible, and there's nothing else interesting to do with it, we'll probably make it all-visible, which will dirty it. It can happen, if for example we vacuum a page with no dead tuples while the inserting transaction is still running, or committed but not yet all-visible. Of course, in those cases we won't be able to freeze, either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
Hi Josh, On 2013-05-11 16:28:32 -0700, Josh Berkus wrote: > > That, and Tom's concern about forensics, which I understand to be the > > larger sticking point. > > I don't buy the idea that we should cause regular recurring performance > issues for all of our users in order to aid diagnosing the kind of > issues which happen 1% of the time to 2% of our users. Well. For one you haven't proven that the changed setting actually improves performance. So the comparison isn't really valid. We will still need full table vacuums to be able to change relfrozenxids. Also, he small percentages are the cases where the shit really hit the fan. Making sure you have at least some chance of a) diagnosing the issue b) recovering data is a pretty good thing. > > So, if the table's age is less than vacuum_freeze_table_age, we'll > > only scan pages not already marked all-visible. Regardless of vfma, > > we probably won't freeze much. > Right, but the pages which were dirtied *anyway* will get frozen. I think you're missing the fact that we don't neccessarily dirty pages, just because vacuum visits them. In a mostly insert workload its not uncommon that vacuum doesn't change anything. In many scenarios the first time vacuum visits a page it cannot yet me marked "all-visible" yet so we will visit again soon after anyway. And after that there will be regular full table vacuums. > > It will also often enough lead to a page being frozen repeatedly which > > causes unneccessary IO and WAL traffic. If a page contains pages from > > several transactions its not unlikely that some tuples are older and > > some are newer than vfma. That scenario isn't unlikely because of two > > scenarios: > > Nobody has yet explained to me where this extra WAL and IO traffic would > come from. vfma only takes effect if the page is being vacuumed > *anyway*. There's multiple points here: a) we don't necessarily write/dirty anything if vacuum doesn't find anything to do b) freezing tuples requires a xlog_heap_freeze wal record to be emitted. If we don't freeze, we don't need to emit it. > And if the page is being vacuumed anyway, the page is being > rewritten anyway, and it doesn't matter how many changes we make on that > page, except as far as CPU time is concerned. As far as IO is > concerned, an 8K page is an 8K page. No? Sure, *if* we writeout the page, it doesn't matter at all whether we changed one byte or all of them. Unless it also requires extra xlog records to be emitted. > The only time I can imagine this resulting in extra IO is if vacuum is > regularly visiting pages which don't have any other work to do, but do > have tuples which could be frozen if vfma was lowered. I would tend to > think that this would be a tiny minority of pages, but testing may be > the only way to answer that. INSERT only produces workloads like that. > > When a page contains freezable items, as determined by freeze_min_age, > > and we are doing a full table scan we won't skip buffers that we can't > > lock for cleanup. Instead we will wait and then lock them for > > cleanup. So I think this would be rather noticeably impact the speed of > > vacuum (since it waits more often) and concurrency (since we lock more > > buffers than before, even if they are actively used). > > Well, that behavior sounds like something we should maybe fix, > regardless of whether we're lowering the default vfma or not. Well, that's easier said than done ;) I wonder if we couldn't do the actual freezeing - not the dead tuple deletion - without a cleanup but just with an exclusive lock? I think I have said that before, but anyway: I think as long as we need to regularly walk the whole relation for correctness there isn't much hope to get this into an acceptable state. If we would track the oldest xid in a page in a 'freeze map' we could make much of this more efficient and way more scalable to bigger data volumes. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
Robert, Andres, > That, and Tom's concern about forensics, which I understand to be the > larger sticking point. I don't buy the idea that we should cause regular recurring performance issues for all of our users in order to aid diagnosing the kind of issues which happen 1% of the time to 2% of our users. > So, if the table's age is less than vacuum_freeze_table_age, we'll > only scan pages not already marked all-visible. Regardless of vfma, > we probably won't freeze much. Right, but the pages which were dirtied *anyway* will get frozen. > On the other hand, if the table's age is at least > vacuum_freeze_table_age, we'll scan the whole table and freeze a lotta > stuff all at once. Again, whether vfma is high or low won't matter > much: it's definitely less than vacuum_freeze_table_age. Right. > Basically, I would guess that both the costs and the benefits of > changing this are pretty small. It would be nice to hear from someone > who has tried it, though. Well, I have, but I don't exactly have empirical testing results from it. That's really the sticking point here: can we measurably demonstrate that lowering vfma makes autovacuum freeze happen less often, and do less work when it does? Realistically, I think that's waiting on me having time to do some lengthy performance testing. > It will also often enough lead to a page being frozen repeatedly which > causes unneccessary IO and WAL traffic. If a page contains pages from > several transactions its not unlikely that some tuples are older and > some are newer than vfma. That scenario isn't unlikely because of two > scenarios: Nobody has yet explained to me where this extra WAL and IO traffic would come from. vfma only takes effect if the page is being vacuumed *anyway*. And if the page is being vacuumed anyway, the page is being rewritten anyway, and it doesn't matter how many changes we make on that page, except as far as CPU time is concerned. As far as IO is concerned, an 8K page is an 8K page. No? The only time I can imagine this resulting in extra IO is if vacuum is regularly visiting pages which don't have any other work to do, but do have tuples which could be frozen if vfma was lowered. I would tend to think that this would be a tiny minority of pages, but testing may be the only way to answer that. > When a page contains freezable items, as determined by freeze_min_age, > and we are doing a full table scan we won't skip buffers that we can't > lock for cleanup. Instead we will wait and then lock them for > cleanup. So I think this would be rather noticeably impact the speed of > vacuum (since it waits more often) and concurrency (since we lock more > buffers than before, even if they are actively used). Well, that behavior sounds like something we should maybe fix, regardless of whether we're lowering the default vfma or not. --Josh Berkus -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
On 2013-05-09 12:09:04 -0400, Robert Haas wrote: > On Mon, Mar 25, 2013 at 4:31 PM, Josh Berkus wrote: > > In the past, setting vacuum_freeze_min_age (vfma) really low (say to > > 1 or 5) would have caused lots of extra writing work due to > > dirtying extra pages for freezing. This has been our stated reason to > > keep vfma high, despite the obvious advantage of freezing tuples while > > they're still in the cache. > > That, and Tom's concern about forensics, which I understand to be the > larger sticking point. FWIW I found having sensible xmin/xmax repeatedly really useful for debugging problems. Most problems don't get noticed within minutes so loosing evidence that fast is bad. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
Hi, On 2013-03-25 13:31:17 -0700, Josh Berkus wrote: > In the past, setting vacuum_freeze_min_age (vfma) really low (say to > 1 or 5) would have caused lots of extra writing work due to > dirtying extra pages for freezing. This has been our stated reason to > keep vfma high, despite the obvious advantage of freezing tuples while > they're still in the cache. > > With the visibility map, though, vfma should only be dirtying pages > which vacuum is already visiting because there's dirty tuples on the > page. That is, pages which vacuum will probably dirty anyway, freezing > or not. (This is assuming one has applied the 9.2.3 update.) > > Given that, it seems like the cost of lowering vfma *should* be > marginal. The only extra work done by a lower vfma should be: > > 1. extra cpu time to put in the froxenXIDs on vacuumed pages, and > 2. dirtying the minority of pages which vacuum decided to scan, but not > write to. It will also often enough lead to a page being frozen repeatedly which causes unneccessary IO and WAL traffic. If a page contains pages from several transactions its not unlikely that some tuples are older and some are newer than vfma. That scenario isn't unlikely because of two scenarios: - INSERT/UPDATE reusing space on older pages where tuples have been deleted. - When a backend extends a relation that page is *not* known to have free space to other relations. Until vacuum comes along for the first time only this backend will use its space. Given that busy clusters frequently burn loads of xids per second it is not uncommon to have a wide range of xids on such a page. > And are there other costs I'm not thinking of? I think (but am not 100% sure right now) it would have another rather big cost: When a page contains freezable items, as determined by freeze_min_age, and we are doing a full table scan we won't skip buffers that we can't lock for cleanup. Instead we will wait and then lock them for cleanup. So I think this would be rather noticeably impact the speed of vacuum (since it waits more often) and concurrency (since we lock more buffers than before, even if they are actively used). Makes sense? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Setting vacuum_freeze_min_age really low
On Mon, Mar 25, 2013 at 4:31 PM, Josh Berkus wrote: > In the past, setting vacuum_freeze_min_age (vfma) really low (say to > 1 or 5) would have caused lots of extra writing work due to > dirtying extra pages for freezing. This has been our stated reason to > keep vfma high, despite the obvious advantage of freezing tuples while > they're still in the cache. That, and Tom's concern about forensics, which I understand to be the larger sticking point. > With the visibility map, though, vfma should only be dirtying pages > which vacuum is already visiting because there's dirty tuples on the > page. That is, pages which vacuum will probably dirty anyway, freezing > or not. (This is assuming one has applied the 9.2.3 update.) I think this is probably not accurate, although I'll defer to someone with more real-world experience. I'd guess that it's uncommon for actively updated data and very-rarely-updated data to be mixed together on the same pages with any real regularity. IOW, the dirty pages probably don't have anything on them that can be frozen anyway. So, if the table's age is less than vacuum_freeze_table_age, we'll only scan pages not already marked all-visible. Regardless of vfma, we probably won't freeze much. On the other hand, if the table's age is at least vacuum_freeze_table_age, we'll scan the whole table and freeze a lotta stuff all at once. Again, whether vfma is high or low won't matter much: it's definitely less than vacuum_freeze_table_age. Basically, I would guess that both the costs and the benefits of changing this are pretty small. It would be nice to hear from someone who has tried it, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Setting vacuum_freeze_min_age really low
Folks, In the past, setting vacuum_freeze_min_age (vfma) really low (say to 1 or 5) would have caused lots of extra writing work due to dirtying extra pages for freezing. This has been our stated reason to keep vfma high, despite the obvious advantage of freezing tuples while they're still in the cache. With the visibility map, though, vfma should only be dirtying pages which vacuum is already visiting because there's dirty tuples on the page. That is, pages which vacuum will probably dirty anyway, freezing or not. (This is assuming one has applied the 9.2.3 update.) Given that, it seems like the cost of lowering vfma *should* be marginal. The only extra work done by a lower vfma should be: 1. extra cpu time to put in the froxenXIDs on vacuumed pages, and 2. dirtying the minority of pages which vacuum decided to scan, but not write to. The second point is the one where I'm not sure how to evaluate. How likely, as of 9.2, is vacuum to visit a page and not dirty it? And are there other costs I'm not thinking of? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance