Re: [PERFORM] Setting vacuum_freeze_min_age really low

2013-05-23 Thread Josh Berkus
HAndres,

 Well. For one you haven't proven that the changed setting actually
 improves performance. So the comparison isn't really valid. We will

I agree that I haven't proven this yet, but that doesn't make it
invalid.  Just unproven.

I agree that performance testing is necessary ... and the kind of
performance testing which generated freeze activity, which makes it harder.

 I think you're missing the fact that we don't neccessarily dirty pages,
 just because vacuum visits them. In a mostly insert workload its not
 uncommon that vacuum doesn't change anything. In many scenarios the

Hmmm.  But does vacuum visit the pages anyway, in that case?

 b) freezing tuples requires a xlog_heap_freeze wal record to be
emitted. If we don't freeze, we don't need to emit it.

Oh, that's annoying.

 I think I have said that before, but anyway: I think as long as we need
 to regularly walk the whole relation for correctness there isn't much
 hope to get this into an acceptable state. If we would track the oldest
 xid in a page in a 'freeze map' we could make much of this more
 efficient and way more scalable to bigger data volumes.

Yeah, or come up with some way to eliminate freezing entirely.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Setting vacuum_freeze_min_age really low

2013-05-13 Thread Robert Haas
On Sun, May 12, 2013 at 8:50 AM, Andres Freund and...@2ndquadrant.com wrote:
 [ a response that I entirely agree with ]

+1 to all that.

It's maybe worth noting that it's probably fairly uncommon for vacuum
to read a page and not dirty it, because if the page is all-visible,
we won't read it.  And if it's not all-visible, and there's nothing
else interesting to do with it, we'll probably make it all-visible,
which will dirty it.  It can happen, if for example we vacuum a page
with no dead tuples while the inserting transaction is still running,
or committed but not yet all-visible.  Of course, in those cases we
won't be able to freeze, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Setting vacuum_freeze_min_age really low

2013-05-13 Thread Andres Freund
On 2013-05-13 13:21:54 -0400, Robert Haas wrote:
 On Sun, May 12, 2013 at 8:50 AM, Andres Freund and...@2ndquadrant.com wrote:
  [ a response that I entirely agree with ]
 
 +1 to all that.

 It's maybe worth noting that it's probably fairly uncommon for vacuum
 to read a page and not dirty it, because if the page is all-visible,
 we won't read it.

But only if 50(?)+ pages are marked all-visible in one go, otherwise we
afair won't skip afair. And we don't skip them at all during full table
vacuums.

  And if it's not all-visible, and there's nothing
 else interesting to do with it, we'll probably make it all-visible,
 which will dirty it.  It can happen, if for example we vacuum a page
 with no dead tuples while the inserting transaction is still running,
 or committed but not yet all-visible.  Of course, in those cases we
 won't be able to freeze, either.

IIRC the actual values below which we freeze are always computed
relative to GetOldestXmin() (and have to, otherwise rows will suddently
appear visible). In many, many environment thats lagging behind quite a
bit. Longrunning user transactions, pg_dump, hot_standby_feedback,
vacuum_defer_cleanup_age...

Also, even if the *whole* page isn't all visible because e.g. there just
was another row inserted we still freeze individual rows.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Setting vacuum_freeze_min_age really low

2013-05-12 Thread Andres Freund
Hi Josh,

On 2013-05-11 16:28:32 -0700, Josh Berkus wrote:
  That, and Tom's concern about forensics, which I understand to be the
  larger sticking point.
 
 I don't buy the idea that we should cause regular recurring performance
 issues for all of our users in order to aid diagnosing the kind of
 issues which happen 1% of the time to 2% of our users.

Well. For one you haven't proven that the changed setting actually
improves performance. So the comparison isn't really valid. We will
still need full table vacuums to be able to change relfrozenxids. Also,
he small percentages are the cases where the shit really hit the
fan. Making sure you have at least some chance of a) diagnosing the
issue b) recovering data is a pretty good thing.

  So, if the table's age is less than vacuum_freeze_table_age, we'll
  only scan pages not already marked all-visible.  Regardless of vfma,
  we probably won't freeze much.
 
 Right, but the pages which were dirtied *anyway* will get frozen.

I think you're missing the fact that we don't neccessarily dirty pages,
just because vacuum visits them. In a mostly insert workload its not
uncommon that vacuum doesn't change anything. In many scenarios the
first time vacuum visits a page it cannot yet me marked all-visible
yet so we will visit again soon after anyway. And after that there will
be regular full table vacuums.

  It will also often enough lead to a page being frozen repeatedly which
  causes unneccessary IO and WAL traffic. If a page contains pages from
  several transactions its not unlikely that some tuples are older and
  some are newer than vfma. That scenario isn't unlikely because of two
  scenarios:
 
 Nobody has yet explained to me where this extra WAL and IO traffic would
 come from.  vfma only takes effect if the page is being vacuumed
 *anyway*.

There's multiple points here:
a) we don't necessarily write/dirty anything if vacuum doesn't find
   anything to do
b) freezing tuples requires a xlog_heap_freeze wal record to be
   emitted. If we don't freeze, we don't need to emit it.

  And if the page is being vacuumed anyway, the page is being
 rewritten anyway, and it doesn't matter how many changes we make on that
 page, except as far as CPU time is concerned.  As far as IO is
 concerned, an 8K page is an 8K page.  No?

Sure, *if* we writeout the page, it doesn't matter at all whether we
changed one byte or all of them. Unless it also requires extra xlog
records to be emitted.

 The only time I can imagine this resulting in extra IO is if vacuum is
 regularly visiting pages which don't have any other work to do, but do
 have tuples which could be frozen if vfma was lowered.  I would tend to
 think that this would be a tiny minority of pages, but testing may be
 the only way to answer that.

INSERT only produces workloads like that.

  When a page contains freezable items, as determined by freeze_min_age,
  and we are doing a full table scan we won't skip buffers that we can't
  lock for cleanup. Instead we will wait and then lock them for
  cleanup. So I think this would be rather noticeably impact the speed of
  vacuum (since it waits more often) and concurrency (since we lock more
  buffers than before, even if they are actively used).
 
 Well, that behavior sounds like something we should maybe fix,
 regardless of whether we're lowering the default vfma or not.

Well, that's easier said than done ;)

I wonder if we couldn't do the actual freezeing - not the dead tuple
deletion - without a cleanup but just with an exclusive lock?

I think I have said that before, but anyway: I think as long as we need
to regularly walk the whole relation for correctness there isn't much
hope to get this into an acceptable state. If we would track the oldest
xid in a page in a 'freeze map' we could make much of this more
efficient and way more scalable to bigger data volumes.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Setting vacuum_freeze_min_age really low

2013-05-11 Thread Josh Berkus
Robert, Andres,

 That, and Tom's concern about forensics, which I understand to be the
 larger sticking point.

I don't buy the idea that we should cause regular recurring performance
issues for all of our users in order to aid diagnosing the kind of
issues which happen 1% of the time to 2% of our users.

 So, if the table's age is less than vacuum_freeze_table_age, we'll
 only scan pages not already marked all-visible.  Regardless of vfma,
 we probably won't freeze much.

Right, but the pages which were dirtied *anyway* will get frozen.

 On the other hand, if the table's age is at least
 vacuum_freeze_table_age, we'll scan the whole table and freeze a lotta
 stuff all at once.  Again, whether vfma is high or low won't matter
 much: it's definitely less than vacuum_freeze_table_age.

Right.

 Basically, I would guess that both the costs and the benefits of
 changing this are pretty small.  It would be nice to hear from someone
 who has tried it, though.

Well, I have, but I don't exactly have empirical testing results from
it.  That's really the sticking point here: can we measurably
demonstrate that lowering vfma makes autovacuum freeze happen less
often, and do less work when it does?  Realistically, I think that's
waiting on me having time to do some lengthy performance testing.

 It will also often enough lead to a page being frozen repeatedly which
 causes unneccessary IO and WAL traffic. If a page contains pages from
 several transactions its not unlikely that some tuples are older and
 some are newer than vfma. That scenario isn't unlikely because of two
 scenarios:

Nobody has yet explained to me where this extra WAL and IO traffic would
come from.  vfma only takes effect if the page is being vacuumed
*anyway*.  And if the page is being vacuumed anyway, the page is being
rewritten anyway, and it doesn't matter how many changes we make on that
page, except as far as CPU time is concerned.  As far as IO is
concerned, an 8K page is an 8K page.  No?

The only time I can imagine this resulting in extra IO is if vacuum is
regularly visiting pages which don't have any other work to do, but do
have tuples which could be frozen if vfma was lowered.  I would tend to
think that this would be a tiny minority of pages, but testing may be
the only way to answer that.

 When a page contains freezable items, as determined by freeze_min_age,
 and we are doing a full table scan we won't skip buffers that we can't
 lock for cleanup. Instead we will wait and then lock them for
 cleanup. So I think this would be rather noticeably impact the speed of
 vacuum (since it waits more often) and concurrency (since we lock more
 buffers than before, even if they are actively used).

Well, that behavior sounds like something we should maybe fix,
regardless of whether we're lowering the default vfma or not.

--Josh Berkus






-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Setting vacuum_freeze_min_age really low

2013-05-09 Thread Robert Haas
On Mon, Mar 25, 2013 at 4:31 PM, Josh Berkus j...@agliodbs.com wrote:
 In the past, setting vacuum_freeze_min_age (vfma) really low (say to
 1 or 5) would have caused lots of extra writing work due to
 dirtying extra pages for freezing.  This has been our stated reason to
 keep vfma high, despite the obvious advantage of freezing tuples while
 they're still in the cache.

That, and Tom's concern about forensics, which I understand to be the
larger sticking point.

 With the visibility map, though, vfma should only be dirtying pages
 which vacuum is already visiting because there's dirty tuples on the
 page.  That is, pages which vacuum will probably dirty anyway, freezing
 or not.  (This is assuming one has applied the 9.2.3 update.)

I think this is probably not accurate, although I'll defer to someone
with more real-world experience.  I'd guess that it's uncommon for
actively updated data and very-rarely-updated data to be mixed
together on the same pages with any real regularity.  IOW, the dirty
pages probably don't have anything on them that can be frozen anyway.

So, if the table's age is less than vacuum_freeze_table_age, we'll
only scan pages not already marked all-visible.  Regardless of vfma,
we probably won't freeze much.

On the other hand, if the table's age is at least
vacuum_freeze_table_age, we'll scan the whole table and freeze a lotta
stuff all at once.  Again, whether vfma is high or low won't matter
much: it's definitely less than vacuum_freeze_table_age.

Basically, I would guess that both the costs and the benefits of
changing this are pretty small.  It would be nice to hear from someone
who has tried it, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Setting vacuum_freeze_min_age really low

2013-05-09 Thread Andres Freund
Hi,

On 2013-03-25 13:31:17 -0700, Josh Berkus wrote:
 In the past, setting vacuum_freeze_min_age (vfma) really low (say to
 1 or 5) would have caused lots of extra writing work due to
 dirtying extra pages for freezing.  This has been our stated reason to
 keep vfma high, despite the obvious advantage of freezing tuples while
 they're still in the cache.
 
 With the visibility map, though, vfma should only be dirtying pages
 which vacuum is already visiting because there's dirty tuples on the
 page.  That is, pages which vacuum will probably dirty anyway, freezing
 or not.  (This is assuming one has applied the 9.2.3 update.)
 
 Given that, it seems like the cost of lowering vfma *should* be
 marginal.  The only extra work done by a lower vfma should be:
 
 1. extra cpu time to put in the froxenXIDs on vacuumed pages, and
 2. dirtying the minority of pages which vacuum decided to scan, but not
 write to.

It will also often enough lead to a page being frozen repeatedly which
causes unneccessary IO and WAL traffic. If a page contains pages from
several transactions its not unlikely that some tuples are older and
some are newer than vfma. That scenario isn't unlikely because of two
scenarios:
- INSERT/UPDATE reusing space on older pages where tuples have been
  deleted.
- When a backend extends a relation that page is *not* known to have
  free space to other relations. Until vacuum comes along for the first
  time only this backend will use its space. Given that busy clusters
  frequently burn loads of xids per second it is not uncommon to have a
  wide range of xids on such a page.

 And are there other costs I'm not thinking of?

I think (but am not 100% sure right now) it would have another rather
big cost:
When a page contains freezable items, as determined by freeze_min_age,
and we are doing a full table scan we won't skip buffers that we can't
lock for cleanup. Instead we will wait and then lock them for
cleanup. So I think this would be rather noticeably impact the speed of
vacuum (since it waits more often) and concurrency (since we lock more
buffers than before, even if they are actively used).

Makes sense?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Setting vacuum_freeze_min_age really low

2013-05-09 Thread Andres Freund
On 2013-05-09 12:09:04 -0400, Robert Haas wrote:
 On Mon, Mar 25, 2013 at 4:31 PM, Josh Berkus j...@agliodbs.com wrote:
  In the past, setting vacuum_freeze_min_age (vfma) really low (say to
  1 or 5) would have caused lots of extra writing work due to
  dirtying extra pages for freezing.  This has been our stated reason to
  keep vfma high, despite the obvious advantage of freezing tuples while
  they're still in the cache.
 
 That, and Tom's concern about forensics, which I understand to be the
 larger sticking point.

FWIW I found having sensible xmin/xmax repeatedly really useful for
debugging problems. Most problems don't get noticed within minutes so
loosing evidence that fast is bad.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Setting vacuum_freeze_min_age really low

2013-03-25 Thread Josh Berkus
Folks,

In the past, setting vacuum_freeze_min_age (vfma) really low (say to
1 or 5) would have caused lots of extra writing work due to
dirtying extra pages for freezing.  This has been our stated reason to
keep vfma high, despite the obvious advantage of freezing tuples while
they're still in the cache.

With the visibility map, though, vfma should only be dirtying pages
which vacuum is already visiting because there's dirty tuples on the
page.  That is, pages which vacuum will probably dirty anyway, freezing
or not.  (This is assuming one has applied the 9.2.3 update.)

Given that, it seems like the cost of lowering vfma *should* be
marginal.  The only extra work done by a lower vfma should be:

1. extra cpu time to put in the froxenXIDs on vacuumed pages, and
2. dirtying the minority of pages which vacuum decided to scan, but not
write to.

The second point is the one where I'm not sure how to evaluate.  How
likely, as of 9.2, is vacuum to visit a page and not dirty it?  And are
there other costs I'm not thinking of?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance