Re: [HACKERS] XIDs and big boxes again ...

2008-05-12 Thread Hans-Juergen Schoenig

Joshua D. Drake wrote:

Hans-Juergen Schoenig wrote:


overhead is not an issue here - if i lose 10 or 15% i am totally fine 
as long as i can reduce vacuum overhead to an absolute minimum.

overhead will vary with row sizes anyway - this is not the point.


I am not buying this argument. If you have a 5TB database, I am going 
to assume you put it on enterprise class hardware. Enterprise class 
hardware can handle the I/O required to appropriately run vacuum.


We have a customer that is constantly running 5 autovacuum workers on 
only 28 spindles. We are in the process of upgrading them to 50 
spindles at which point I will likely try 10 autovacuum workers.





i forgot to mention - i am on 8.1 here.
so, VACUUM is not so smart yet.

my changes are pretty much random I/O - so the tuple header does not 
contribute a lot more I/O, as i have to read entire blocks anyway.

this is why i said - it is not that kind of an issue.

and no, updating is not a 5 min task ...

   hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XIDs and big boxes again ...

2008-05-12 Thread Gregory Stark
Hans-Juergen Schoenig [EMAIL PROTECTED] writes:

 i forgot to mention - i am on 8.1 here.
 so, VACUUM is not so smart yet.

So even if we added 64-bit XIDs it wouldn't be useful to you. You would have
to upgrade (at which point you get all the other improvements, which make it
less useful), or at the very least rebuild with the patch and dump and reload,
which is just as hard.

 my changes are pretty much random I/O - so tuple header does not contribute to
 a lot more I/O as i have to read entire blocks anyway.
 this is why i said - it is not that kind of an issue.

TPC-C experiments show that even on random access you get the same performance
hit from bloat. I'm not entirely sure what the mechanism is, unless it's simply
the cache hit rate being hurt by the wasted memory.

 and no, updating is not a 5 min task ...

I do hope you mean 8.1.11 btw. Updating your binaries should be a 5 minute job
and there are real bugs fixed in those releases.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] XIDs and big boxes again ...

2008-05-12 Thread Matthew T. O'Connor

Hans-Juergen Schoenig wrote:
i suggest to introduce a --with-long-xids flag which would give me 62 / 
64 bit XIDs per vacuum on the entire database.

this should be fairly easy to implement.
i am not too concerned about the size of the tuple header here - if we 
waste 500 gb of storage here i am totally fine.


As you say later in the thread, you are on 8.1.  A lot of work has gone 
into reducing the effect, impact, and frequency of XID wraparound and 
vacuuming since then.  In 8.3, transactions that don't actually update a 
table no longer use a real XID, and with autovacuum you no longer need a 
database-wide vacuum to solve the XID wraparound problem.  So I think the 
answer is: upgrade to 8.3 and see if you still have this problem.


Matthew O'Connor



Re: [HACKERS] XIDs and big boxes again ...

2008-05-11 Thread Gregory Stark
Hans-Juergen Schoenig [EMAIL PROTECTED] writes:

 my DB is facing around 600mio transaction a month. 85% of those contain at
 least some small modification so I cannot save on XIDs.

What's a mio? Assuming it's short for million, I don't see the problem. The
transaction horizon is 2 *billion*, so as long as you can complete a vacuum of
every table once every 3 months you should be fine.
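
[Editor's note: the arithmetic behind that claim, using the figures quoted in
this thread (the 2-billion horizon is approximate), can be sketched as:]

```python
# Back-of-the-envelope check of the figures in this thread: roughly
# 600 million XID-consuming transactions per month, against a wraparound
# horizon of about 2 billion transactions (~2^31).

XID_HORIZON = 2_000_000_000      # approximate anti-wraparound limit
xids_per_month = 600_000_000     # figure quoted in the thread

months_of_headroom = XID_HORIZON / xids_per_month
print(f"{months_of_headroom:.1f} months between mandatory full freezes")
```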

 my problem is that I cannot VACUUM FREEZE my 150k tables where most of the
 data is, as I have a couple of thousand transactions a day modifying this data.

vacuum freeze doesn't take any locks. But in any case vacuum freeze would only
extend the vacuum horizon by 100k so it would hardly make any difference.

 but, i also have troubles to prevent myself from transaction wraparound as it
 is pretty boring to vacuum that much data under heavy load - with some useful
 vacuum delay it just takes too long.

How long is too long?

 i basically have to vacuum the entire database too often to get spare XIDs.

How often is too often? 

 i suggest to introduce a --with-long-xids flag which would give me 62 / 64 bit
 XIDs per vacuum on the entire database.
 this should be fairly easy to implement.
 i am not too concerned about the size of the tuple header here - if we waste
 500 gb of storage here i am totally fine.

Keep in mind you're proposing to make everything run 3% slower instead of
using that 3% i/o bandwidth headroom to run vacuum outside the critical path.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] XIDs and big boxes again ...

2008-05-11 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 ... Keep in mind you're proposing to make everything run 3% slower instead of
 using that 3% i/o bandwidth headroom to run vacuum outside the critical path.

I think that's actually understating the problem.  Assuming this is a
64-bit machine (which it had better be, if you want XID to be 64 bits...)
then the effective increase in tuple header size is not just 12 bytes
but 16 bytes, due to alignment padding.  Greg's 3% overhead number is
only on-target if your average row width is presently about 530 bytes.
It could easily be a whole lot less than that, and the overhead
proportionally higher.
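
[Editor's note: a simplified sketch of how that fixed per-tuple cost scales
with row width; this ignores the existing tuple header and per-page overhead,
so the numbers are only illustrative:]

```python
# Tom's point: widening the XID fields adds 12 bytes per tuple header,
# which alignment (MAXALIGN) pads to 16 bytes on a 64-bit machine. The
# relative overhead therefore depends entirely on the average row width.

EXTRA_BYTES = 16  # 12 bytes of wider XIDs, rounded up by alignment padding

for row_width in (100, 250, 530, 1000):
    overhead = EXTRA_BYTES / row_width
    print(f"avg row {row_width:4d} B -> {overhead:5.1%} extra storage and I/O")
```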

regards, tom lane



Re: [HACKERS] XIDs and big boxes again ...

2008-05-11 Thread Hans-Juergen Schoenig

Tom Lane wrote:

Gregory Stark [EMAIL PROTECTED] writes:
  

... Keep in mind you're proposing to make everything run 3% slower instead of
using that 3% i/o bandwidth headroom to run vacuum outside the critical path.



I think that's actually understating the problem.  Assuming this is a
64-bit machine (which it had better be, if you want XID to be 64 bits...)
then the effective increase in tuple header size is not just 12 bytes
but 16 bytes, due to alignment padding.  Greg's 3% overhead number is
only on-target if your average row width is presently about 530 bytes.
It could easily be a whole lot less than that, and the overhead
proportionally higher.

regards, tom lane
  



overhead is not an issue here - if i lose 10 or 15% i am totally fine as 
long as i can reduce vacuum overhead to an absolute minimum.

overhead will vary with row sizes anyway - this is not the point.

the point is that you don't want to be forced to vacuum a table when only 
a handful of records have been changed.


   many thanks,

  hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com




Re: [HACKERS] XIDs and big boxes again ...

2008-05-11 Thread Tom Lane
Hans-Juergen Schoenig [EMAIL PROTECTED] writes:
 overhead is not an issue here - if i lose 10 or 15% i am totally fine as 
 long as i can reduce vacuum overhead to an absolute minimum.

I cannot see the sanity of taking a ~10% hit on all I/O activity
(especially foreground queries) to avoid having background vacuuming
going on --- at least assuming that we can keep the impact of vacuuming
below 10%, which I should hope that we could.  What your problem sounds
like to me is that you need a smarter autovacuum scheduler.  Some of the
map-fork ideas we've discussed would also help, by allowing vacuum to
skip pages that're known to contain only frozen tuples --- your large
low-turnover tables would probably have a lot of those.
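
[Editor's note: the "map fork" idea above can be sketched conceptually like
this; the names and structures are hypothetical illustrations, not PostgreSQL
internals:]

```python
# Toy model of a per-page "all frozen" map: one flag per heap page saying
# that every tuple on it is already frozen, so an anti-wraparound vacuum
# can skip those pages. Any update clears the page's flag.

class FrozenMap:
    """Hypothetical per-table map fork: one flag per heap page."""

    def __init__(self, n_pages: int):
        self.all_frozen = [False] * n_pages

    def mark_frozen(self, page: int) -> None:
        self.all_frozen[page] = True

    def clear(self, page: int) -> None:  # called on any tuple update
        self.all_frozen[page] = False

    def pages_to_vacuum(self) -> list:
        return [p for p, frozen in enumerate(self.all_frozen) if not frozen]

# A large, low-turnover table: almost every page stays all-frozen, so a
# freeze vacuum only has to visit the handful of pages that changed.
fmap = FrozenMap(100_000)
for p in range(100_000):
    fmap.mark_frozen(p)
fmap.clear(42)
fmap.clear(4711)
print(fmap.pages_to_vacuum())  # -> [42, 4711]
```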

regards, tom lane



Re: [HACKERS] XIDs and big boxes again ...

2008-05-11 Thread Joshua D. Drake

Hans-Juergen Schoenig wrote:


overhead is not an issue here - if i lose 10 or 15% i am totally fine as 
long as i can reduce vacuum overhead to an absolute minimum.

overhead will vary with row sizes anyway - this is not the point.


I am not buying this argument. If you have a 5TB database, I am going to 
assume you put it on enterprise class hardware. Enterprise class 
hardware can handle the I/O required to appropriately run vacuum.


We have a customer that is constantly running 5 autovacuum workers on 
only 28 spindles. We are in the process of upgrading them to 50 spindles 
at which point I will likely try 10 autovacuum workers.




the point is that you don't want to be forced to vacuum a table when only 
a handful of records have been changed.


Right, generally speaking 20% is reasonable, although I tend to be much 
more aggressive and try to keep it at 10%.


Sincerely,

Joshua D. Drake

