Re: [HACKERS] Vacuum and Transactions

2005-10-05 Thread Hannu Krosing
On Tue, 2005-10-04 at 11:10 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  The catch is that there are some other very active structures (like
  pg_listener for Slony) which, after a couple of hours without
  vacuuming, will quickly have the DB at an unreasonably high load (low
  tens), which seems to all but halt the vacuum on the large structure.
 
 Yeah.  We desperately need to reimplement listen/notify :-( ... that
 code was never designed to handle high event rates.

Sure. But it handles event rates of up to a few hundred events per
second amazingly well, given that pg_listener is cleaned up often
enough. Above a few hundred events per second it starts getting stuck
on locks.
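
One stopgap is to vacuum pg_listener very frequently from outside, e.g.
from cron. A minimal sketch ('mydb' is a placeholder):

    -- run every minute or so, per database with NOTIFY traffic:
    --   psql -c "VACUUM ANALYZE pg_listener;" mydb
    VACUUM ANALYZE pg_listener;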

It also seems that Slony can be modified to not use LISTEN/NOTIFY in
high-load situations (akin to high-performance network cards, which
switch from interrupt-driven mode to polling mode when the number of
packets per second reaches a certain threshold).

Unfortunately, Slony and LISTEN/NOTIFY are not the only place where
high-update-rate tables suffer from vacuum's inability to clean out
dead tuples while working in parallel with other, slower vacuums. In
real life there are other database tasks which also need some tables to
stay small, while others must be huge in order to work effectively.
Putting small and big tables in different databases and using
dblink-like functionality to access them is one solution for such
cases, but it is rather ugly :(

-- 
Hannu Krosing [EMAIL PROTECTED]




Re: [HACKERS] Vacuum and Transactions

2005-10-05 Thread Hannu Krosing
On Tue, 2005-10-04 at 00:26 -0400, Rod Taylor wrote:
 As I understand it, vacuum operates outside of the regular transaction
 system, and if you stop it (SIGTERM, or pg_cancel_backend()) some of
 the work it accomplished will be kept when it rolls back.
 
 For large structures with a ton of dead entries (of which I seem to
 have a case), running vacuum takes long enough that high-churn
 structures begin to experience difficulties.
 
 Is it reasonable to cancel and restart the vacuum process periodically
 (say every 12 hours) until it manages to complete the work? It takes
 about 2 hours to do the table scan, and should get in about 10 hours of
 index work each round.

It seems that the actual work done by lazy VACUUM is not rolled back
when you kill the backend doing the vacuum (though VACUUM is quite hard
to kill, and may require kill -9, with all the downsides of kill -9).

So, yes, as a last resort you can kill VACUUM (or rather limit its
lifetime by setting statement_timeout = XXX) and get some work done in
each run. It only makes sense if the timeout is big enough for vacuum
to complete the first scan over the heap (collecting dead tuples) and
then do some actual work. For a table with 3 indexes, the timeout must
be at least (1st heap scan + 3 index scans with no work + some portion
of the 2nd (cleanup) heap scan) to ever get the table completely
cleaned up.
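
A minimal sketch of that approach (the 12-hour figure is just the one
from Rod's example; statement_timeout is in milliseconds, and
'bigtable' is a placeholder name):

    -- in the session that runs the vacuum:
    SET statement_timeout = 43200000;  -- 12 hours, in milliseconds
    VACUUM VERBOSE bigtable;
    -- repeat (e.g. from cron) until a run completes within the timeout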

 The "vacuum ignores vacuum transaction" concept looks handy right now.

There is a patch for 8.1 on the PATCHES list (postponed to 8.2 :( ). It
can be backported to 8.0 quite easily.

-- 
Hannu Krosing [EMAIL PROTECTED]




Re: [HACKERS] Vacuum and Transactions

2005-10-05 Thread Zeugswetter Andreas DAZ SD

  Is it reasonable to cancel and restart the vacuum process
  periodically (say every 12 hours) until it manages to complete the
  work? It takes about 2 hours to do the table scan, and should get in
  about 10 hours of index work each round.

If we started the vacuum with the indexes and remembered the lowest xid
per index, we could then vacuum the heap up to the lowest of those
xids, no? We could then also vacuum each index separately.

Andreas



Re: [HACKERS] Vacuum and Transactions

2005-10-05 Thread Gaetano Mendola
Rod Taylor wrote:
 I have maintenance_work_mem set to about 1GB in size.

Isn't that a bit too much?


Regards
Gaetano Mendola





Re: [HACKERS] Vacuum and Transactions

2005-10-05 Thread Rod Taylor
  The "vacuum ignores vacuum transaction" concept looks handy right now.
 
 There is a patch for 8.1 in PATCHES list (postponed to 8.2 :( ). This
 can be backported to 8.0 quite easily.

Understood. I've seen them, but until they're well tested in the newest
version I won't be using them in a production environment.

I do appreciate the goal, and I look forward to this concept, or some
other method of splitting up the work vacuum needs to do, being applied
in the future.





Re: [HACKERS] Vacuum and Transactions

2005-10-05 Thread Rod Taylor
On Wed, 2005-10-05 at 09:53 +0300, Hannu Krosing wrote:
 On Tue, 2005-10-04 at 11:10 -0400, Tom Lane wrote:
  Rod Taylor [EMAIL PROTECTED] writes:
   The catch is that there are some other very active structures (like
   pg_listener for Slony) which, after a couple of hours without
   vacuuming, will quickly have the DB at an unreasonably high load
   (low tens), which seems to all but halt the vacuum on the large
   structure.
  
  Yeah.  We desperately need to reimplement listen/notify :-( ... that
  code was never designed to handle high event rates.
 
 Sure. But it handles event rates of up to a few hundred events per
 second amazingly well, given that pg_listener is cleaned up often
 enough.

Accomplishing the pg_listener cleanup often enough can be difficult in
some circumstances.

 It also seems that Slony can be modified to not use LISTEN/NOTIFY in
 high-load situations (akin to high-performance network cards, which
 switch from interrupt-driven mode to polling mode when the number of
 packets per second reaches a certain threshold).

I have other items in this database with high churn as well. Slony was
just an example.





Re: [HACKERS] Vacuum and Transactions

2005-10-05 Thread Chris Browne
[EMAIL PROTECTED] (Hannu Krosing) writes:
 It also seems that Slony can be modified to not use LISTEN/NOTIFY in
 high-load situations (akin to high-performance network cards, which
 switch from interrupt-driven mode to polling mode when the number of
 packets per second reaches a certain threshold).

Yeah, I want to do some more testing of that; it should be easy to
reduce the abuse of pg_listener a whole lot.

 Unfortunately, Slony and LISTEN/NOTIFY are not the only place where
 high-update-rate tables suffer from vacuum's inability to clean out
 dead tuples while working in parallel with other, slower vacuums. In
 real life there are other database tasks which also need some tables
 to stay small, while others must be huge in order to work effectively.
 Putting small and big tables in different databases and using
 dblink-like functionality to access them is one solution for such
 cases, but it is rather ugly :(

That eliminates the ability to utilize transactions on things that
ought to be updated in a single transaction...
-- 
output = (cbbrowne @ ntlug.org)
http://cbbrowne.com/info/lsf.html
MS-Windows: Proof that P.T. Barnum was correct. 



Re: [HACKERS] Vacuum and Transactions

2005-10-04 Thread Simon Riggs
On Tue, 2005-10-04 at 00:26 -0400, Rod Taylor wrote:
 As I understand it, vacuum operates outside of the regular transaction
 system, and if you stop it (SIGTERM, or pg_cancel_backend()) some of
 the work it accomplished will be kept when it rolls back.
 
 For large structures with a ton of dead entries (of which I seem to
 have a case), running vacuum takes long enough that high-churn
 structures begin to experience difficulties.
 
 Is it reasonable to cancel and restart the vacuum process periodically
 (say every 12 hours) until it manages to complete the work? It takes
 about 2 hours to do the table scan, and should get in about 10 hours of
 index work each round.

That is what I've had to recommend in extreme cases, with some success. 

For a non-FULL VACUUM, all of the database changes it does will be kept,
though that is not the only cost, as you indicate.  However, you're
right to question it since it does have some downsides like not
correctly updating statistics at the end of the run.

I wouldn't try this with VACUUM FULL. In that case, I'd VACUUM first,
then, when all dead rows are gone, go for the VACUUM FULL; but I would
rather find another way round it, like a CTAS (CREATE TABLE ... AS
SELECT).
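
Roughly, the CTAS route looks like this (all names are made up, the
swap needs an exclusive lock, and indexes, constraints, triggers and
grants must be recreated afterwards):

    BEGIN;
    LOCK TABLE bigtable IN ACCESS EXCLUSIVE MODE;
    CREATE TABLE bigtable_new AS SELECT * FROM bigtable; -- live rows only
    DROP TABLE bigtable;
    ALTER TABLE bigtable_new RENAME TO bigtable;
    COMMIT;
    CREATE INDEX bigtable_id_idx ON bigtable (id);       -- placeholder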

The problem is that VACUUM doesn't emit enough messages for you to know
when it gets to the end of each phase, so you've not much clue about how
much of that 12 hours would be wasted. Though as you say, it seems
likely that much of it is worthwhile in the situation you describe.

The tipping point is when VACUUM finds more dead rows than fit within
maintenance_work_mem / (size of row pointer). That's when we start to
do multiple passes over each of the indexes.
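
As a rough worked example, assuming the dead-tuple list costs about 6
bytes (one ItemPointerData) per dead row: with maintenance_work_mem at
1GB, that is room for roughly 1GB / 6 bytes, i.e. about 180 million
dead rows, before a second pass over the indexes becomes necessary.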

Maybe it would be good to have a VACUUM max-one-pass-only command, to
allow you to break big VACUUMs down into smaller chunks. Or perhaps we
should have a trace_vacuum command as well, to allow you to see where
to cancel it? (Put notices in lazy_vacuum_index and lazy_vacuum_heap.)

Hope that helps.

Best Regards, Simon Riggs





Re: [HACKERS] Vacuum and Transactions

2005-10-04 Thread Rod Taylor
  Is it reasonable to cancel and restart the vacuum process periodically
 
 No.
 
 How big is that table, anyway?  Are you trying a VACUUM FULL, or plain
 vacuum?

It's only about 60GB in size, but it appears to have been missed by
vacuum for nearly a month and probably has a large percentage of dead
material (30% or so).

I'm trying to run a plain vacuum and the Pg version is 8.0.3.

Building indexes on this structure also takes a significant amount of
time. I have maintenance_work_mem set to about 1GB in size.
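
For what it's worth, that doesn't have to be the server-wide setting;
it can be raised in just the session doing the vacuum. A sketch (on 8.0
the GUC takes plain kilobytes, and 'bigtable' is a placeholder):

    SET maintenance_work_mem = 1048576;  -- 1GB, value in kB
    VACUUM ANALYZE bigtable;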

The catch is that there are some other very active structures (like
pg_listener for Slony) which, after a couple of hours without
vacuuming, will quickly have the DB at an unreasonably high load (low
tens), which seems to all but halt the vacuum on the large structure.



Rightfully the table should be partitioned by time, but I haven't quite
figured out how to delete data from the old structure without increasing
the vacuum time required.

Another alternative I'm considering is to create new partial indexes
for the active structures, drop all of the old full-table indexes, run
vacuum on that, and then partition it.
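
Something along these lines, where every name and the predicate are
made up for illustration:

    -- index only the recent, active rows
    CREATE INDEX bigtable_recent_idx ON bigtable (id)
        WHERE created > '2005-09-01';
    -- drop the full-table indexes so vacuum has less index work to do
    DROP INDEX bigtable_id_idx;
    VACUUM ANALYZE bigtable;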




Re: [HACKERS] Vacuum and Transactions

2005-10-04 Thread Tom Lane
Rod Taylor [EMAIL PROTECTED] writes:
 As I understand it vacuum operates outside of the regular transaction

It's a perfectly normal transaction.

 Is it reasonable to cancel and restart the vacuum process periodically

No.

How big is that table, anyway?  Are you trying a VACUUM FULL, or plain
vacuum?

regards, tom lane



Re: [HACKERS] Vacuum and Transactions

2005-10-04 Thread Tom Lane
Rod Taylor [EMAIL PROTECTED] writes:
 The catch is that there are some other very active structures (like
 pg_listener for Slony) which after a couple of hours without vacuuming
 will quickly have the DB at an unreasonably high load (low tens) which
 seems to all but halt the vacuum on the large structure.

Yeah.  We desperately need to reimplement listen/notify :-( ... that
code was never designed to handle high event rates.

regards, tom lane



[HACKERS] Vacuum and Transactions

2005-10-03 Thread Rod Taylor
As I understand it, vacuum operates outside of the regular transaction
system, and if you stop it (SIGTERM, or pg_cancel_backend()) some of
the work it accomplished will be kept when it rolls back.
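
A sketch of doing that from SQL on 8.0 (current_query is only populated
when stats_command_string is enabled, and the pid below is a
placeholder):

    -- find the backend running the vacuum
    SELECT procpid, current_query
      FROM pg_stat_activity
     WHERE current_query ILIKE 'vacuum%';
    -- cancel its current query
    SELECT pg_cancel_backend(12345);  -- use the procpid found above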

For large structures with a ton of dead entries (of which I seem to
have a case), running vacuum takes long enough that high-churn
structures begin to experience difficulties.

Is it reasonable to cancel and restart the vacuum process periodically
(say every 12 hours) until it manages to complete the work? It takes
about 2 hours to do the table scan, and should get in about 10 hours of
index work each round.

The "vacuum ignores vacuum transaction" concept looks handy right now.


