Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated
On Mon, Jun 19, 2017 at 1:53 PM, Dmitry O Litvintsev wrote:
> yes, we had to restart the database 4 days ago (and vacuum resumed on start).
> I checked the log files and discovered that autovacuum on this table takes
>
> pages: 0 removed, 14072307 remain
> tuples: 43524292 removed, 395006545 remain
> buffer usage: -1493114028 hits, 107664973 misses, 30263658 dirtied
> avg read rate: 1.604 MB/s, avg write rate: 0.451 MB/s
> system usage: CPU 2055.81s/17710.94u sec elapsed 524356.57 sec
>
> 6 days. So it is perpetually being autovacuumed (which I assumed to be a good
> thing)
>
> Table has 400M entries, 115 GB.
>
> I will try your suggestions in the test environment.
>
> Thank you,
> Dmitry

Once you get this sorted, look into using the check_postgres.pl script and a monitoring solution like Zabbix or Nagios to monitor things like transactions until wraparound etc., so you don't wind up back here again. Best of luck.

Note that if you drop the vacuum delay to 0ms, the vacuum will probably complete in a few hours tops.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated
yes, we had to restart the database 4 days ago (and vacuum resumed on start). I checked the log files and discovered that autovacuum on this table takes

pages: 0 removed, 14072307 remain
tuples: 43524292 removed, 395006545 remain
buffer usage: -1493114028 hits, 107664973 misses, 30263658 dirtied
avg read rate: 1.604 MB/s, avg write rate: 0.451 MB/s
system usage: CPU 2055.81s/17710.94u sec elapsed 524356.57 sec

6 days. So it is perpetually being autovacuumed (which I assumed to be a good thing).

Table has 400M entries, 115 GB.

I will try your suggestions in the test environment.

Thank you,
Dmitry

From: Jeff Janes <jeff.ja...@gmail.com>
Sent: Monday, June 19, 2017 1:16 PM
To: Dmitry O Litvintsev
Cc: Andreas Kretschmer; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

On Mon, Jun 19, 2017 at 10:33 AM, Dmitry O Litvintsev <litvi...@fnal.gov> wrote:
> Hi
>
> Since I have posted this nothing has really changed. I am starting to panic
> (mildly).
>
> The source (production) runs:
>
> relname          | mode             | granted | substr                                                             | query_start                   | age
> t_inodes_iio_idx | RowExclusiveLock | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559

This is close to unreadable. You can use \x to get output from psql which survives email more readably.

Your first report was 6 days ago. Why is the job only 4 days old? Are you frequently restarting your production server, so that the vacuum job never gets a chance to finish? If so, that would explain your predicament. And how big is this table, that it takes at least 4 days to VACUUM?

> vacuum_cost_delay = 50ms

That is a lot. The default value for this is 0. The default value for autovacuum_vacuum_cost_delay is 20ms, which is usually too high for giant databases. I think you are changing this in the wrong direction.
Rather than increase vacuum_cost_delay, you need to decrease autovacuum_vacuum_cost_delay, so that you won't keep having problems in the future.

On your test server, change vacuum_cost_delay to zero and then initiate a manual vacuum of the table. It will block on the autovacuum's lock, so then kill the autovacuum (best to have the manual vacuum queued up first, otherwise it will be a race between when you start the manual vacuum and when the autovacuum automatically restarts, to see who gets the lock). See how long it takes this unthrottled vacuum to run, and how much effect the IO it causes has on the performance of other tasks. If acceptable, repeat this on production (although really, I don't think you have much of a choice on whether the effect is acceptable or not--it needs to be done).

Cheers,
Jeff
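For what it's worth, the procedure Jeff describes could be sketched in psql roughly as follows (a sketch only; the table name comes from the thread, session details are illustrative):

```sql
-- Session 1: queue an unthrottled manual vacuum; it will wait on the
-- autovacuum's ShareUpdateExclusiveLock.
SET vacuum_cost_delay = 0;
VACUUM VERBOSE public.t_inodes;

-- Session 2: terminate the running autovacuum worker so session 1
-- acquires the lock and proceeds at full speed.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE query LIKE 'autovacuum: VACUUM%t_inodes%';
```

Queuing the manual vacuum first, as Jeff notes, avoids racing against the autovacuum launcher restarting the worker.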
Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated
On Mon, 19 Jun 2017 17:33:23 +0000, Dmitry O Litvintsev wrote:
> The test stand where I was to test the schema upgrade is stuck cuz vacuum is
> blocking.

If you're in "panic mode" I would recommend cancelling the existing vacuum, running your upgrades, then immediately running VACUUM FREEZE ANALYZE on that table to make up for cancelling the autovacuum. Note that the manual VACUUM may take quite a while, so run it in a screen session or something.

Additionally, autovacuum is going to try to relaunch that vacuum pretty aggressively, so you might have to cancel it a few times (using pg_terminate_backend()) before your other processes are able to sneak in ahead of it.

Once you're out of panic mode you can take some time to breathe and consider your options for reducing the issue in the future. I have to 2nd Alvaro's comments about the cost delay. Personally, I'd recommend setting vacuum_cost_delay to zero, unless your hardware is crap. In my recent experience, allowing vacuum to run full-bore is less intrusive on a busy database with good hardware than forcing it to take its time. Unfortunately, changing it now isn't going to speed the current vacuum up any.

Another comment: schema changes almost always need exclusive locks on the tables they're modifying. As a result, you really need to plan them out a bit. Anything can block a schema update, even a simple SELECT statement; so it's important to check the health of things before starting. While it's not _generally_ a good practice to interrupt autovacuum, it _can_ be done if the schema upgrade is necessary. Keep in mind that it's just going to start back up again, but hopefully your schema update will be done by then and it can do its work without interfering with things.
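Bill's "panic mode" sequence might look like this in practice (a hedged sketch; the table name comes from the thread, everything else is illustrative):

```sql
-- Cancel the autovacuum worker. It will relaunch aggressively, so this
-- step may need repeating until the schema upgrade sneaks in ahead of it.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%t_inodes%';

-- ... run the schema upgrade here ...

-- Then make up for the cancelled anti-wraparound vacuum (this may take a
-- long time; run it inside screen or similar).
VACUUM FREEZE ANALYZE public.t_inodes;
```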
Another thing you can do is to monitor the transaction ID values (the Nagios check_postgres has a nice mode for monitoring this) and manually launch a VACUUM FREEZE ahead of autovacuum, so that _you_ can pick the time for it to run and not have it happen to crop up at the worst possible time ;)

You might also find that things are easier to deal with if you tweak the autovacuum settings on this table to cause it to be vacuumed more frequently. In my experience, more frequent vacuums that do less work each time often lead to happier databases. See ALTER TABLE and the available settings to tweak autovacuum behavior.

--
Bill Moran
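The per-table autovacuum tweaking Bill mentions is done through ALTER TABLE storage parameters. The values below are purely illustrative, not a recommendation from the thread:

```sql
-- Illustrative thresholds: vacuum this table after ~1% of its rows change
-- (the global default scale factor is 0.2, i.e. 20%), and let its
-- autovacuum run unthrottled.
ALTER TABLE public.t_inodes SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_cost_delay   = 0
);
```

More frequent, cheaper vacuums also keep the table's relfrozenxid advancing, which reduces the chance of an anti-wraparound vacuum landing at a bad time.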
Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated
On Mon, Jun 19, 2017 at 10:33 AM, Dmitry O Litvintsev wrote:
> Hi
>
> Since I have posted this nothing has really changed. I am starting to panic
> (mildly).
>
> The source (production) runs:
>
> relname          | mode             | granted | substr                                                             | query_start                   | age
> t_inodes_iio_idx | RowExclusiveLock | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559

This is close to unreadable. You can use \x to get output from psql which survives email more readably.

Your first report was 6 days ago. Why is the job only 4 days old? Are you frequently restarting your production server, so that the vacuum job never gets a chance to finish? If so, that would explain your predicament. And how big is this table, that it takes at least 4 days to VACUUM?

> vacuum_cost_delay = 50ms

That is a lot. The default value for this is 0. The default value for autovacuum_vacuum_cost_delay is 20ms, which is usually too high for giant databases. I think you are changing this in the wrong direction.

Rather than increase vacuum_cost_delay, you need to decrease autovacuum_vacuum_cost_delay, so that you won't keep having problems in the future.

On your test server, change vacuum_cost_delay to zero and then initiate a manual vacuum of the table. It will block on the autovacuum's lock, so then kill the autovacuum (best to have the manual vacuum queued up first, otherwise it will be a race between when you start the manual vacuum and when the autovacuum automatically restarts, to see who gets the lock). See how long it takes this unthrottled vacuum to run, and how much effect the IO it causes has on the performance of other tasks. If acceptable, repeat this on production (although really, I don't think you have much of a choice on whether the effect is acceptable or not--it needs to be done).

Cheers,
Jeff
Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated
Dmitry O Litvintsev wrote:
> Hi
>
> Since I have posted this nothing really changed. I am starting to panic
> (mildly).
...
> vacuum_cost_delay = 50ms

Most likely, this value is far too high. You're causing autovacuum to sleep for a very long time with this setting. Hard to say for certain without seeing the cost_limit value and the other related parameters, but it's most certainly what's causing you pain. The default of 20ms is already too high for most users.

--
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated
Hi

Since I have posted this nothing really changed. I am starting to panic (mildly).

The source (production) runs:

relname             | mode                     | granted | substr                                                             | query_start                   | age
--------------------+--------------------------+---------+--------------------------------------------------------------------+-------------------------------+------------------------
t_inodes_iio_idx    | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
t_inodes_pkey       | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
                    | ExclusiveLock            | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
t_inodes            | ShareUpdateExclusiveLock | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
t_inodes_itype_idx  | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
t_inodes_imtime_idx | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559

The above does not impact production activity a lot.
On the test stand (where I pg_basebackupped from production and also upgraded to 9.6) I see:

relname             | mode                     | granted | substr                                                      | query_start                   | age
--------------------+--------------------------+---------+-------------------------------------------------------------+-------------------------------+------------------------
t_inodes            | ShareUpdateExclusiveLock | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
t_inodes_itype_idx  | RowExclusiveLock         | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
t_inodes_imtime_idx | RowExclusiveLock         | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
t_inodes_iio_idx    | RowExclusiveLock         | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
t_inodes_pkey       | RowExclusiveLock         | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
                    | ExclusiveLock            | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
t_inodes            | ShareUpdateExclusiveLock | f       | ANALYZE;                                                    | 2017-06-13 15:27:59.781285-05 | 5 days 20:59:17.860273
                    | ExclusiveLock            | t       | ANALYZE;                                                    | 2017-06-13 15:27:59.781285-05 | 5 days 20:59:17.860273

The test stand where I was to test the schema upgrade is stuck because vacuum is blocking.
Production settings follow:

version 9.3.9
max_connections = 512
shared_buffers = 8192MB
temp_buffers = 1024MB
work_mem = 512MB
#maintenance_work_mem = 2048MB
maintenance_work_mem = 4096MB  # increased after 3 days of vacuum analyze running
max_stack_depth = 2MB
vacuum_cost_delay = 50ms
synchronous_commit = off
wal_buffers = 245MB
wal_writer_delay = 10s
checkpoint_segments = 64
checkpoint_completion_target = 0.9
random_page_cost = 2.0
effective_cache_size = 94GB
wal_level = hot_standby
hot_standby = on
archive_mode = on
archive_command = '/usr/loca/bin/wal_backup.sh %p %f'
max_wal_senders = 4
wal_keep_segments = 1024
max_standby_streaming_delay = 7200s

So, the problem: I cannot do the schema change until vacuum has finished, and there seems to be no end in sight for vacuum to finish, throwing off our software upgrade plans.

Anything can be done here?

Thanks,
Dmitry

From: Andreas Kretschmer <andr...@a-kretschmer.de>
Sent: Tuesday, June 13, 2017 1:54 PM
To: pgsql-general@postgresql.org; Dmitry O Litvintsev; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum holds exclusive lock on
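As an aside, how far a database is from the anti-wraparound threshold can be checked directly from the catalogs, which helps schedule upgrades before an anti-wraparound vacuum can get in the way (a standard monitoring query against pg_database; autovacuum_freeze_max_age defaults to 200 million):

```sql
-- Distance of each database from forced anti-wraparound vacuuming.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```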
Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated
On 13 June 2017 20:04:04 MESZ, Dmitry O Litvintsev wrote:
> chimera | t_inodes | ShareUpdateExclusiveLock | t | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound) | 2017-06-13 12:31:04.870064-05 | 00:28:50.276437 | 40672

It is an autovacuum to prevent wraparound; you can't stop or avoid that.

Regards,
Andreas

--
This message was sent from my Android mobile phone with K-9 Mail.
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Thu, Feb 23, 2017 at 2:42 AM, Bill Moran wrote:
> On Wed, 22 Feb 2017 13:19:11 -0800, Jeff Janes wrote:
>> On Mon, Feb 20, 2017 at 5:40 PM, Merlin Moncure wrote:
>>> On Thursday, February 16, 2017, Tom Lane wrote:
>>>> Tim Bellis writes:
>>>>> Even though this is a read only query, is it also expected to be
>>>>> blocked behind the vacuum? Is there a way of getting indexes for a table
>>>>> which won't be blocked behind a vacuum?
>>>>
>>>> It's not the vacuum that's blocking your read-only queries. It's the
>>>> ALTER TABLE, which needs an exclusive lock in order to alter the table's
>>>> schema. The ALTER is queued waiting for the vacuum to finish, and lesser
>>>> lock requests queue up behind it. We could let the non-exclusive lock
>>>> requests go ahead of the ALTER, but that would create a severe risk of the
>>>> ALTER *never* getting to run.
>>>>
>>>> I'd kill the ALTER and figure on trying again after the vacuum is done.
>>>
>>> I've been drilled by this and similar lock stacking issues enough times to
>>> make me near 100% sure deferring the ALTER would be the better choice
>>
>> This seems like a rather one-sided observation. How could you know how
>> often the unimplemented behavior also would have "drilled" you, since it is
>> unimplemented?
>>
>> There was a proposal for a "LOCK TABLE...DEFERRABLE" which would allow other
>> requestors to jump the queue if they were compatible with the held lock. If
>> that is implemented, then you would just manually lock the table deferrably
>> before invoking the ALTER TABLE command, if that is the behavior you wanted
>> (but it wouldn't work for things that can't be run in transactions)
>
> This seems redundant to me.
>
> We already have LOCK ... NOWAIT.
> It's fairly trivial to write an upgrade
> script that uses LOCK to explicitly lock tables that it's going to ALTER,
> then busy-waits if the lock is not immediately grantable.

As fairly trivial as it is, I bet I would mess it up a few times before I got it right. And then it would probably still be wrong in corner cases. What if it fails not because the lock is unavailable, but for some obscure error other than the ones anticipated or encountered during testing? And busy-waiting is generally nasty and a waste of resources.

> The fact that so many ORMs and similar tools don't take advantage of that
> functionality is rather depressing.
>
> In my experience, I've also seen heavily loaded systems that this wouldn't
> work on, essentially because there is _always_ _some_ lock on every table.
> This is a case where experienced developers are required to take some
> extra time to coordinate their upgrades to work around the high load. But
> the proposed LOCK TABLE ... DEFERRABLE wouldn't help there either, because
> the ALTER would be deferred indefinitely.

True. That is why it would be available only upon request, not the new default.

> Personally, I feel like the existing behavior is preferable. Software
> teams need to take the time to understand the locking implications of their
> actions or they'll have nothing but trouble anyway.
>
> As I've seen time and again: writing an application that handles low load
> and low concurrency is fairly trivial, but scaling that app up to high
> load and/or high concurrency generally sorts out the truly brilliant
> developers from the merely average.

So why not give the merely average some better tools?

Cheers,
Jeff
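One attempt of the busy-wait pattern under discussion could look like this (a sketch; the table and the DDL are placeholders, and an outer script would retry on failure):

```sql
BEGIN;
-- Errors immediately if the lock is not grantable, instead of queuing
-- behind the vacuum and blocking everyone else.
LOCK TABLE yourtable IN ACCESS EXCLUSIVE MODE NOWAIT;
ALTER TABLE yourtable ADD COLUMN newcol integer;  -- hypothetical DDL
COMMIT;
```

An alternative that avoids hand-rolled busy-waiting is `SET lock_timeout = '2s';` before the ALTER (available since 9.3), which makes the lock request itself fail fast rather than queue indefinitely.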
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Wed, Feb 22, 2017 at 3:19 PM, Jeff Janes wrote:
> On Mon, Feb 20, 2017 at 5:40 PM, Merlin Moncure wrote:
>> On Thursday, February 16, 2017, Tom Lane wrote:
>>> Tim Bellis writes:
>>>> Even though this is a read only query, is it also expected to be
>>>> blocked behind the vacuum? Is there a way of getting indexes for a table
>>>> which won't be blocked behind a vacuum?
>>>
>>> It's not the vacuum that's blocking your read-only queries. It's the
>>> ALTER TABLE, which needs an exclusive lock in order to alter the table's
>>> schema. The ALTER is queued waiting for the vacuum to finish, and lesser
>>> lock requests queue up behind it. We could let the non-exclusive lock
>>> requests go ahead of the ALTER, but that would create a severe risk of the
>>> ALTER *never* getting to run.
>>>
>>> I'd kill the ALTER and figure on trying again after the vacuum is done.
>>
>> I've been drilled by this and similar lock stacking issues enough times to
>> make me near 100% sure deferring the ALTER would be the better choice
>
> This seems like a rather one-sided observation. How could you know how
> often the unimplemented behavior also would have "drilled" you, since it is
> unimplemented?

Well, that I can't really say, but at least in my case ALTER TABLE in the face of concurrent application activity can wait, but locking tables for reading for an indefinite period will rapidly destabilize the system. An example of this usage is replacing partitions on a warehouse table.

About half of my P1s over the last 12 months are related to a locking problem of some kind. So I end up doing workarounds such as issuing "LOCK...NOWAIT" in a sleep loop :( or application restructuring, especially minimizing use of TRUNCATE.

I do think instrumentation around locking behaviors would be helpful. Allowing (optionally) waiters to leapfrog in if they can clear would be wonderful, as would being able to specify maximum wait timeouts inside a transaction.
FWIW, I'm not sure this behavior makes sense attached to LOCK; I'd rather see it attached generally to SET TRANSACTION -- my 0.02$ (talk is cheap, etc).

merlin
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Wed, 22 Feb 2017 13:19:11 -0800, Jeff Janes wrote:
> On Mon, Feb 20, 2017 at 5:40 PM, Merlin Moncure wrote:
>> On Thursday, February 16, 2017, Tom Lane wrote:
>>> Tim Bellis writes:
>>>> Even though this is a read only query, is it also expected to be
>>>> blocked behind the vacuum? Is there a way of getting indexes for a table
>>>> which won't be blocked behind a vacuum?
>>>
>>> It's not the vacuum that's blocking your read-only queries. It's the
>>> ALTER TABLE, which needs an exclusive lock in order to alter the table's
>>> schema. The ALTER is queued waiting for the vacuum to finish, and lesser
>>> lock requests queue up behind it. We could let the non-exclusive lock
>>> requests go ahead of the ALTER, but that would create a severe risk of the
>>> ALTER *never* getting to run.
>>>
>>> I'd kill the ALTER and figure on trying again after the vacuum is done.
>>
>> I've been drilled by this and similar lock stacking issues enough times to
>> make me near 100% sure deferring the ALTER would be the better choice
>
> This seems like a rather one-sided observation. How could you know how
> often the unimplemented behavior also would have "drilled" you, since it is
> unimplemented?
>
> There was a proposal for a "LOCK TABLE...DEFERRABLE" which would allow other
> requestors to jump the queue if they were compatible with the held lock. If
> that is implemented, then you would just manually lock the table deferrably
> before invoking the ALTER TABLE command, if that is the behavior you wanted
> (but it wouldn't work for things that can't be run in transactions)

This seems redundant to me.

We already have LOCK ... NOWAIT. It's fairly trivial to write an upgrade script that uses LOCK to explicitly lock tables that it's going to ALTER, then busy-waits if the lock is not immediately grantable.

The fact that so many ORMs and similar tools don't take advantage of that functionality is rather depressing.
In my experience, I've also seen heavily loaded systems that this wouldn't work on, essentially because there is _always_ _some_ lock on every table. This is a case where experienced developers are required to take some extra time to coordinate their upgrades to work around the high load. But the proposed LOCK TABLE ... DEFERRABLE wouldn't help there either, because the ALTER would be deferred indefinitely.

Personally, I feel like the existing behavior is preferable. Software teams need to take the time to understand the locking implications of their actions or they'll have nothing but trouble anyway.

As I've seen time and again: writing an application that handles low load and low concurrency is fairly trivial, but scaling that app up to high load and/or high concurrency generally sorts out the truly brilliant developers from the merely average.

--
Bill Moran
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Fri, Feb 17, 2017 at 10:36 AM, Tim Bellis <tim.bel...@metaswitch.com> wrote:

> From: Jeff Janes [mailto:jeff.ja...@gmail.com]
> Sent: 17 February 2017 02:59
> To: Tim Bellis <tim.bel...@metaswitch.com>
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Autovacuum stuck for hours, blocking queries
>
>> On Wed, Feb 15, 2017 at 9:30 AM, Tim Bellis <tim.bel...@metaswitch.com> wrote:
>>> I have a postgres 9.3.4 database table which (intermittently but reliably)
>>> gets into a state where queries get blocked indefinitely (at least for many
>>> hours) behind an automatic vacuum. I was under the impression that vacuum
>>> should never take any blocking locks for any significant period of time,
>>> and so would like help resolving the issue.
>>>
>>> The process blocking the query is:
>>> postgres 21985 11304 98 Feb13 ? 1-14:20:52 postgres: autovacuum worker process
>>> which is running the query
>>> autovacuum: VACUUM public.
>>
>> Are you sure it doesn't really say:
>>
>> autovacuum: VACUUM public. (to prevent wraparound)
>
> [Tim Bellis] It doesn't. I was using the query from
> https://wiki.postgresql.org/wiki/Lock_Monitoring and looking at the
> 'current_statement_in_blocking_process' column. Is there a different query
> I should be using?

That query seems to be a bit mangled. At one time, it only found row-level locks. Someone changed that, but didn't remove the comment "these only find row-level locks, not object-level locks".

Also, the "WHERE NOT blocked_locks.GRANTED" should perhaps be:

WHERE NOT blocked_locks.GRANTED AND blocking_locks.GRANTED;

As it is, every waiting query reports that it is waiting on all of its fellow victims as well as the thing(s) actually blocking it.
But my WHERE clause is not really correct either, as it is possible that one blocked thing is being blocked by a different blocked thing which is ahead of it in the queue, when without that intervening blocked requestor it could be immediately granted if its request mode is compatible with the held mode(s). I don't think there is a query that can reveal what is most immediately blocking it.

But I don't see how this explains what you see. An autovacuum without "(to prevent wraparound)" should not block anything for much more than a second (unless you changed deadlock_timeout), and should not be blocked by anything either, as it just gives up on the operation if the lock is not immediately available.

Cheers,
Jeff
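An abridged version of the wiki-style lock-monitoring join with the extra condition Jeff suggests might look like this (a sketch only; the wiki query joins on several more pg_locks columns such as transactionid, page, and tuple, omitted here for brevity):

```sql
-- Which waiting backend is blocked by which lock-holding backend.
SELECT blocked.pid  AS blocked_pid,
       blocking.pid AS blocking_pid
FROM pg_locks blocked
JOIN pg_locks blocking
  ON  blocking.locktype IS NOT DISTINCT FROM blocked.locktype
  AND blocking.database IS NOT DISTINCT FROM blocked.database
  AND blocking.relation IS NOT DISTINCT FROM blocked.relation
  AND blocking.pid <> blocked.pid
WHERE NOT blocked.granted
  AND blocking.granted;  -- only report holders, not fellow waiters
```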
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Mon, Feb 20, 2017 at 5:40 PM, Merlin Moncure wrote:
> On Thursday, February 16, 2017, Tom Lane wrote:
>> Tim Bellis writes:
>>> Even though this is a read only query, is it also expected to be
>>> blocked behind the vacuum? Is there a way of getting indexes for a table
>>> which won't be blocked behind a vacuum?
>>
>> It's not the vacuum that's blocking your read-only queries. It's the
>> ALTER TABLE, which needs an exclusive lock in order to alter the table's
>> schema. The ALTER is queued waiting for the vacuum to finish, and lesser
>> lock requests queue up behind it. We could let the non-exclusive lock
>> requests go ahead of the ALTER, but that would create a severe risk of the
>> ALTER *never* getting to run.
>>
>> I'd kill the ALTER and figure on trying again after the vacuum is done.
>
> I've been drilled by this and similar lock stacking issues enough times to
> make me near 100% sure deferring the ALTER would be the better choice

This seems like a rather one-sided observation. How could you know how often the unimplemented behavior also would have "drilled" you, since it is unimplemented?

There was a proposal for a "LOCK TABLE...DEFERRABLE" which would allow other requestors to jump the queue if they were compatible with the held lock. If that is implemented, then you would just manually lock the table deferrably before invoking the ALTER TABLE command, if that is the behavior you wanted (but it wouldn't work for things that can't be run in transactions).

Ideally each requestor would specify if they will hold the lock for a long time or a short time. Then a short requestor which is blocked behind a long requestor could let other compatible-with-held requests jump over it. But once it was only blocked by short locks, it would reassert the normal order, so it can't get permanently blocked by a constantly overlapping stream of short locks. But how would you get all lock requestors to provide a reasonable estimate?

Cheers,
Jeff
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
Sent: 16 February 2017 22:40
To: Tim Bellis <tim.bel...@metaswitch.com>
Cc: Adrian Klaver <adrian.kla...@aklaver.com>; pgsql-general@postgresql.org; Alvaro Herrera <alvhe...@2ndquadrant.com>; Scott Marlowe <scott.marl...@gmail.com>
Subject: Re: [GENERAL] Autovacuum stuck for hours, blocking queries

Tim Bellis <tim.bel...@metaswitch.com> writes:
>> Even though this is a read only query, is it also expected to be blocked
>> behind the vacuum? Is there a way of getting indexes for a table which
>> won't be blocked behind a vacuum?

> It's not the vacuum that's blocking your read-only queries. It's the ALTER
> TABLE, which needs an exclusive lock in order to alter the table's schema.
> The ALTER is queued waiting for the vacuum to finish, and lesser lock
> requests queue up behind it. We could let the non-exclusive lock requests go
> ahead of the ALTER, but that would create a severe risk of the ALTER *never*
> getting to run.

The lock monitoring query (https://wiki.postgresql.org/wiki/Lock_Monitoring) said that the blocking_pid and the current_statement_in_blocking_process for the queries reading the index data was the autovacuum, not the ALTER. Am I reading the output wrong? Does it not correctly represent the chain of locks?

> I'd kill the ALTER and figure on trying again after the vacuum is done.
> Also you might want to look into how you got into a situation where you have
> an anti-wraparound vacuum that's taking so long to run.
> You didn't do something silly like disable autovacuum did you?

No, autovacuum is on (and this is an autovacuum which is in progress). But I will look at why I'm getting a blocking autovacuum.

> regards, tom lane
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Thursday, February 16, 2017, Tom Lane wrote:

> Tim Bellis writes:
>> Even though this is a read only query, is it also expected to be blocked
>> behind the vacuum? Is there a way of getting indexes for a table which
>> won't be blocked behind a vacuum?
>
> It's not the vacuum that's blocking your read-only queries. It's the
> ALTER TABLE, which needs an exclusive lock in order to alter the table's
> schema. The ALTER is queued waiting for the vacuum to finish, and lesser
> lock requests queue up behind it. We could let the non-exclusive lock
> requests go ahead of the ALTER, but that would create a severe risk of the
> ALTER *never* getting to run.
>
> I'd kill the ALTER and figure on trying again after the vacuum is done.

I've been drilled by this and similar lock stacking issues enough times to make me near 100% sure deferring the ALTER would be the better choice.

merlin
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On 02/17/2017 11:54 PM, Michael Paquier wrote:
> On Sat, Feb 18, 2017 at 1:32 PM, Scott Marlowe wrote:
>> Yes it can. Truncate has been rollbackable for a while now.
>
> Per the docs:
> https://www.postgresql.org/docs/9.6/static/sql-truncate.html
> "TRUNCATE is transaction-safe with respect to the data in the tables:
> the truncation will be safely rolled back if the surrounding transaction
> does not commit."
>
> In short, yes, a transaction doing a truncate can be rolled back.

I think the part that confuses people into thinking it cannot be rolled back is this:

"TRUNCATE is not MVCC-safe. After truncation, the table will appear empty to concurrent transactions, if they are using a snapshot taken before the truncation occurred. See Section 13.5 for more details."

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Sat, Feb 18, 2017 at 1:32 PM, Scott Marlowe wrote:
> Yes it can. Truncate has been rollbackable for a while now.

Per the docs:
https://www.postgresql.org/docs/9.6/static/sql-truncate.html
"TRUNCATE is transaction-safe with respect to the data in the tables: the truncation will be safely rolled back if the surrounding transaction does not commit."

In short, yes, a transaction doing a truncate can be rolled back.

--
Michael
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Fri, Feb 17, 2017 at 1:38 PM, Rakesh Kumar wrote: > LOCK TABLE yourtable ; > CREATE TEMPORARY TABLE keep AS SELECT * FROM yourtable WHERE ; > TRUNCATE yourtable; > INSERT INTO yourtable SELECT * from keep; > COMMIT; > === > the above snippet assumes truncate in PG can be in a transaction. In other > words, while truncate by itself > is atomic, it can't be rolled back. So in the above case, if "INSERT INTO > yourtable SELECT * from keep;" fails and > we roll back, will it roll back yourtable? Yes it can. Truncate has been rollbackable for a while now. begin; create table ...; insert into table ...; truncate old table; ... something goes wrong ... rollback; Unless I misunderstand your meaning.
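A minimal sketch of the point being made above, that TRUNCATE is undone by ROLLBACK (the table name is a placeholder):

```sql
BEGIN;
TRUNCATE yourtable;              -- table now appears empty to this transaction
SELECT count(*) FROM yourtable;  -- 0 inside the transaction
ROLLBACK;                        -- the truncation is undone
SELECT count(*) FROM yourtable;  -- the original rows are visible again
```

Note the MVCC caveat quoted from the docs still applies: concurrent transactions using an older snapshot may see the table as empty while the truncating transaction is in progress.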
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
LOCK TABLE yourtable ; CREATE TEMPORARY TABLE keep AS SELECT * FROM yourtable WHERE ; TRUNCATE yourtable; INSERT INTO yourtable SELECT * from keep; COMMIT; === the above snippet assumes truncate in PG can be in a transaction. In other words, while truncate by itself is atomic, it can't be rolled back. So in the above case, if "INSERT INTO yourtable SELECT * from keep;" fails and we roll back, will it roll back yourtable?
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
Hi Tim, Am 2017-02-17 um 17:02 schrieb Tim Bellis: > The DELETE operation only deletes rows from the > previous day. It's possible that there have been rows > added that day which ought not to be deleted, so > TRUNCATE wouldn't work. OK, then I'll try two other suggestions: - use table partitioning ( https://www.postgresql.org/docs/9.3/static/ddl-partitioning.html ) - if the number of rows you need to keep is small, you could try something like this: LOCK TABLE yourtable ; CREATE TEMPORARY TABLE keep AS SELECT * FROM yourtable WHERE ; TRUNCATE yourtable; INSERT INTO yourtable SELECT * from keep; COMMIT; Best regards, -hannes -Original Message- From: Hannes Erven [mailto:han...@erven.at] Sent: 17 February 2017 11:47 To: pgsql-general@postgresql.org Cc: Tim Bellis <tim.bel...@metaswitch.com> Subject: Re: [GENERAL] Autovacuum stuck for hours, blocking queries Hi Tim, Am 2017-02-15 um 18:30 schrieb Tim Bellis: > I have a postgres 9.3.4 database table which (intermittently but reliably) > gets into a state where queries get blocked indefinitely > [..] > Notes: > - This database table is used for about 6 million row writes per day, > all of which are then deleted at the end of the day. If you are dumping the contents of the table anyways, why not use TRUNCATE instead of DELETE? It unlinks and recreates the table data files, requiring nearly zero IO and analyzing. Or even drop, and recreate the table with the correct new structure so you do not even need to ALTER TABLE? I'm a bit skeptical of these suggestions since very competent people have already answered your post and did not come up with this... ;-) the only drawback I'm aware of is that TRUNCATE will immediately free disk space on the OS level, so the table's space will not be "blocked". But probably the VACUUM you are currently performing will also eventually release the unused disk space, so this may or may not match the current behaviour. 
Best regards, -hannes
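Hannes's keep-and-truncate snippet, written out in full. The WHERE condition was elided in the original, so the `inserted_at` column and retention condition here are hypothetical placeholders; substitute whatever identifies the rows that must survive:

```sql
BEGIN;
-- block all other access while the table is swapped out
LOCK TABLE yourtable IN ACCESS EXCLUSIVE MODE;
-- stash the rows that must survive (condition is a placeholder)
CREATE TEMPORARY TABLE keep AS
    SELECT * FROM yourtable
    WHERE inserted_at >= date_trunc('day', now());
TRUNCATE yourtable;               -- unlinks the data files, near-zero IO
INSERT INTO yourtable SELECT * FROM keep;
COMMIT;
```

The explicit LOCK matters: without it, a concurrent writer could insert between the CREATE TEMPORARY TABLE and the TRUNCATE, and those rows would be lost.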
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
From: Jeff Janes [mailto:jeff.ja...@gmail.com] Sent: 17 February 2017 02:59 To: Tim Bellis <tim.bel...@metaswitch.com> Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Autovacuum stuck for hours, blocking queries On Wed, Feb 15, 2017 at 9:30 AM, Tim Bellis <tim.bel...@metaswitch.com<mailto:tim.bel...@metaswitch.com>> wrote: I have a postgres 9.3.4 database table which (intermittently but reliably) gets into a state where queries get blocked indefinitely (at least for many hours) behind an automatic vacuum. I was under the impression that vacuum should never take any blocking locks for any significant period of time, and so would like help resolving the issue. The process blocking the query is: postgres 21985 11304 98 Feb13 ? 1-14:20:52 postgres: autovacuum worker process which is running the query autovacuum: VACUUM public. Are you sure it doesn't really say: autovacuum: VACUUM public. (to prevent wraparound) [Tim Bellis] It doesn’t. I was using the query from https://wiki.postgresql.org/wiki/Lock_Monitoring and looking at the ‘current_statement_in_blocking_process’ column. Is there a different query I should be using? If it doesn't include the "to prevent wraparound", then it should sacrifice itself as soon as it realizes it is blocking something else. If it is not doing that, something is wrong. If it does say "(to prevent wraparound)", then see all the other comments on this thread. Notes: - This database table is used for about 6 million row writes per day, all of which are then deleted at the end of the day. How many transactions do those 6 million writes comprise? [Tim Bellis] I’m not sure - I think it’s 6 million transactions, but there might be some batching going on that I’m unaware of. What would this affect? 
(I can dig in if necessary) (I might have been slightly wrong in characterising the exact behaviour; the table might be cleared every hour rather than every day, but there are still 6 million writes per day) - Our application kicks off a manual vacuum against this table each night which doesn't hit this problem, as far as we're aware. If it were having problems, would you be aware of it? Do you see in the log files the completion of the vacuum? Or look in pg_stat_user_tables to see when last_vacuum was. If it runs every night and succeeds, it is hard to see why wraparound would ever kick in. Unless you are hitting 150,000,000 transactions in a day. [Tim Bellis] I shall investigate this. Cheers, Jeff
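Jeff's suggestion to check pg_stat_user_tables for the last successful vacuum can be done with a query along these lines (the table name is a placeholder):

```sql
-- when was this table last vacuumed, and how much dead space has built up?
SELECT relname, last_vacuum, last_autovacuum, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'yourtable';
```

If `last_vacuum` is updated every night as expected, the nightly manual vacuum is completing; if it is days old, the job is failing or being cancelled.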
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
The DELETE operation only deletes rows from the previous day. It's possible that there have been rows added that day which ought not to be deleted, so TRUNCATE wouldn't work. But that was a helpful suggestion - thanks! Tim -Original Message- From: Hannes Erven [mailto:han...@erven.at] Sent: 17 February 2017 11:47 To: pgsql-general@postgresql.org Cc: Tim Bellis <tim.bel...@metaswitch.com> Subject: Re: [GENERAL] Autovacuum stuck for hours, blocking queries Hi Tim, Am 2017-02-15 um 18:30 schrieb Tim Bellis: > I have a postgres 9.3.4 database table which (intermittently but > reliably) > gets into a state where queries get blocked indefinitely > [..] > Notes: > - This database table is used for about 6 million row writes per > day, > all of which are then deleted at the end of the day. If you are dumping the contents of the table anyways, why not use TRUNCATE instead of DELETE? It unlinks and recreates the table data files, requiring nearly zero IO and analyzing. Or even drop, and recreate the table with the correct new structure so you do not even need to ALTER TABLE? I'm a bit skeptical of these suggestions since very competent people have already answered your post and did not come up with this... ;-) the only drawback I'm aware of is that TRUNCATE will immediately free disk space on the OS level, so the table's space will not be "blocked". But probably the VACUUM you are currently performing will also eventually release the unused disk space, so this may or may not match the current behaviour. Best regards, -hannes
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
Hi Tim, Am 2017-02-15 um 18:30 schrieb Tim Bellis: > I have a postgres 9.3.4 database table which (intermittently but reliably) > gets into a state where queries get blocked indefinitely > [..] > Notes: > - This database table is used for about 6 million row writes per day, > all of which are then deleted at the end of the day. If you are dumping the contents of the table anyways, why not use TRUNCATE instead of DELETE? It unlinks and recreates the table data files, requiring nearly zero IO and analyzing. Or even drop, and recreate the table with the correct new structure so you do not even need to ALTER TABLE? I'm a bit skeptical of these suggestions since very competent people have already answered your post and did not come up with this... ;-) the only drawback I'm aware of is that TRUNCATE will immediately free disk space on the OS level, so the table's space will not be "blocked". But probably the VACUUM you are currently performing will also eventually release the unused disk space, so this may or may not match the current behaviour. Best regards, -hannes
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
Tom Lane wrote: > Also you might want to look into how you got into a situation where > you have an anti-wraparound vacuum that's taking so long to run. If there are ALTERs running all the time, regular (non-anti-wraparound) vacuums would be canceled and never get a chance to run. Eventually, autovacuum decides it's had enough and doesn't cancel anymore, so everyone else gets stuck behind it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
Tim Bellis writes: > Even though this is a read only query, is it also expected to be blocked > behind the vacuum? Is there a way of getting indexes for a table which won't > be blocked behind a vacuum? It's not the vacuum that's blocking your read-only queries. It's the ALTER TABLE, which needs an exclusive lock in order to alter the table's schema. The ALTER is queued waiting for the vacuum to finish, and lesser lock requests queue up behind it. We could let the non-exclusive lock requests go ahead of the ALTER, but that would create a severe risk of the ALTER *never* getting to run. I'd kill the ALTER and figure on trying again after the vacuum is done. Also you might want to look into how you got into a situation where you have an anti-wraparound vacuum that's taking so long to run. You didn't do something silly like disable autovacuum did you? regards, tom lane
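To see how close tables are to the anti-wraparound threshold Tom mentions, one can compare each table's relfrozenxid age against autovacuum_freeze_max_age; a sketch:

```sql
-- tables closest to forcing an anti-wraparound autovacuum
SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY age(c.relfrozenxid) DESC
LIMIT 10;
```

When `xid_age` approaches `freeze_max_age` (200 million by default), autovacuum launches the non-cancellable "to prevent wraparound" vacuum described elsewhere in this thread.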
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On 02/16/2017 08:45 AM, Tim Bellis wrote: Thank you all - that's really useful :-) The other query that gets blocked behind the vacuum is the below (truncated). This query is generated by jdbc in this method: org.postgresql.jdbc2.AbstractJdbc2DatabaseMetaData.getIndexInfo(AbstractJdbc2DatabaseMetaData.java:4023) Is JDBC doing anything else before issuing this? Even though this is a read only query, is it also expected to be blocked behind the vacuum? Is there a way of getting indexes for a table which won't be blocked behind a vacuum? Table 13.2 here: https://www.postgresql.org/docs/9.5/static/explicit-locking.html shows the conflicts with SHARE UPDATE EXCLUSIVE (vacuum). pg_locks: https://www.postgresql.org/docs/9.5/static/view-pg-locks.html shows locks being held. So next time it happens I would take a look and see if you can work backwards from there. You could directly access the index information using: https://www.postgresql.org/docs/9.6/static/catalog-pg-index.html https://www.postgresql.org/docs/9.6/static/catalog-pg-class.html Thank you all again, Tim SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, ct.relname AS TABLE_NAME, NOT i.indisunique AS NON_UNIQUE, NULL AS INDEX_QUALIFIER, ci.relname AS INDEX_NAME, CASE i.indisclustered WHEN true THEN 1 ELSE CASE am.amname WHEN 'hash' THEN 2 ELSE 3 END END AS TYPE, (i.keys).n AS ORDINAL_POSITION, pg_catalog.pg_get_indexdef(ci.oid, (i.keys).n, false) AS COLUMN_NAME, CASE am.amcanorder WHEN true THEN CASE i.indoption[(i.keys).n - 1] & 1 WHEN 1 THEN 'D' ELSE 'A' END ELSE NULL END AS ASC_OR_DESC, ci.reltuples AS CARDINALITY, ci.relpages AS PAGES, pg_catalog.pg_get_expr(i.indpred, i.indrelid) AS FILTER_CONDITION FROM pg_catalog.pg_class ct JOIN pg_catalog.pg_namespace n ON (ct.relnamespace = n.oid) JOIN (SELECT i.indexrelid, i.indrelid, i.indoption, i.indisunique, i.indisclustered, i.indpred, i.indexprs, information_schema._pg_expandarray(i.indkey) AS keys FROM pg_catalog.pg_index i) i ON (ct.oid = i.ind This query is cut off so cannot say whether it is the issue or not. -- Adrian Klaver adrian.kla...@aklaver.com
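Adrian's pointer to pg_index/pg_class can be used directly. A much simpler catalog query for a table's indexes, which reads only the system catalogs and so does not queue behind an ACCESS EXCLUSIVE lock on the user table itself (the table name is a placeholder):

```sql
-- list a table's indexes straight from the catalogs
SELECT i.indexrelid::regclass AS index_name,
       i.indisunique,
       pg_get_indexdef(i.indexrelid) AS definition
FROM pg_index i
WHERE i.indrelid = 'public.yourtable'::regclass;
```

This returns less metadata than the JDBC getIndexInfo() query, but is often enough when the goal is just to enumerate index names and definitions.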
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
Thank you all - that's really useful :-) The other query that gets blocked behind the vacuum is the below (truncated). This query is generated by jdbc in this method: org.postgresql.jdbc2.AbstractJdbc2DatabaseMetaData.getIndexInfo(AbstractJdbc2DatabaseMetaData.java:4023) Even though this is a read only query, is it also expected to be blocked behind the vacuum? Is there a way of getting indexes for a table which won't be blocked behind a vacuum? Thank you all again, Tim SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, ct.relname AS TABLE_NAME, NOT i.indisunique AS NON_UNIQUE, NULL AS INDEX_QUALIFIER, ci.relname AS INDEX_NAME, CASE i.indisclustered WHEN true THEN 1 ELSE CASE am.amname WHEN 'hash' THEN 2 ELSE 3 END END AS TYPE, (i.keys).n AS ORDINAL_POSITION, pg_catalog.pg_get_indexdef(ci.oid, (i.keys).n, false) AS COLUMN_NAME, CASE am.amcanorder WHEN true THEN CASE i.indoption[(i.keys).n - 1] & 1 WHEN 1 THEN 'D' ELSE 'A' END ELSE NULL END AS ASC_OR_DESC, ci.reltuples AS CARDINALITY, ci.relpages AS PAGES, pg_catalog.pg_get_expr(i.indpred, i.indrelid) AS FILTER_CONDITION FROM pg_catalog.pg_class ct JOIN pg_catalog.pg_namespace n ON (ct.relnamespace = n.oid) JOIN (SELECT i.indexrelid, i.indrelid, i.indoption, i.indisunique, i.indisclustered, i.indpred, i.indexprs, information_schema._pg_expandarray(i.indkey) AS keys FROM pg_catalog.pg_index i) i ON (ct.oid = i.ind
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
Scott Marlowe wrote: > Then a ddl gets in line. It has to wait on the vacuum, and the vacuum, > set to run super slow. And everybody waits. On vacuum. Note that this is normally not seen, because autovacuum cancels itself when somebody is blocked behind it -- until the table reaches the freeze_max_age limit, and then autovacuum is a for-wraparound one that is no longer terminated, and then everybody has to wait on it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Wed, Feb 15, 2017 at 3:26 PM, Scott Marlowe wrote: > On Wed, Feb 15, 2017 at 10:30 AM, Tim Bellis > wrote: >> I have a postgres 9.3.4 database table which (intermittently but reliably) >> gets into a state where queries get blocked indefinitely (at least for many >> hours) behind an automatic vacuum. I was under the impression that vacuum >> should never take any blocking locks for any significant period of time, and >> so would like help resolving the issue. >> >> The process blocking the query is: >> postgres 21985 11304 98 Feb13 ? 1-14:20:52 postgres: autovacuum >> worker process >> which is running the query >> autovacuum: VACUUM public. >> >> The query being blocked is: >> ALTER TABLE ALTER COLUMN DROP DEFAULT >> (But I have seen this previously with other queries being blocked. I used >> the SQL in https://wiki.postgresql.org/wiki/Lock_Monitoring to determine >> which queries were blocked) > > There are also ways of making the table less likely / not likely / > will not get vacuumed automatically. If you're willing to schedule ddl > and vacuum on your own you can then mix the two in relative safety. Followup: https://www.postgresql.org/docs/9.3/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
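The storage-parameter approach behind that link looks like this (table name is a placeholder; whether disabling autovacuum per-table is wise depends on having a reliable manual schedule):

```sql
-- take over vacuum scheduling for one table; note that
-- anti-wraparound vacuums will still run regardless of this setting
ALTER TABLE yourtable SET (autovacuum_enabled = false);

-- then vacuum it yourself during a window with no concurrent DDL
VACUUM ANALYZE yourtable;
```

Re-enable with `ALTER TABLE yourtable RESET (autovacuum_enabled);` if the manual schedule ever lapses.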
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On Wed, Feb 15, 2017 at 10:30 AM, Tim Bellis wrote: > I have a postgres 9.3.4 database table which (intermittently but reliably) > gets into a state where queries get blocked indefinitely (at least for many > hours) behind an automatic vacuum. I was under the impression that vacuum > should never take any blocking locks for any significant period of time, and > so would like help resolving the issue. > > The process blocking the query is: > postgres 21985 11304 98 Feb13 ? 1-14:20:52 postgres: autovacuum worker > process > which is running the query > autovacuum: VACUUM public. > > The query being blocked is: > ALTER TABLE ALTER COLUMN DROP DEFAULT > (But I have seen this previously with other queries being blocked. I used the > SQL in https://wiki.postgresql.org/wiki/Lock_Monitoring to determine which > queries were blocked) Yup, there's a priority inversion in DDL, DML and maintenance (vacuum). Vacuum runs slow in the background. Normal update/delete/insert work fine because of the type of lock vacuum has. Then a ddl gets in line. It has to wait on the vacuum, and the vacuum is set to run super slow. And everybody waits. On vacuum. Basically it's bad practice to alter tables that are big and being worked on, because one way or another you're going to pay a price. I've used partitions for logging and auditing that autocreate and drop and vacuum, but they never get ddl done on them when they're getting updated and vice versa. There are also ways of making the table less likely / not likely / will not get vacuumed automatically. If you're willing to schedule ddl and vacuum on your own you can then mix the two in relative safety.
Re: [GENERAL] Autovacuum stuck for hours, blocking queries
On 02/15/2017 09:30 AM, Tim Bellis wrote: I have a postgres 9.3.4 database table which (intermittently but reliably) gets into a state where queries get blocked indefinitely (at least for many hours) behind an automatic vacuum. I was under the impression that vacuum should never take any blocking locks for any significant period of time, and so would like help resolving the issue. The process blocking the query is: postgres 21985 11304 98 Feb13 ? 1-14:20:52 postgres: autovacuum worker process which is running the query autovacuum: VACUUM public. The query being blocked is: ALTER TABLE ALTER COLUMN DROP DEFAULT (But I have seen this previously with other queries being blocked. I used the SQL in https://wiki.postgresql.org/wiki/Lock_Monitoring to determine which queries were blocked) Other ALTER TABLE queries? If so I believe this might apply: https://www.postgresql.org/docs/9.5/static/explicit-locking.html SHARE UPDATE EXCLUSIVE Conflicts with the SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE lock modes. This mode protects a table against concurrent schema changes and VACUUM runs. Acquired by VACUUM (without FULL), ANALYZE, CREATE INDEX CONCURRENTLY, and ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see ALTER TABLE). -- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] Autovacuum and frequent pg_bulkload
On 11/20/2016 04:51 AM, Job wrote: Hello, I use PostgreSQL 8.4.22 and two times a day I import logs with pg_bulkload and delete old data. I move some millions of records in a day. FYI, Postgres 8.4 is over two years past EOL. I noticed that only autovacuum seems not to be able to free unused space. Do you run ANALYZE at any point in your procedure? What are the steps in the load/delete cycle? I need a periodical vacuum full but this operation takes many hours. Do you think it is due to pg_bulkload not being able to "see" free-marked space to use when loading new data? Thank you, /F -- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] autovacuum worker running amok - and me too ;)
On 3/9/15 3:56 AM, wambacher wrote: Hi paul just found my system (24 GB mem, 72 GB swap) running nearly at its limits: The current vacuum is using 66.7 (sixty-six point seven) GB of memory and my system is nearly down. I'm sorry, but that must be a bug. Remember: it's the ANALYZE of a GIN index that is causing these problems. Various tables - same problem. Regards walter http://postgresql.nabble.com/file/n5841074/top.png during the last 10 minutes vacuum took one more GB, now it's using 67.5 GB of mem. Should I ask at the dev-list? Open a ticket? Sorry for the late reply. Yes, that sounds like a bug in the GIN code. Please post to pgsql-bugs or hit http://www.postgresql.org/support/submitbug/ -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Autovacuum query
On Thu, 26 Mar 2015 03:58:59 + Mitu Verma mitu.ve...@ericsson.com wrote: We have a customer complaining about the time taken by one of the application scripts while deleting older data from the log tables. During the deletion, customer reported that he often sees the below error and because of which table size doesn't reduce. ERROR: canceling autovacuum task Date: 2015-03-14 04:29:19 Context: automatic analyze of table fm_db_Server3.mmsuper.audittraillogentry We have the following queries in this regard: - How often is the autovacuum task invoked by postgres As needed. Read: http://www.postgresql.org/docs/9.4/static/routine-vacuuming.html - If the task gets cancelled (while we were deleting data from the database) would this task be re-invoked at a later time? Yes. Read the above - If insertion of data into a table also impact this task? No. Vacuum operations are not triggered by INSERTs. - If we can manually schedule this task to a particular time (like off peak hours)? Yes, but given the questions you're asking, you probably do not have a good enough understanding of the situation to schedule it correctly and will make the problem worse. You can run it manually any time you want, but I don't recommend that you disable autovacuum unless you have a good understanding of what you're doing. Let me take a guess at the problem: The table gets LOTs of inserts, constantly, and somewhere there's a job that runs out of cron or some similar scheduler that DELETEs a lot of those rows in a big chunk. The DELETE process probably runs infrequently, like once a day or even once a week because the designers thought it would be best to get everything taken care of all at once during some real or perceived slow period on the database. One solution to this is to run the DELETE process more frequently, such as every 15 minutes. In such a case, the process will run much faster, make less changes, and require less work on the part of autovacuum to clean up after. 
People frequently complain that this will impact performance if run during normal use hours, but in every case I've seen, nobody had actually tested to see if that statement was true, and running smaller purges more frequently actually solved the problem. Another option would be to manually run vacuum after the big DELETE runs. See http://www.postgresql.org/docs/9.4/static/sql-vacuum.html Don't fall into the trap of running VACUUM FULL. This is usually a bad idea. If the client is complaining about reclaiming disk space, start asking some hard questions: How much space is too much? Why are you convinced that the space is wasted? Usually the correct answer is to add more disk space, since Postgres tends to fall into a groove with a particular table whereby the unused space is actually being used and reclaimed by data tuples as the data in the table changes. It's not unusual for the table to be 2x the size of the actual data on a heavily updated table. -- Bill Moran
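Bill's "vacuum after the big DELETE" advice, sketched. The table name comes from the error context above; the `logged_at` column and the one-day retention window are hypothetical placeholders:

```sql
-- purge in smaller, more frequent batches rather than one weekly DELETE
DELETE FROM audittraillogentry
WHERE logged_at < now() - interval '1 day';

-- reclaim the dead tuples immediately instead of waiting for autovacuum
-- (plain VACUUM, not VACUUM FULL, per the advice above)
VACUUM (ANALYZE) audittraillogentry;
```

Plain VACUUM marks the space reusable within the table without taking the long exclusive lock that VACUUM FULL would.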
Re: [GENERAL] Autovacuum query
Another option, depending on the nature of the data and deletes, would be to partition the table. I created 7 tables that inherited from the original, one table for each day of the week. A nightly cron job then runs, leaving alone yesterday's and today's tables but truncating the other 5. Runs in 10 msec and vacuum doesn't need to run. -Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Bill Moran Sent: Thursday, March 26, 2015 6:07 AM To: Mitu Verma Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Autovacuum query On Thu, 26 Mar 2015 03:58:59 + Mitu Verma mitu.ve...@ericsson.com wrote: We have a customer complaining about the time taken by one of the application scripts while deleting older data from the log tables. During the deletion, customer reported that he often sees the below error and because of which table size doesn?t reduce. ERROR: canceling autovacuum task Date: 2015-03-14 04:29:19 Context: automatic analyze of table fm_db_Server3.mmsuper.audittraillogentry We have the following queries in this regard: - How often is the autovacuum task invoked by postgres As needed. Read: http://www.postgresql.org/docs/9.4/static/routine-vacuuming.html - If the task gets cancelled (while we were deleting data from the database) would this task be re-invoked at a later time? Yes. Read the above - If insertion of data into a table also impact this task? No. Vacuum operations are not triggered by INSERTs. - If we can manually schedule this task to a particular time (like off peak hours)? Yes, but given the questions you're asking, you probably do not have a good enough understanding of the situation to schedule it correctly and will make the problem worse. You can run it manually any time you want, but I don't recommend that you disable autovacuum unless you have a good understanding of what you're doing. 
Let me take a guess at the problem: The table gets LOTs of inserts, constantly, and somewhere there's a job that runs out of cron or some similar scheduler that DELETEs a lot of those rows in a big chunk. The DELETE process probably runs infrequently, like once a day or even once a week because the designers thought it would be best to get everything taken care of all at once during some real or perceived slow period on the database. One solution to this is to run the DELETE process more frequently, such as every 15 minutes. In such a case, the process will run much faster, make less changes, and require less work on the part of autovacuum to clean up after. People frequently complain that this will impact performance if run during normal use hours, but in every case I've seen, nobody had actually tested to see if that statement was true, and running smaller purges more frequently actually solved the problem. Another option would be to manually run vacuum after the big DELETE runs. See http://www.postgresql.org/docs/9.4/static/sql-vacuum.html Don't fall into the trap of running VACUUM FULL. This is usually a bad idea. If the client is complaining about reclaiming disk space, start asking some hard questions: How much space is too much? Why are you convinced that the space is wasted? Usually the correct answer is to add more disk space, since Postgres tends to fall into a groove with a particular table whereby the unused space is actually being used and reclaimed by data tuples as the data in the table changes. It's not unusual for the table to be 2x the size of the actual data on a heavily updated table. -- Bill Moran
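The day-of-week scheme described at the top of this message, sketched with 9.x-style inheritance. All names here are made up for illustration, and a trigger or the application must route inserts to the correct child table:

```sql
-- parent plus one child per day of week
CREATE TABLE log_entries (logged_at timestamptz NOT NULL, payload text);
CREATE TABLE log_entries_dow0 () INHERITS (log_entries);  -- Sunday
-- ... create log_entries_dow1 through log_entries_dow6 the same way

-- nightly cron job: truncate the five children that are neither
-- today's nor yesterday's. TRUNCATE just unlinks the data files,
-- so it runs in milliseconds and leaves nothing for vacuum to do.
TRUNCATE log_entries_dow2, log_entries_dow3, log_entries_dow4,
         log_entries_dow5, log_entries_dow6;  -- example: today = dow1
```

The cron job would compute the day numbers to skip at run time; the fixed list above just shows the shape of the statement.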
Re: [GENERAL] Autovacuum query
On March 25, 2015 09:31:24 PM David G. Johnston wrote: On Wed, Mar 25, 2015 at 8:58 PM, Mitu Verma mitu.ve...@ericsson.com wrote: Correcting the subject And this is why it is considered good form to compose a new message instead of replying to an existing one. Injecting your new topic into an existing unrelated mail thread is mildly annoying. Wildly off-topic, but I'm blaming Google. Their thread detection logic is so good that people actually don't know anymore how it was supposed to work, back in the bad old days where threads were managed by In-Reply-To headers. David J.
Re: [GENERAL] Autovacuum query
On Wed, Mar 25, 2015 at 8:58 PM, Mitu Verma mitu.ve...@ericsson.com wrote: Correcting the subject And this is why it is considered good form to compose a new message instead of replying to an existing one. Injecting your new topic into an existing unrelated mail thread is mildly annoying. David J.
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Hi paul just found my system (24 GB mem, 72 GB swap) running nearly at its limits: The current vacuum is using 66.7 (sixty-six point seven) GB of memory and my system is nearly down. I'm sorry, but that must be a bug. Remember: it's the ANALYZE of a GIN index that is causing these problems. Various tables - same problem. Regards walter http://postgresql.nabble.com/file/n5841074/top.png during the last 10 minutes vacuum took one more GB, now it's using 67.5 GB of mem. Should I ask at the dev-list? Open a ticket?
Re: [GENERAL] autovacuum worker running amok - and me too ;)
sorry, 64 GB swap
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Hi, some final results: I monitored the vacuum process and logged some data, using one big table and doing analyze/vacuum by hand. The table has two btree indexes and one gin. maintenance_work_mem was 1GB. The analyze job used about 1.2 GB virt mem during the whole task, no problems at all. The vacuum jobs started with 3.3 GB and after processing the two simple indexes used up to 5.5 GB of vmem, going down to 2.3 GB for the final work. http://postgresql.nabble.com/file/n5840914/pidstat.png This led to out-of-memory problems during the last days. The vacuum of the first table (planet_osm_ways) used *up to 12 GB* until the OOM killer killed it. Sorry, but *why do analyze and vacuum ignore maintenance_work_mem?* I have no control over memory usage and ran into big trouble. Now i added 50GB swap to my 24GB system and have to cross my fingers. regards walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Karsten Hilbert wrote Of course, I am not suggesting you provide 48GB of swap and your problem is magically solved _but_ one thing we might take away from that old adage is that one might hope things to work better (say, while debugging) if there is at least as much swap as there is physical RAM, based on the naive assumption that in this case 'everything can be swapped out'. no problem at all, got TBytes of free disk space.
Re: [GENERAL] autovacuum worker running amok - and me too ;)
wambacher wrote hi, waiting for the index (104/121GB), i read this document http://www.postgresql.org/docs/9.4/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT and will do some changes before the next analyze: some comments: - the OOM killer did not kill the postmaster but the analyze job. - started with 24GB real and 18GB swap - that must be enough! -- killed - will reduce shared_mem from 1GB to 512 MB - will reduce max_connections to 100 (but no user was active, the applications are down) - will do sysctl -w vm.overcommit_memory=2 (just did it) - may do echo -1000 > /proc/self/oom_score_adj but only if the other actions fail - the last steps for older kernels are not relevant, i'm running Ubuntu 14.04 LTS, which is kernel 3.13 i'll keep you informed. regards walter very strange: after clustering, recreating the gis index and changing nothing else, no problems at all. maybe the index was corrupt? or was there so much reorganization needed that analyze ran into trouble? i'll do most of the changes now, reboot and bring my system live again. Let's see what's happening the next days. Thanks to all walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
hi, waiting for the index (104/121GB), i read this document http://www.postgresql.org/docs/9.4/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT and will do some changes before the next analyze: some comments: - the OOM killer did not kill the postmaster but the analyze job. - started with 24GB real and 18GB swap - that must be enough! -- killed - will reduce shared_mem from 1GB to 512 MB - will reduce max_connections to 100 (but no user was active, the applications are down) - will do sysctl -w vm.overcommit_memory=2 (just did it) - may do echo -1000 > /proc/self/oom_score_adj but only if the other actions fail - the last steps for older kernels are not relevant, i'm running Ubuntu 14.04 LTS, which is kernel 3.13 i'll keep you informed. regards walter
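The overcommit change in the list above can be made persistent across reboots via a sysctl configuration file rather than a one-off `sysctl -w` call. A minimal sketch, assuming a Debian/Ubuntu-style sysctl.d layout; the file name and the overcommit_ratio value are illustrative choices, not values from this thread:

```ini
# /etc/sysctl.d/60-postgresql-overcommit.conf (hypothetical file name)
# Disable heuristic overcommit so a runaway backend fails its allocation
# (and PostgreSQL can handle the error) instead of the kernel OOM killer
# picking an arbitrary victim process.
vm.overcommit_memory = 2
# With overcommit_memory=2 the commit limit is swap + ratio% of RAM;
# tune this to your actual RAM/swap mix.
vm.overcommit_ratio = 80
```

Apply without a reboot using `sysctl -p /etc/sysctl.d/60-postgresql-overcommit.conf` (as root).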
Re: [GENERAL] autovacuum worker running amok - and me too ;)
On Fri, Mar 06, 2015 at 02:39:34AM -0700, wambacher wrote: some comments: - the OOM killer did not kill the postmaster but the analyze job. - started with 24GB real and 18GB swap - that must be enough! -- killed Back in the day it was conventional wisdom to have twice as much swap as you've got physical memory. Of course, I am not suggesting you provide 48GB of swap and your problem is magically solved _but_ one thing we might take away from that old adage is that one might hope things to work better (say, while debugging) if there is at least as much swap as there is physical RAM, based on the naive assumption that in this case 'everything can be swapped out'. Karsten -- GPG key ID E4071346 @ eu.pool.sks-keyservers.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Jim Nasby-5 wrote Which is it? Is the vacuum process using 1.2GB (5% of memory) or is it using 90% (~22GB)? i ran the job 2-3 times. - first with 18GB swap too. I heard it thrashing, performance went down extremely, and after 2 hours i killed the job (reboot system, no other way to do it) - next without swap: i monitored the system with htop and the vacuum task was getting bigger and bigger until OOM killed it. VIRT at about 20.x GB, MEM% at 80-90%. At this time i called for help. - next: rebuilt the gin index without fastupdate=off, to use the default. - vacuum planet_osm_ways on console - VIRT about 1.2 GB, MEM% about 3.4% in htop - crashed again, system logs are attached saying OOM killed it, but it was using about 1.2 GB, which is fine to me (and you) - dropped index, clustered, vacuum -- no problems - recreating the gin index is still running. 96/121 GB, some hours to go. waiting for next test. Jim Nasby-5 wrote top reporting a size of 1.2GB doesn't surprise me at all (assuming it's including shared memory in there). This is starting to sound like a regular OOM problem. Have you tried the steps in http://postgresql.org/docs/9.4/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT not yet, but i'll check it right now. Regards walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Hi, in my first post you can see all params: maintenance_work_mem = 64MB and two workers. i configured my system to the absolute minimum (got 24 GB real memory) and the problem was still there. Last night i rebuilt one index (122GB size) and just this minute i started a manual analyze verbose planet_osm_ways to see what's happening. this will need some hours. Regards walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
... this will need some hours. Done after 30 Minutes :) nearly 50% dead rows - strange. Now i'll run a vacuum verbose planet_osm_ways because the system crashed during the autovacuum. cross my fingers. Walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
crashed: no idea what to do now. walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
On 3/5/15 2:06 PM, wambacher wrote: crashed: no idea what to do now. Crashed? Or hit by the OOM killer? What's the log say? While this is going on you might as well disable autovac for that table. It'll keep crashing and will interfere with your manual vacuums. It sounds at this point like the problem is in vacuuming, not analyze. Can you confirm? If so, please forgo analyzing the table until we can get vacuum figured out. What's the largest memory size that a vacuum/autovac against that table gets to, compared to other backends? You mentioned 80-90% of memory before, but I don't know if that was for analyze or what. I wonder if we have some kind of memory leak in GIN's vacuum support... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
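The suggestion to disable autovacuum for just the problem table can be done with per-table storage parameters; a sketch using the table name from this thread (not commands anyone in the thread actually posted), to be reverted once the investigation is over, since the table still needs vacuuming eventually:

```sql
-- Stop the autovacuum daemon from touching this one table (this also
-- suppresses autoanalyze for it); the rest of the cluster keeps its
-- normal autovacuum behavior.
ALTER TABLE planet_osm_ways SET (autovacuum_enabled = false);

-- When the debugging is done, put it back:
ALTER TABLE planet_osm_ways SET (autovacuum_enabled = true);
```

Note that an anti-wraparound vacuum will still be forced eventually regardless of this setting.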
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Jim Nasby-5 wrote On 3/5/15 2:06 PM, wambacher wrote: Crashed? Or hit by the OOM killer? What's the log say? killed by OOM, but it had only 1.2 GB mem, which is ok to me. While this is going on you might as well disable autovac for that table. It'll keep crashing and will interfere with your manual vacuums. did it this morning; the crash was running vacuum verbose planet_osm_ways from the cli. It sounds at this point like the problem is in vacuuming, not analyze. Can you confirm? If so, please forgo analyzing the table until we can get vacuum figured out. yes, it's the vacuum task. What's the largest memory size that a vacuum/autovac against that table gets to compared to other backends? You mentioned 80-90% of memory before, but I don't know if that was for analyze or what. vacuum I wonder if we have some kind of memory leak in GIN's vacuum support... may be. At least i did: - dropped the gin index - cluster - analyze - vacuum all without any problems. now i'll add the index again and tomorrow do another vacuum by hand. 2:30 in Germany, feeling tired ;) Regards walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
On 3/5/15 7:36 PM, wambacher wrote: Jim Nasby-5 wrote On 3/5/15 2:06 PM, wambacher wrote: Crashed? Or hit by the OOM killer? What's the log say? killed by OOM, but has only 1.2 GB mem, which is ok to me. Ok, but... What's the largest memory size that a vacuum/autovac against that table gets to compared to other backends? You mentioned 80-90% of memory before, but I don't know if that was for analyze or what. vacuum Which is it? Is the vacuum process using 1.2GB (5% of memory) or is it using 90% (~22GB)? BTW, with 1GB shared buffers and 64MB maintenance_work_mem, top reporting a size of 1.2GB doesn't surprise me at all (assuming it's including shared memory in there). This is starting to sound like a regular OOM problem. Have you tried the steps in http://postgresql.org/docs/9.4/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT ? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] autovacuum worker running amok - and me too ;)
wambacher wnordm...@gmx.de wrote: watching the memory usage of the autovacuum process: it was getting bigger and bigger at nearly constant speed, some MB per minute, iirc. What are your settings for maintenance_work_mem and autovacuum_max_workers? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Circling back on this one, I had a look at our analyze code. I found one place where *maybe* we weren't freeing memory and freed it, but analyzing a 2M-record table I barely see any bump in memory usage (from 22M up to 24M at peak) during analyze. And the change I made didn't appear to alter that (though the objects were probably all small enough that they weren't being detoasted into copies in any event). Though maybe with a really big table? (with really big objects?) Then again, doesn't analyze just pull a limited sample (30K rows approx max), so why would table size make any difference after a certain point? P. On Tue, Mar 3, 2015 at 3:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: wambacher wnordm...@gmx.de writes: My system has 24GB of real memory but after some hours one autovacuum worker is using 80-90% of memory, the OOM killer (out-of-memory killer) kills the process with kill -9, and the postgresql server restarts because of that problem. i changed the base configuration to use very small buffers, restarted the server twice, but the problem still exists. i think it's always the same table, and that table is huge: 111GB data and 3 indices with 4GB, 128 GB and 12 GB. It's the table planet_osm_ways from OpenStreetMap. maybe that helps. Maybe you could reduce the statistics targets for that table. I think we've heard that the analyze functions for PostGIS data types are memory hogs, too --- maybe it's worth inquiring about that on the postgis mailing lists. regards, tom lane
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Paul Ramsey wrote Though maybe with a really big table? (with really big objects?) Though still, doesn't analyze just pull a limited sample (30K approx max) so why would table size make any difference after a certain point? Hi Paul, my table is quite big (about 293,049,000 records) but the objects are not. nodes[] contains at most 2000 bigints and tags[] up to some hundred chars, sometimes some thousand chars. watching the memory usage of the autovacuum process: it was getting bigger and bigger at nearly constant speed, some MB per minute, iirc. i'm just recreating planet_osm_ways_nodes without fastupdate=off regards walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Roxanne Reid-Bennett wrote Most definitely ask on the Postgis list. Identify the full Postgis version and Postgres versions as well. Hi Roxanne, seconds before sending it to the postgis list i checked the table planet_osm_ways, and there is no geometry in it: that can't be a PostGIS problem. I'll check fastupdate and the other hints i got yesterday. But here are the missing infos: Regards walter
Re: [GENERAL] autovacuum worker running amok - and me too ;)
wambacher wnordm...@gmx.de writes: My system has 24GB of real memory but after some hours one autovacuum worker is using 80-90% of memory, the OOM killer (out-of-memory killer) kills the process with kill -9, and the postgresql server restarts because of that problem. i changed the base configuration to use very small buffers, restarted the server twice, but the problem still exists. i think it's always the same table, and that table is huge: 111GB data and 3 indices with 4GB, 128 GB and 12 GB. It's the table planet_osm_ways from OpenStreetMap. maybe that helps. Maybe you could reduce the statistics targets for that table. I think we've heard that the analyze functions for PostGIS data types are memory hogs, too --- maybe it's worth inquiring about that on the postgis mailing lists. regards, tom lane
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Tom Lane-2 wrote Maybe you could reduce the statistics targets for that table. don't understand what you mean. do you mean how often that table is autovacuumed? at the moment about once a day or once in two days, i think. I think we've heard that the analyze functions for PostGIS data types are memory hogs, too --- maybe it's worth inquiring about that on the postgis mailing list good idea, and yes, it's a table with postgis geometries and the corresponding indices. will ask the postgis guys too. at the moment i disabled autovacuum but i'll need it soon again, because the server is running 24/7 and data is changing permanently. thanks walter btw: no big problems in my mini-config, ok?
Re: [GENERAL] autovacuum worker running amok - and me too ;)
wambacher wnordm...@gmx.de writes: Tom Lane-2 wrote Maybe you could reduce the statistics targets for that table. don't understand what you mean. do you mean how often that table is autovacuumed? at the moment about once a day or once in two days, i think. No, I mean the amount of stats detail that ANALYZE tries to collect. I'm guessing that it's not auto-vacuum but auto-analyze that's getting OOMed. See ALTER TABLE SET STATISTICS TARGET. regards, tom lane
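The per-column form Tom is pointing at looks like this; the column names are taken from the OSM schema discussed earlier in the thread, and the target value of 10 is only illustrative (the default is 100, and -1 restores the default):

```sql
-- Collect a much smaller statistics sample for the wide array columns
-- the next time ANALYZE (or autoanalyze) runs on this table.
ALTER TABLE planet_osm_ways ALTER COLUMN nodes SET STATISTICS 10;
ALTER TABLE planet_osm_ways ALTER COLUMN tags  SET STATISTICS 10;

-- Re-analyze to see whether memory usage drops with the lower targets.
ANALYZE VERBOSE planet_osm_ways;
```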
Re: [GENERAL] autovacuum worker running amok - and me too ;)
Tom Lane-2 wrote See ALTER TABLE SET STATISTICS TARGET. thanks, will try it regards walter btw: the postgis analyze problem was fixed more than a year ago, but i'll ask them too.
Re: [GENERAL] autovacuum worker running amok - and me too ;)
On 3/3/2015 6:17 PM, Tom Lane wrote: wambacher wnordm...@gmx.de writes: My system has 24GB of real memory but after some hours one autovacuum worker is using 80-90% of memory, the OOM killer (out-of-memory killer) kills the process with kill -9, and the postgresql server restarts because of that problem. i changed the base configuration to use very small buffers, restarted the server twice, but the problem still exists. i think it's always the same table, and that table is huge: 111GB data and 3 indices with 4GB, 128 GB and 12 GB. It's the table planet_osm_ways from OpenStreetMap. maybe that helps. Maybe you could reduce the statistics targets for that table. I think we've heard that the analyze functions for PostGIS data types are memory hogs, too --- maybe it's worth inquiring about that on the postgis mailing lists. Most definitely ask on the Postgis list. Identify the full Postgis version and Postgres versions as well. We had a case on a trial upgrade (9.1 to 9.3, and PostGIS upgrade 2.0 to 2.1) where analyze was running amok memory-wise on 3 tables with geometry types (posted on this list). Unfortunately the VM that system was on got corrupted and I wasn't able to provide the data for a test scenario to Paul Ramsey - so never filed the bug report. (perhaps the VM was the issue and NOT the upgrade...) At the time, we ended up NOT upgrading the production box due to this issue potentially being a show stopper. I've also never tried to re-create the upgrade stack to test it out on a current copy of the data. Roxanne regards, tom lane -- [At other schools] I think the most common fault in general is to teach students how to pass exams instead of teaching them the science. Donald Knuth
Re: [GENERAL] autovacuum with inheritance
yhe wrote From looking at the log, [...] Any reason not to share it with the people from whom you are asking for help? I was thinking that autovacuum should only happen on an updated table, so it should only autovacuum one child and not the others in my case. Even at the point you stop using a partition it would still want at least one more cleanup vacuum, and eventually would want a FREEZE pass as well. Furthermore, the parent of the partition scheme still needs to be aware of the children, so you might be seeing that. David J.
Re: [GENERAL] Autovacuum on partitioned tables in version 9.1
On 11/11/2014 07:52 PM, David G Johnston wrote: TL;DR - the interaction of ANALYZE and inheritance hierarchies seems to be broken for the uncommon use case where the inserts temporarily remain on the master table in order to allow RETURNING to work. Yeah, if I do: # vacuum verbose public.tablename; it finishes immediately, but if I do: # vacuum analyze verbose public.tablename; it takes forever; it seems to analyze every child table. So for now it is just a matter of removing vacuum for the master table and then doing it manually every night, I guess. Nestor A. Diaz wrote The autovacuum process on version 9.1 keeps vacuuming the master tables and that takes a lot of time considering the master table has no records of its own. The trigger itself inserts into the master table, then into the child, and then removes the record from the master; we do it that way because we need to get the inserted row info. If you say so... but the second paragraph makes your conclusion in the first paragraph false. Actually we have triggers like this:

-- Insert Trigger
CREATE TRIGGER insert_tablename_trigger
  BEFORE INSERT ON tablename
  FOR EACH ROW EXECUTE PROCEDURE insert_tablename_trigger();

-- Delete Trigger
CREATE TRIGGER delete_tablename_trigger
  AFTER INSERT ON tablename
  FOR EACH ROW EXECUTE PROCEDURE delete_tablename_trigger();

-- Insert Function
CREATE OR REPLACE FUNCTION insert_tablename_trigger() RETURNS trigger AS $BODY$
BEGIN
  EXECUTE 'INSERT INTO partitions.' || quote_ident(_table_name) || ' VALUES ($1.*)' USING NEW;
  RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;

-- Delete Function
CREATE OR REPLACE FUNCTION delete_tablename_trigger() RETURNS trigger AS $BODY$
BEGIN
  DELETE FROM ONLY public.tablename WHERE id = NEW.id; -- delete row again.
  RETURN NULL;
END
$BODY$ LANGUAGE plpgsql;

When we insert into the master table, the row is inserted twice (in the master and the child); we return the row inserted into the child and remove the one inserted into the master. This is how it works.
This is a requirement for most ORMs, since they usually need an automatically generated row id column of the table. We didn't set it up that way before, but we had to in order to keep the application happy. Or is there a better approach for this? The requirement is simple: we need partitioning, and for every row inserted we need to return the whole row after an insert, as in: INSERT INTO TABLE ... RETURNING *; From the docs: The optional RETURNING clause causes INSERT to compute and return value(s) based on each row actually inserted. This is primarily useful for obtaining values that were supplied by defaults, such as a serial sequence number. However, any expression using the table's columns is allowed. The syntax of the RETURNING list is identical to that of the output list of SELECT. slds. -- Nestor A Diaz
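One alternative worth considering for the "better approach" question: since 9.1, a view with an INSTEAD OF INSERT trigger supports INSERT ... RETURNING using the row the trigger function returns, which avoids the insert-then-delete round trip on the master table entirely. This is only a sketch under stated assumptions; all names here (the view, the function, and the `partitions.tablename_current` child table) are hypothetical, and real routing logic would pick the child by date or key:

```sql
-- Hypothetical: expose inserts through a view instead of the master table.
CREATE VIEW tablename_v AS SELECT * FROM public.tablename;

CREATE OR REPLACE FUNCTION tablename_v_insert() RETURNS trigger AS $BODY$
DECLARE
  r public.tablename%ROWTYPE;
BEGIN
  -- Insert straight into the current child partition and capture the
  -- stored row (defaults and serial values included).
  INSERT INTO partitions.tablename_current VALUES (NEW.*) RETURNING * INTO r;
  RETURN r;  -- this row is what INSERT ... RETURNING on the view yields
END;
$BODY$ LANGUAGE plpgsql;

CREATE TRIGGER tablename_v_insert
  INSTEAD OF INSERT ON tablename_v
  FOR EACH ROW EXECUTE PROCEDURE tablename_v_insert();

-- Applications then run: INSERT INTO tablename_v (...) VALUES (...) RETURNING *;
```

The master table never receives (and never deletes) rows, so autovacuum has nothing to clean up there.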
Re: [GENERAL] Autovacuum on partitioned tables in version 9.1
TL;DR - the interaction of ANALYZE and inheritance hierarchies seems to be broken for the uncommon use case where the inserts temporarily remain on the master table in order to allow RETURNING to work. Note - I have not played with this scenario personally but http://www.postgresql.org/message-id/flat/cabrmo8revvbbfhy-nxw2aknr+3awdzxepgwkgrnflhvtips...@mail.gmail.com#cabrmo8revvbbfhy-nxw2aknr+3awdzxepgwkgrnflhvtips...@mail.gmail.com combined with this post leads me to that conclusion. Nestor A. Diaz wrote The autovacuum process on version 9.1 keeps vacuuming the master tables and that takes a lot of time considering the master table has no records of its own. The trigger itself inserts into the master table, then into the child, and then removes the record from the master; we do it that way because we need to get the inserted row info. If you say so... but the second paragraph makes your conclusion in the first paragraph false. You may wish to read up on Multi-Version Concurrency Control (MVCC). Actually we don't use any features of version 9.1 that are not available under version 8.4; however, I don't want to downgrade to version 8.4, as I consider that I still have not completely understood how the auto-vacuuming process works, and I need to improve on this. Well, you are actually using one right now - community support :) Earlier in this list a user experienced the same behavior: http://www.postgresql.org/message-id/flat/ CABrmO8rEvvbBfhY-NxW2AkNr+3aWdzXEPgWkgrNFLHvTipSHyw@.gmail On the same thread another user wrote it could be an issue that is to be resolved in minor version 9.1.8 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5454344b968d6a189219cfd49af609a3e7d6af33 I currently have PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit, running on an eight-core processor and 24 GB RAM with the following options: Given that 9.1.14 is current, being on 9.1.9 is going to be a problem.
The referenced thread never came to a conclusion - the OP never confirmed the patch fixed their problem - but the patch mentioned affected vacuum, while the preceding conclusion in the post was that ANALYZE was the probable culprit - specifically that the ANALYZE on the master table cascaded to all of the children and so took however long it would take to analyze the entire partition hierarchy. In order to alleviate the I/O problem I disable autovacuum on all the master tables like so: ALTER TABLE public.tablename SET ( autovacuum_enabled = false, toast.autovacuum_enabled = false); But I know I can't leave the database without vacuuming enabled because of transaction ID wraparound. If things get to the point where this is required it will run regardless of your configuration. So I need to set up a cron script for this; for every master table there is a new child table every week, so I can start a vacuum process via cron for the table before the newly created one; these tables are only used for reading after a week. But I need some clarification on this: Why is postgresql starting a vacuum on master tables so often? Why does it take so much time on version 9.1? I guess because it needs to reclaim unused space due to the insert/remove process, but in version 8.4 that is unnoticeable. I'd be more concerned with the difference between 8.4 and 9.1, but if you are indeed physically inserting into and then deleting from the master table you need some kind of vacuum if you want to reclaim that wasted space. As noted above, ANALYZE is a probable culprit here - and its interaction with inheritance seems under-documented and incompletely implemented. I think this would be more obvious, but apparently most people do not write their trigger sets to leave the inserted record in the master table (so as not to break RETURNING) and then delete the record shortly thereafter. Someone from -hackers needs to comment on this use case and whether something can and should be done to accommodate it.
How do I know which tables need to be vacuumed? any sql recipe? How do I check when I am near the transaction ID limit? Do I need to vacuum tables that haven't changed in a long time? If a vacuum freeze has been run on a table then, in the absence of subsequent updates, it will not require vacuuming. Has anybody experienced the same behavior and would like to comment on this? David J.
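Nestor's monitoring questions at the end can be answered from the system catalogs; a sketch (the 200-million figure mentioned in the comment is the default autovacuum_freeze_max_age, which a local configuration may override):

```sql
-- Wraparound proximity: the closer age(relfrozenxid) gets to
-- autovacuum_freeze_max_age (200M by default), the sooner PostgreSQL
-- forces an anti-wraparound vacuum on the table.
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;

-- Dead-tuple counts point at tables that need an ordinary vacuum.
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

Tables that haven't changed since their last vacuum freeze accumulate XID age but no dead tuples, which is why the first query is the one that matters for wraparound.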
Re: [GENERAL] AutoVacuum Daemon
On Mon, Dec 30, 2013 at 11:02 PM, Leonardo M. Ramé l.r...@griensu.com wrote: On 2013-12-30 13:45:43 +0000, Haribabu kommi wrote: On 30 December 2013 19:11 Leonardo M. Ramé wrote: Hi, I want to know if I should run the auto-vacuum daemon (from /etc/init.d/) or whether it runs automatically and transparently if configured in postgresql.conf. If it must be configured manually, what is the script to be run? I didn't find pg_autovacuum or similar. I didn't find information about this on this page: http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html#AUTOVACUUM P.S.: I'm on linux running PostgreSQL 8.4 Just enable the autovacuum configuration parameter in the postgresql.conf file, which internally spawns an autovacuum process that will take care of vacuuming. Thanks, that's easier than I thought. This parameter is on by default since 8.3, so you don't really need to touch it yourself normally. Regards, -- Michael
Re: [GENERAL] AutoVacuum Daemon
On 30 December 2013 19:11 Leonardo M. Ramé wrote: Hi, I want know if I should run the auto-vacuum daemon (from /etc/init.d/) or it runs automatically and transparently if configured in postgres.conf?. If it must be configured manually, what is the script to be run, I didn't find pg_autovacuum or similar. I didn't find information about this on this page: http://www.postgresql.org/docs/8.4/static/routine- vacuuming.html#AUTOVACUUM P.S.: I'm on linux running PostgreSql 8.4 Just enable autovacuum configuration parameter in postgresql.conf file. Which internally spawns an autovacuum process which will take care of vacuuming. Regards, Hari babu.
Re: [GENERAL] AutoVacuum Daemon
On 2013-12-30 13:45:43 +, Haribabu kommi wrote: On 30 December 2013 19:11 Leonardo M. Ramé wrote: Hi, I want know if I should run the auto-vacuum daemon (from /etc/init.d/) or it runs automatically and transparently if configured in postgres.conf?. If it must be configured manually, what is the script to be run, I didn't find pg_autovacuum or similar. I didn't find information about this on this page: http://www.postgresql.org/docs/8.4/static/routine- vacuuming.html#AUTOVACUUM P.S.: I'm on linux running PostgreSql 8.4 Just enable autovacuum configuration parameter in postgresql.conf file. Which internally spawns an autovacuum process which will take care of vacuuming. Thanks, that's easier than I thought. Regards, -- Leonardo M. Ramé Medical IT - Griensu S.A. Av. Colón 636 - Piso 8 Of. A X5000EPT -- Córdoba Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19 Cel.: +54 9 351 6629292
Re: [GENERAL] autovacuum out of memory errors
tim truman mitram...@gmail.com wrote: [ memory usage map ] There doesn't seem to be any memory context using an unusually large amount of RAM. 522f9128.1151 ERROR: out of memory 522f9128.1151 DETAIL: Failed on request of size 336150396. 522f9128.1151 CONTEXT: automatic vacuum of table client.public.product SELECT version(); PostgreSQL 8.4.11 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit http://www.postgresql.org/support/versioning/ maintenance_work_mem | 1GB max_connections | 1000 shared_buffers | 11GB wal_buffers | 8MB work_mem | 8MB

free -m
                 total       used       free     shared    buffers     cached
Mem:             48396      48139        257          0        173      44771
-/+ buffers/cache:           3194      45201
Swap:              956         20        936

Clearly the free command was not run while the database service was running. Can we see what that looks like when it is running and handling roughly the same workload that has been causing problems (including any non-database processes which have been running on the machine during failures)? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [GENERAL] autovacuum: found orphan temp table
On 06/25/13 06:13, Nicolau Roca wrote: 2013-06-25 15:02:15 CEST [::18264:1:] LOG: autovacuum: found orphan temp table pg_temp_47.est_backup_ids_temp in database estudis1314 2013-06-25 15:02:15 CEST [::18264:2:] LOG: autovacuum: found orphan temp table pg_temp_47.est_backup_files_temp in database estudis1314 I read a suggestion in the list pgsql-hackers (Message ID 48f4599d.7010...@enterprisedb.com http://www.postgresql.org/message-id/48f4599d.7010...@enterprisedb.com) about just dropping the pg_temp_x schema. However, no such schema exists: Try looking in the estudis1314 database: [postgres@postgresql1 ~]$ psql -d estudis1314 HTH, Bosco.
Re: [GENERAL] autovacuum: found orphan temp table
On Tue, Jun 25, 2013 at 6:43 PM, Nicolau Roca nicolau.r...@uib.cat wrote: Hi, after a server crash the following messages appear in the log file every minute: 2013-06-25 15:02:15 CEST [::18264:1:] LOG: autovacuum: found orphan temp table pg_temp_47.est_backup_ids_temp in database estudis1314 2013-06-25 15:02:15 CEST [::18264:2:] LOG: autovacuum: found orphan temp table pg_temp_47.est_backup_files_temp in database estudis1314 I read a suggestion in the list pgsql-hackers (Message ID 48f4599d.7010...@enterprisedb.comhttp://www.postgresql.org/message-id/48f4599d.7010...@enterprisedb.com) about just dropping the pg_temp_x schema. However, no such schema exists: You can query to find those schemas : select relname,nspname from pg_class join pg_namespace on (relnamespace= pg_namespace.oid) where pg_is_other_temp_schema(relnamespace); On finding you can drop those schemas,if you want to get rid of the messages, just do DROP SCHEMA pg_temp_NNN CASCADE; --- Regards, Raghavendra EnterpriseDB Corporation Blog: http://raghavt.blogspot.com/
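[Editorial note: putting the two replies together, a minimal psql session might look like the following. The schema and database names are the ones from the log messages in this thread; substitute your own.]

```sql
-- Temp schemas are per-database, so connect to the database named
-- in the log message first:
\c estudis1314

-- Confirm the orphaned schema really lives here:
SELECT nspname FROM pg_namespace WHERE nspname = 'pg_temp_47';

-- Then drop it together with its leftover temp tables:
DROP SCHEMA pg_temp_47 CASCADE;
```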
Re: [GENERAL] Autovacuum doesn't work if the table has large number of records
Hi Jeff, Thanks for your reply. Yes, it took a while to complete the autovacuum. I checked it again this morning: last_autovacuum: 2013-04-14 06:17:02.9464+08 last_autoanalyze: 2013-04-13 20:27:12.396048+08 n_dead_tup: null (or blank) n_live_tup: 334453396 relpages: 2654868 reltuples: 3.34453e+08 file size of pgstat.stat: /opt/PostgreSQL/9.2/data/pg_stat_tmp/pgstat.stat : 11611 I also checked the pg_log files; the autovacuum somehow was delayed by "pgstat wait timeout" from 1:10am to 05:41am this morning (over 4.5 hours). I think this would impact the DB performance.
2013-04-14 01:10:15.314 HKT,,,16427,,51699177.402b,1,,2013-04-14 01:10:15 HKT,,0,LOG,0,autovacuum launcher started,
2013-04-14 04:08:32.741 HKT,,,16427,,51699177.402b,2,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:09:23.522 HKT,,,16427,,51699177.402b,3,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:09:54.825 HKT,,,16427,,51699177.402b,4,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:10:09.384 HKT,,,16427,,51699177.402b,5,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:45:05.227 HKT,,,16427,,51699177.402b,6,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:45:15.332 HKT,,,16427,,51699177.402b,7,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:45:15.342 HKT,,,23728,,5169c3d1.5cb0,1,,2013-04-14 04:45:05 HKT,3/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:45:25.443 HKT,,,16427,,51699177.402b,8,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:46:23.065 HKT,,,16427,,51699177.402b,9,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:46:33.168 HKT,,,16427,,51699177.402b,10,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:46:33.174 HKT,,,23752,,5169c41f.5cc8,1,,2013-04-14 04:46:23 HKT,3/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:46:43.286 HKT,,,16427,,51699177.402b,11,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:46:55.408 HKT,,,16427,,51699177.402b,12,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:47:05.508 HKT,,,16427,,51699177.402b,13,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:47:05.519 HKT,,,23753,,5169c43f.5cc9,1,,2013-04-14 04:46:55 HKT,3/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:47:15.620 HKT,,,16427,,51699177.402b,14,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:47:27.742 HKT,,,16427,,51699177.402b,15,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:47:37.843 HKT,,,16427,,51699177.402b,16,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:47:37.854 HKT,,,23758,,5169c45f.5cce,1,,2013-04-14 04:47:27 HKT,3/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:47:47.953 HKT,,,16427,,51699177.402b,17,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:48:00.075 HKT,,,16427,,51699177.402b,18,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:48:10.179 HKT,,,16427,,51699177.402b,19,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:48:10.189 HKT,,,23759,,5169c480.5ccf,1,,2013-04-14 04:48:00 HKT,3/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:48:20.291 HKT,,,16427,,51699177.402b,20,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 04:48:32.415 HKT,,,16427,,51699177.402b,21,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 05:09:16.361 HKT,,,16427,,51699177.402b,22,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 05:15:10.859 HKT,,,24645,,5169cad4.6045,1,,2013-04-14 05:15:00 HKT,3/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 05:15:12.853 HKT,,,16427,,51699177.402b,23,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 05:41:07.482 HKT,,,16427,,51699177.402b,24,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 05:41:17.571 HKT,,,16427,,51699177.402b,25,,2013-04-14 01:10:15 HKT,1/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 05:41:17.583 HKT,,,25513,,5169d0f3.63a9,1,,2013-04-14 05:41:07 HKT,3/0,0,WARNING,01000,pgstat wait timeout,
2013-04-14 06:17:02.946 HKT,,,16436,,51699179.4034,1,,2013-04-14 01:10:17 HKT,2/4,0,LOG,0,automatic vacuum of table postgres.public.test: index scans: 2 pages: 0 removed, 2654868 remain tuples: 0 removed, 334453396 remain buffer usage: 2501814 hits, 8108302 misses, 3585758 dirtied
Re: [GENERAL] Autovacuum doesn't work if the table has large number of records
On Sat, Apr 13, 2013 at 9:55 AM, Ascot Moss ascot.m...@gmail.com wrote: Current stat of test table: pg_class.reltuples: 3.8415e+08 pg_class.relpages: 1703069 last_autovacuum: null (or blank) last_autoanalyze: 2013-04-13 20:27:12.396048+08 pg_stat_user_tables.n_dead_tup: 3 The autovacuum threshold should be about : 76,830,130 (50 + 3.8415e+08 x 0.2) I expected the autovacuum should be run automatically to clear the dead tuples, however, after over 3 hours, by checking pg_stat_user_tables, the last_autovacuum is still null and n_dead_tup still equals to 3, Every page is going to be both read and dirtied, so with default vacuum_cost_* settings you are going to have 1703069 * (10+20) / 200 = 255,460.35 delays of 0.020 seconds, for 5,109.207 seconds of sleeping. Plus it actually has to do the work, including fsyncing the WAL about once every 32 buffers. So it is going to take a while. Can anyone advise me why the autovacuum is not running or if the autovacuum is running but it is not yet completed? You can check if it is ongoing: select * from pg_stat_activity where query like 'autovacuum%' \x\g\x Cheers, Jeff
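[Editorial note: Jeff's sleep-time estimate can be reproduced as plain arithmetic. The constants are the stock cost settings he names - vacuum_cost_page_miss = 10, vacuum_cost_page_dirty = 20, vacuum_cost_limit = 200, and the 20 ms default autovacuum_vacuum_cost_delay - so this is just his formula restated as a query.]

```sql
-- relpages * (page_miss + page_dirty) cost points, one delay per
-- cost_limit points, each delay sleeping 0.020 seconds:
SELECT 1703069 * (10 + 20) / 200.0         AS cost_delays,       -- 255460.35
       1703069 * (10 + 20) / 200.0 * 0.020 AS seconds_sleeping;  -- 5109.207
```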
Re: [GENERAL] Autovacuum Launcher Process Exception
elli...@cpi.com wrote: Hello, I have been running PostgreSQL for many months and just recently started getting this exception upon start up. Does anyone know what the issue might be? LOG: database system is ready to accept connections LOG: autovacuum launcher started LOG: autovacuum launcher process PID 7060 was terminated by exception 0xC006 HINT: See C include file ntstatus.h for a description of the hexadecimal value. http://www.experts-exchange.com/Programming/System/Windows__Programming/Q_20267274.html (Scroll way down to see the answers to the question) Do you have the binaries on a network filesystem which is sporadically failing? LOG: could not open temporary statistics file global/pgstat.tmp: No such file or directory LOG: all server processes terminated; reinitializing PANIC: could not open control file global/pg_control: No such file or directory So the files in the data directory are suddenly gone? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [GENERAL] autovacuum running for a long time on a new table with 1 row
Lonni J Friedman netll...@gmail.com writes: Running 9.1.3 on Linux-x86_64. I'm seeing autovacuum running for the past 6 hours on a newly created table that only has 1 row of data in it. This table did exist previously, but was dropped and recreated. I'm not sure if that might explain this behavior. When I strace the autovacuum process, I see the following scrolling by non-stop (with no changes to the file referenced): select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) This seems to have been noticed and fixed in HEAD: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b4e0741727685443657b55932da0c06f028fbc00 I wonder whether that should've been back-patched. In the meantime, though, it sure looks like you've got a lot more than one row in there. Perhaps you did umpteen zillion updates on that one row? regards, tom lane
Re: [GENERAL] autovacuum running for a long time on a new table with 1 row
On Fri, Jun 1, 2012 at 10:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: Lonni J Friedman netll...@gmail.com writes: Running 9.1.3 on Linux-x86_64. I'm seeing autovacuum running for the past 6 hours on a newly created table that only has 1 row of data in it. This table did exist previously, but was dropped and recreated. I'm not sure if that might explain this behavior. When I strace the autovacuum process, I see the following scrolling by non-stop (with no changes to the file referenced): select(0, NULL, NULL, NULL, {0, 21000}) = 0 (Timeout) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) open("base/16412/214803_vm", O_RDWR) = -1 ENOENT (No such file or directory) This seems to have been noticed and fixed in HEAD: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b4e0741727685443657b55932da0c06f028fbc00 I wonder whether that should've been back-patched. Thanks for your reply. I won't even pretend to understand what that fix does. Is this behavior something that is blatantly broken, or harmless, or somewhere in between? Should I expect autovacuum to eventually complete successfully when it stumbles into this scenario? In the meantime, though, it sure looks like you've got a lot more than one row in there. Perhaps you did umpteen zillion updates on that one row? Before dropping and recreating the table, yes it had millions of rows, and millions of updates. But since then, all I did was insert a single row, and watched autovacuum wedge itself in that seemingly infinite loop. I ended up doing a 'kill -2' on the autovacuum PID that was misbehaving, disabled autovacuuming the table, and went about what I needed to get done as an interim solution.
Re: [GENERAL] autovacuum running for a long time on a new table with 1 row
Lonni J Friedman netll...@gmail.com writes: On Fri, Jun 1, 2012 at 10:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: This seems to have been noticed and fixed in HEAD: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b4e0741727685443657b55932da0c06f028fbc00 I wonder whether that should've been back-patched. Thanks for your reply. I won't even pretend to understand what that fix does. Is this behavior something that is blatantly broken, or harmless, or somewhere in between? Should I expect autovacuum to eventually complete successfully when it stumbles into this scenario? Well, the problem with the original code was that it would recheck the visibility map's file size anytime somebody tried to check a bit beyond the end of the map. If the map isn't there (which is not an error case) this would result in a useless open() attempt for each table page scanned by vacuum. So ordinarily I would say that yes you could expect autovac to complete eventually. However ... Before dropping and recreating the table, yes it had millions of rows, and millions of updates. But since then, all I did was insert a single row, and watched autovacuum wedge itself in that seemingly infinite loop. I ended up doing a 'kill -2' on the autovacuum PID that was misbehaving, disabled autovacuuming the table, and went about what I needed to get done as an interim solution. ... if you really did drop and recreate the table, then at this point it should only have a single page, I would think. It might be worth checking the actual file size. pg_relation_size('tablename') is probably the quickest way. regards, tom lane
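[Editorial note: Tom's quick check, spelled out; 'tablename' is a placeholder, and pg_size_pretty just makes the byte count readable.]

```sql
-- A freshly recreated one-row table should occupy a single 8 kB page:
SELECT pg_relation_size('tablename')                  AS bytes,
       pg_size_pretty(pg_relation_size('tablename'))  AS pretty;
```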
Re: [GENERAL] autovacuum running for a long time on a new table with 1 row
On Fri, Jun 1, 2012 at 10:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: Lonni J Friedman netll...@gmail.com writes: On Fri, Jun 1, 2012 at 10:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: This seems to have been noticed and fixed in HEAD: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b4e0741727685443657b55932da0c06f028fbc00 I wonder whether that should've been back-patched. Thanks for your reply. I won't even pretend to understand what that fix does. Is this behavior something that is blatantly broken, or harmless, or somewhere in between? Should I expect autovacuum to eventually complete successfully when it stumbles into this scenario? Well, the problem with the original code was that it would recheck the visibility map's file size anytime somebody tried to check a bit beyond the end of the map. If the map isn't there (which is not an error case) this would result in a useless open() attempt for each table page scanned by vacuum. So ordinarily I would say that yes you could expect autovac to complete eventually. However ... Before dropping and recreating the table, yes it had millions of rows, and millions of updates. But since then, all I did was insert a single row, and watched autovacuum wedge itself in that seemingly infinite loop. I ended up doing a 'kill -2' on the autovacuum PID that was misbehaving, disabled autovacuuming the table, and went about what I needed to get done as an interim solution. ... if you really did drop and recreate the table, then at this point it should only have a single page, I would think. It might be worth checking the actual file size. pg_relation_size('tablename') is probably the quickest way. Unfortunately, I've since inserted several million rows into this table, so I'm guessing it's too late now to check the size?
Re: [GENERAL] autovacuum running for a long time on a new table with 1 row
Lonni J Friedman netll...@gmail.com writes: On Fri, Jun 1, 2012 at 10:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: ... if you really did drop and recreate the table, then at this point it should only have a single page, I would think. It might be worth checking the actual file size. pg_relation_size('tablename') is probably the quickest way. Unfortunately, I've since inserted several million rows into this table, so I'm guessing it's too late now to check the size? Yeah, it won't prove much now. If you see something like this happen again, please poke into it a bit more before you damage the evidence... regards, tom lane
Re: [GENERAL] autovacuum and transaction id wraparound
Hi Jens, Thanks for the answer. One more question: Can a manual VACUUM operate on a database where there are long-lasting transactions? In other words, do I need to restart the server or kill long-lasting transactions in order to allow manual VACUUM to clear old XIDs? Regards, Pawel
Re: [GENERAL] autovacuum and transaction id wraparound
On Fri, Mar 9, 2012 at 12:15 AM, pawel_kukawski kukis...@gmail.com wrote: Hi Jens, Thanks for the answer. One more question: Can a manual VACUUM operate on a database where there are long-lasting transactions? In other words, do I need to restart the server or kill long-lasting transactions in order to allow manual VACUUM to clear old XIDs? Long-running transactions block both manual and automatic vacuums. The only difference between manual and automatic vacuuming is the costing parameters, which don't have to be different; i.e., if you set the autovacuum_* costs to the same as the vacuum_* costs there is no difference in operation. The issue you might run into is the one Tom mentioned: if you've got a long-running autovacuum that started with the older, less aggressive settings, you might need to kill it (you can use pg_cancel_backend() or SIGINT from the command line) to get it to start back up with the new costs.
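[Editorial note: the pg_cancel_backend() suggestion above, as a concrete sketch. The column names below are the 9.2+ ones; on the 8.x/9.0/9.1 releases discussed in these threads the columns are procpid and current_query instead of pid and query. The PID in the second statement is an example value.]

```sql
-- Find autovacuum workers and how long they have been running:
SELECT pid, now() - query_start AS runtime, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';

-- Cancel one so it restarts later under the new cost settings:
SELECT pg_cancel_backend(12345);  -- pid taken from the query above
```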
Re: [GENERAL] autovacuum and transaction id wraparound
On Wednesday 07 March 2012 21:13:26 pawel_kukawski wrote: Hi, Do you know any real reason why the autovacuum may fail to clear old XIDs? If it's unable to keep up. Or maybe if there are very long-running idle-in-transaction sessions. Is this highly probable? The postmaster will shut down to prevent wraparound if there are fewer than 1 million transactions left until wraparound. Rgds, Jens
Re: [GENERAL] autovacuum and deadlocks
On Sat, Feb 18, 2012 at 8:59 AM, Gregg Jaskiewicz gryz...@gmail.com wrote: What is the likelihood of a deadlock occurring, caused (or helped by) autovacuum? This is on 8.3. The table with deadlocks was quite busy with updates, etc. The only locking issues (not deadlocks) I've ever had with autovacuum were priority inversion issues with slony, where a slony execute script was stuck behind an autovacuum, and something else was then stuck behind the slony execute script. I'm afraid we'll need more info to determine what was happening. It's a good idea to make a snapshot of what's in pg_stat_activity and pg_locks when these things happen so you have something to look at after the fact.
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
Lonni J Friedman netll...@gmail.com writes: When I strace PID 30188, I see tons of this scrolling past quickly, but I'm not really sure what it means beyond a 'Timeout' not looking good: select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout) lseek(95, 753901568, SEEK_SET) = 753901568 read(95, \202\1\0\0\260\315\250\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1..., 8192) = 8192 lseek(95, 753917952, SEEK_SET) = 753917952 read(95, \202\1\0\0 N\253\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1..., 8192) = 8192 select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout) I'm betting the selects are implementing vacuum_cost_delay, and that the reason this is taking forever is that you have that cranked up to an unreasonably high value. There's no evidence of looping in what you showed us, because the seek addresses are changing. regards, tom lane
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
On Tue, Nov 22, 2011 at 6:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: Lonni J Friedman netll...@gmail.com writes: When I strace PID 30188, I see tons of this scrolling past quickly, but I'm not really sure what it means beyond a 'Timeout' not looking good: select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout) lseek(95, 753901568, SEEK_SET) = 753901568 read(95, \202\1\0\0\260\315\250\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1..., 8192) = 8192 lseek(95, 753917952, SEEK_SET) = 753917952 read(95, \202\1\0\0 N\253\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1..., 8192) = 8192 select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout) I'm betting the selects are implementing vacuum_cost_delay, and that the reason this is taking forever is that you have that cranked up to an unreasonably high value. There's no evidence of looping in what you showed us, because the seek addresses are changing. Thanks for your prompt reply. I was pretty sure that I was using the default, but just to confirm, I just ran: 'SHOW vacuum_cost_delay;' and it returned 0. Any other suggestions? Is there no way to safely kill off this autovacuum process that doesn't involve shutting down and restarting the database? thanks
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
Lonni J Friedman netll...@gmail.com writes: Thanks for your prompt reply. I was pretty sure that I was using the default, but just to confirm, I just ran: 'SHOW vacuum_cost_delay;' What about autovacuum_vacuum_cost_delay? The selects seem to be delaying for 32msec, which is not the default for anything. Is there no way to safely kill off this autovacuum process that doesn't involve shutting down restarting the database? Sending it a SIGINT ought to be safe enough, though I don't think that is necessarily advisable, because the next autovacuum will probably take just as long. Killing this one will mainly mean you've wasted (much of) the work it did so far. Before getting hasty I'd suggest identifying what table (or index) it's working on --- lsof on the process to see what FD 95 is connected to would be the first step. I'm thinking it's an index since the seek addresses don't seem to be consecutive. And it might be worth watching the seek addresses for awhile to see if you can prove that it's looping --- if it is, that might be an indication of a corrupt index. If it isn't, but is just steadily working through the index, you'd be best advised to have patience. regards, tom lane
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
Hi, On 23 November 2011 13:20, Lonni J Friedman netll...@gmail.com wrote: I investigated, and found that for the past ~18 hours, there's one autovacuum process that has been running, and not making any obvious progress: snip... I'm using the defaults for all the *vacuum* options in postgresql.conf, except for: log_autovacuum_min_duration = 2500 Defaults are: autovacuum_vacuum_cost_delay = 20 msec autovacuum_vacuum_cost_limit = -1 (ie vacuum_cost_limit is used) vacuum_cost_limit = 200 If the table is busy -- many updates and deletes -- then autovacuum exhausts its cost limit almost immediately. You can try to set autovacuum_vacuum_cost_delay to 0 (which disables the cost-based delay for autovacuum), but you don't want to saturate your disks. The other option is to increase vacuum_cost_limit/autovacuum_vacuum_cost_limit. -- Ondrej Ivanic (ondrej.iva...@gmail.com)
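[Editorial note: the two knobs above can also be applied per table rather than globally; autovacuum_* storage parameters on ALTER TABLE exist since 8.4, and the table name here is a placeholder.]

```sql
-- Let autovacuum run this one hot table at full speed:
ALTER TABLE busy_table SET (autovacuum_vacuum_cost_delay = 0);

-- Or keep throttling, but let each cycle do 5x more work
-- before sleeping:
ALTER TABLE busy_table SET (autovacuum_vacuum_cost_limit = 1000);
```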
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
On Tue, Nov 22, 2011 at 7:19 PM, Tom Lane t...@sss.pgh.pa.us wrote: Lonni J Friedman netll...@gmail.com writes: Thanks for your prompt reply. I was pretty sure that I was using the default, but just to confirm, I just ran: 'SHOW vacuum_cost_delay;' What about autovacuum_vacuum_cost_delay? The selects seem to be delaying for 32msec, which is not the default for anything. 20ms is what it returns. I grepped for '32' in postgresql.conf and other than the '5432' port, nothing was returned. I'm using the official postgres RPMs that shipped with Fedora15-x86_64, so unless something weird got compiled in, I have no idea where that 32ms is coming from. Is there no way to safely kill off this autovacuum process that doesn't involve shutting down and restarting the database? Sending it a SIGINT ought to be safe enough, though I don't think that is necessarily advisable, because the next autovacuum will probably take just as long. Killing this one will mainly mean you've wasted (much of) the work it did so far. ok, then I guess I'll wait longer. Before getting hasty I'd suggest identifying what table (or index) it's working on --- lsof on the process to see what FD 95 is connected to would be the first step. I'm thinking it's an index since the seek addresses don't seem to be consecutive. And it might be worth watching the seek addresses for awhile to see if you can prove that it's looping --- if it is, that might be an indication of a corrupt index. If it isn't, but is just steadily working through the index, you'd be best advised to have patience. I suspect you're right. I just ran strace against that PID again, and now all the lseek/read FDs are referring to a different number (115), so that means it's moved on to something new since I looked a few hours ago? Anyway, I think this is what you were referring to: /proc/30188/fd/115 - /var/lib/pgsql/data/base/64793/72633.10 How do I correlate that file to an actual database object?
thanks
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
Lonni J Friedman netll...@gmail.com writes: I suspect you're right. I just ran strace against that PID again, and now all the lseek/read FDs are referring to a different number (115), so that means it's moved on to something new since I looked a few hours ago? Anyway, I think this is what you were referring to: /proc/30188/fd/115 - /var/lib/pgsql/data/base/64793/72633.10 How do I correlate that file to an actual database object? 64793 is the pg_database.oid of the database, and 72633 is the pg_class.relfilenode value of the table/index. regards, tom lane
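[Editorial note: Tom's two lookups as runnable queries, using the OIDs from this thread; on 9.0 and later, pg_filenode_relation() collapses the second step into one call.]

```sql
-- Which database owns base/64793/... ?
SELECT datname FROM pg_database WHERE oid = 64793;

-- Connected to that database, which relation is file 72633?
SELECT relname, relkind FROM pg_class WHERE relfilenode = 72633;

-- Shortcut (9.0+); the first argument 0 means the default tablespace:
SELECT pg_filenode_relation(0, 72633);
```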
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
On Tue, Nov 22, 2011 at 7:49 PM, Tom Lane t...@sss.pgh.pa.us wrote: Lonni J Friedman netll...@gmail.com writes: Anyway, I think this is what you were referring to: /proc/30188/fd/115 -> /var/lib/pgsql/data/base/64793/72633.10 How do I correlate that file to an actual database object? 64793 is the pg_database.oid of the database, and 72633 is the pg_class.relfilenode value of the table/index. It's definitely an index. Thanks for your help, I just need to be patient now that I understand how to better monitor this.
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
On Tue, Nov 22, 2011 at 11:00 PM, Lonni J Friedman netll...@gmail.com wrote: It's definitely an index. Thanks for your help, I just need to be patient now that I understand how to better monitor this. Well, it sounds like you have things set up for both a cost limit and a cost delay, which means if you manually vacuumed the thing, it would probably go quicker, at the cost of more i/o, but given the cpu overhead, probably a trade worth making. Personally I'd throw out those vacuum cost settings entirely as they cause more trouble than they're worth (IMNSHO), and you'll likely see this again in the future. Robert Treat conjecture: xzilla.net consulting: omniti.com
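[Editorial note: Robert's suggestion of a manual vacuum without cost-based throttling can be sketched as below; the table name is hypothetical.]

```sql
-- Disable cost-based vacuum throttling for this session only, then
-- vacuum the table at full speed: more I/O load, far less wall time.
SET vacuum_cost_delay = 0;
VACUUM VERBOSE some_bloated_table;  -- hypothetical table name
```

The SET affects only the current session, so the global throttling settings still apply to autovacuum workers.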
Re: [GENERAL] autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
On Tue, Nov 22, 2011 at 10:51 PM, Robert Treat r...@xzilla.net wrote: Personally I'd throw out those vacuum cost settings entirely as they cause more trouble than they're worth (IMNSHO), and you'll likely see this again in the future. I'd keep an eye on iostat -xd 10 output when playing with vacuum settings.
Re: [GENERAL] autovacuum ignores some tables
2011/6/23 Gábor Farkas ga...@nekomancer.net: hi, postgresql 8.4.7 here. i checked the pg_stat_user_tables table, and it has a lot of rows there where the last_autovacuum and/or last_autoanalyze are null. does this mean that autovacuum never worked on those tables? roughly 70% of all the tables have null in those fields.. in those never-autovacuumed tables there are tables that are quite big, and also have a lot of activity, so it's not that they never needed vacuuming... i wonder why autovacuum ignored them. i checked my settings with SHOW ALL in psql, and the corresponding settings are:

autovacuum on
autovacuum_analyze_scale_factor 0.1
autovacuum_analyze_threshold 50
autovacuum_freeze_max_age 2
autovacuum_max_workers 3
autovacuum_naptime 1min
autovacuum_vacuum_cost_delay 20ms
autovacuum_vacuum_cost_limit -1
autovacuum_vacuum_scale_factor 0.2
autovacuum_vacuum_threshold 50
track_counts on

any ideas why autovacuum ignores some of the tables? The table may not have had enough updates or deletes to trigger a vacuum. Are these insert-only tables? When you look at pg_stat_user_tables, check the n_tup_upd and n_tup_del columns. If autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor * rows in the table) > n_dead_tup in pg_stat_user_tables, then the table should be autovacuum'd. If it hasn't yet reached this number, it won't yet be a candidate. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [GENERAL] autovacuum ignores some tables
2011/6/23 Thom Brown t...@linux.com: If autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor * rows in the table) > n_dead_tup in pg_stat_user_tables, then the table should be autovacuum'd. If it hasn't yet reached this number, it won't yet be a candidate. thanks for the explanation, now i understand. just to clarify: you probably meant the opposite, correct? when n_dead_tup is MORE than the threshold... gabor
Re: [GENERAL] autovacuum ignores some tables
2011/6/23 Gábor Farkas ga...@nekomancer.net: thanks for the explanation, now i understand. just to clarify: you probably meant the opposite, correct? when n_dead_tup is MORE than the threshold... Erk, yes, switch the > to a <. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
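[Editorial note: with the inequality the right way round, the trigger condition is easy to check mechanically. A small sketch; the table statistics below are made-up numbers, and the defaults are the ones quoted earlier in the thread.]

```python
# Autovacuum triggers a VACUUM on a table once:
#   n_dead_tup > autovacuum_vacuum_threshold
#                + autovacuum_vacuum_scale_factor * reltuples

def needs_autovacuum(n_dead_tup, reltuples,
                     threshold=50, scale_factor=0.2):
    """True if the table is a candidate for autovacuum."""
    return n_dead_tup > threshold + scale_factor * reltuples

# A 1,000,000-row table needs more than 200,050 dead tuples:
print(needs_autovacuum(150_000, 1_000_000))  # False: below the bar
print(needs_autovacuum(250_000, 1_000_000))  # True: vacuum is due
```

This also shows why an insert-only table never gets vacuumed under these rules: inserts alone produce no dead tuples, so n_dead_tup stays near zero.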
Re: [GENERAL] autovacuum issue after upgrade to 9.0.1
We are still seeing the spike in vacuums every 8 days, even after upgrading to 9.0.3. Any suggestions on how to spread them out? Thanks, George Woodring On Wed, Mar 16, 2011 at 7:12 PM, Aleksey Tsalolikhin atsaloli.t...@gmail.com wrote: Do you see this issue on 9.0.3, the current and the recommended 9.x version? -- iGLASS Networks www.iglass.net
Re: [GENERAL] Autovacuum firing up during my manual vacuum on same table
On 3 April 2011, Joshua D. Drake wrote: On Sat, 2 Apr 2011 19:26:56 +0200, Henry C. he...@cityweb.co.za wrote: On Sat, April 2, 2011 14:17, Jens Wilke wrote: Nevertheless since at least 8.4 IMO there's no need to bother with manual vacuum any more. Uhh, this is entirely untrue. There are plenty of cases where 8.4 autovacuum can't cut it. Which cases? Isn't it more likely that something else went suboptimal before you started thinking about manual vacuum? Maybe I'd better have written that since 8.4 there's the opportunity not to bother with manual vacuum any more. Regards, Jens
Re: [GENERAL] Autovacuum firing up during my manual vacuum on same table
On Sat, April 2, 2011 22:30, Tom Lane wrote: Have you tried upping the aggressiveness of autovacuum? I'm wondering about poor selection of the cost_delay settings in particular. It's quite easy to slow autovacuum to the point that it takes forever to do anything. It's been on the default 20ms. Now giving 0 a try. In our app responsiveness is less of a concern since we don't have human interaction. Reliability is a greater concern. It's also possible that Henry is getting bit by the bug fixed here: Author: Tom Lane t...@sss.pgh.pa.us Branch: master [b58c25055] 2010-11-19 22:28:20 -0500 Branch: REL9_0_STABLE Release: REL9_0_2 [b5efc0940] 2010-11-19 22:28:25 -0500 Branch: REL8_4_STABLE Release: REL8_4_6 [fab2af30d] 2010-11-19 22:28:30 -0500 Branch: REL8_3_STABLE Release: REL8_3_13 [6cb9d5113] 2010-11-19 22:28:35 -0500 Fix leakage of cost_limit when multiple autovacuum workers are active. I'm using 9.0.3, and typically (when things eventually deteriorate to an impending-wraparound situation) there are at least 2 and sometimes a few more autovac procs running, some of them for weeks. Anyway, time will now tell whether a cost_delay of 0 and some more SSDs will help prevent hitting the wraparound wall. Cheers h
Re: [GENERAL] Autovacuum firing up during my manual vacuum on same table
On Sat, 2 Apr 2011 19:26:56 +0200, Henry C. he...@cityweb.co.za wrote: On Sat, April 2, 2011 14:17, Jens Wilke wrote: Nevertheless since at least 8.4 IMO there's no need to bother with manual vacuum any more. Uhh, this is entirely untrue. There are plenty of cases where 8.4 autovacuum can't cut it. Sadly, in my case, the db is so busy that autovac processes run for weeks and never catch up (insufficient h/w for the app quite frankly - the addition of You can disable autovacuum for the relations that you are manually vacuuming. If pre-8.4, see pg_autovacuum; if 8.4 or later, see ALTER TABLE. JD -- PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
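[Editorial note: the "see alter table" above refers to per-table storage parameters, available from 8.4 on. A sketch with a hypothetical table name:]

```sql
-- 8.4+: turn autovacuum off for one table while vacuuming it manually
ALTER TABLE some_table SET (autovacuum_enabled = false);
-- ... run the manual VACUUM here ...
ALTER TABLE some_table SET (autovacuum_enabled = true);
```

Note this does not stop anti-wraparound (freeze) vacuums, which run regardless of the per-table setting.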
Re: [GENERAL] Autovacuum firing up during my manual vacuum on same table
On Sun, Apr 3, 2011 at 2:39 PM, Joshua D. Drake j...@commandprompt.com wrote: On Sat, 2 Apr 2011 19:26:56 +0200, Henry C. he...@cityweb.co.za wrote: On Sat, April 2, 2011 14:17, Jens Wilke wrote: Nevertheless since at least 8.4 IMO there's no need to bother with manual vacuum any more. Uhh, this is entirely untrue. There are plenty of cases where 8.4 autovacuum can't cut it. This is doubly true if you haven't adjusted any of the default costing parameters of autovacuum. It is not set to the level of aggressiveness needed for big servers under high load, since doing so would slow down / swamp smaller servers.
Re: [GENERAL] Autovacuum firing up during my manual vacuum on same table
Anyway, is that autovac duplicating work or locked out and waiting? Impolitely responding to my own post: a quick strace confirms the autovac process is indeed locked out and waiting its turn to work. Presumably when my manual vacuum finishes, it will then proceed and *hopefully* not re-vacuum the table? Regards Henry
Re: [GENERAL] Autovacuum firing up during my manual vacuum on same table
On Saturday, 2 April 2011, Henry C. wrote: I just noticed something odd: I'm busy with a manual vacuum on a table and an autovacuum keeps firing up as well. Usually a manual vacuum cancels a running autovacuum task. You should find a notice about the cancellation in the logfile. current_query | vacuum analyze age | 11:46:57.245568 Where is the age column from? It's not in pg_stat_activity. Is one of the two processes waiting=t in pg_stat_activity? Sure it's the same table? Do you have one table named page_citation_text in public and one in another schema? I'm trying to vacuum the table as quickly as possible so a manual vacuum seems to be in order as my understanding (and experience) is that the autovac is a hell of a lot slower to mitigate impact on general performance. Tune the autovacuum settings, especially the scale factors of the tables in question and the cost limit. Is autovacuum_vacuum_cost_limit != -1 ? This could be one reason why manual vacuum is faster. Nevertheless since at least 8.4 IMO there's no need to bother with manual vacuum any more. Regards, Jens
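[Editorial note: the per-table tuning mentioned above can be done with storage parameters (8.4+). The table name is the one from this thread; the values are illustrative, not recommendations.]

```sql
-- Make autovacuum fire earlier and work harder on one hot table
ALTER TABLE page_citation_text SET (
    autovacuum_vacuum_scale_factor = 0.02,  -- global default is 0.2
    autovacuum_vacuum_cost_limit   = 1000   -- default -1 = use global
);
```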
Re: [GENERAL] Autovacuum firing up during my manual vacuum on same table
Original message: Date: Sat, 2 Apr 2011 14:17:37 +0200 From: Jens Wilke j...@wilke.org To: pgsql-general@postgresql.org Subject: Re: [GENERAL] Autovacuum firing up during my manual vacuum on same table Nevertheless since at least 8.4 IMO there's no need to bother with manual vacuum any more. Regards, Jens a bit off-topic, but: i'm using pg 9 and always getting messages to vacuum tables in pgadmin. according to your post this shouldn't be the case? cheers sven