Re: [GENERAL] Table Vacuum Taking a Long Time

2014-04-02 Thread Hannes Erven

Hi Eliot,


> If I continue to run vacuum
> analyze on it, it continues to take about 90 seconds each time.
> [..]
> I was under the impression that if nothing had been done to the table
> since it was last vacuumed, that it would return immediately.

in addition to what others already said, I'd expect that by specifying 
ANALYZE you are forcing PostgreSQL to do quite expensive additional work.


Is there a reason why you cannot (or do not want to) rely on 
autovaccuum? Have you tried tuning the parameters?


Even if you must use manual VACUUMs, you can probably get away without 
ANALZE.



Best regards,

-hannes


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table Vacuum Taking a Long Time

2014-04-02 Thread Alvaro Herrera
Eliot Gable wrote:
> I have a table which is about 12 GB in size. It has had a vacuum full
> analyze run on it, and then immediately after, I run vacuum analyze and it
> takes about 90 seconds to complete. If I continue to run vacuum analyze on
> it, it continues to take about 90 seconds each time. This system has a
> single 7200 RPM drive in it, so it's not a very fast drive. I was under the
> impression that if nothing had been done to the table since it was last
> vacuumed, that it would return immediately. Further, this is an append-only
> table, so why should it need to be vacuumed at all? We ran into cases where
> after writing to it long enough, the PGSQL autovacuum process would kick in
> and force a vacuum saying something about preventing wrap around. I don't
> understand why it would do this if it is append-only and we are using
> 64-bit sequences as IDs without OIDs turned on. What would be wrapping
> around without a vacuum? We tried to mitigate this by manually running
> vacuum programmatically, but then we end up using all the disk IO just
> running vacuum all the time, because it is constantly running through the
> entire table even though very little (if anything) has been done to it
> since the last vacuum.

Vacuuming a table requires first scanning the heap, then scanning each
indexes.  The heap scan can be optimized using the visibility map: pages
can be skipped if they are marked all-visible.  Vacuum will mark pages
all-visible, so a second vacuum immediately thereafter would need to
read few pages (and probably modify even fewer).  But the index scans
cannot be optimized in that fashion; indexes must be scanned completely
each time.  I would bet that that's where the time is going.  Also note
that not all pages can be marked all-visible, and that pages are only
skipped if there are enough of them consecutive that it's worth when
considering disk readahead done by the operating system.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table Vacuum Taking a Long Time

2014-04-02 Thread François Beausoleil
Hi!

Le 2014-04-02 à 10:26, Eliot Gable a écrit :

> I have a table which is about 12 GB in size. It has had a vacuum full analyze 
> run on it, and then immediately after, I run vacuum analyze and it takes 
> about 90 seconds to complete. If I continue to run vacuum analyze on it, it 
> continues to take about 90 seconds each time. This system has a single 7200 
> RPM drive in it, so it's not a very fast drive. I was under the impression 
> that if nothing had been done to the table since it was last vacuumed, that 
> it would return immediately. Further, this is an append-only table, so why 
> should it need to be vacuumed at all? We ran into cases where after writing 
> to it long enough, the PGSQL autovacuum process would kick in and force a 
> vacuum saying something about preventing wrap around. I don't understand why 
> it would do this if it is append-only and we are using 64-bit sequences as 
> IDs without OIDs turned on. What would be wrapping around without a vacuum? 
> We tried to mitigate this by manually running vacuum programmatically, but 
> then we end up using all the disk IO just running vacuum all the time, 
> because it is constantly running through the entire table even though very 
> little (if anything) has been done to it since the last vacuum.
> 
> Is this described behavior expected? If so, why?

You don't mention the version of PostgreSQL, but let me link you to this page 
in the manual:

23.1.5. Preventing Transaction ID Wraparound Failures

"""But since transaction IDs have limited size (32 bits) a cluster that runs 
for a long time (more than 4 billion transactions) would suffer transaction ID 
wraparound"""

http://www.postgresql.org/docs/9.3/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

This is what the message in the log said.

Because the table is append only, old rows must receive the transaction ID 
which is guaranteed to be lower than all other transactions: FrozenXID.

If you run two vacuums back to back, presumably the table is mostly in RAM and 
returns more quickly the next time around, hence the very different runtimes.

You say "if nothing had been done to the table": do you have a single very 
large append-only table in your cluster? If so, it might be better to split the 
table in multiple partitions, and then when a partition is untouched, the 
vacuum daemon should not touch the table.

Please provide your exact PostgreSQL version, RAM, disk and other details, 
relevant postgresql.conf parameters so that we may help more.

Cheers!
François Beausoleil
Seevibes



smime.p7s
Description: S/MIME cryptographic signature


[GENERAL] Table Vacuum Taking a Long Time

2014-04-02 Thread Eliot Gable
I have a table which is about 12 GB in size. It has had a vacuum full
analyze run on it, and then immediately after, I run vacuum analyze and it
takes about 90 seconds to complete. If I continue to run vacuum analyze on
it, it continues to take about 90 seconds each time. This system has a
single 7200 RPM drive in it, so it's not a very fast drive. I was under the
impression that if nothing had been done to the table since it was last
vacuumed, that it would return immediately. Further, this is an append-only
table, so why should it need to be vacuumed at all? We ran into cases where
after writing to it long enough, the PGSQL autovacuum process would kick in
and force a vacuum saying something about preventing wrap around. I don't
understand why it would do this if it is append-only and we are using
64-bit sequences as IDs without OIDs turned on. What would be wrapping
around without a vacuum? We tried to mitigate this by manually running
vacuum programmatically, but then we end up using all the disk IO just
running vacuum all the time, because it is constantly running through the
entire table even though very little (if anything) has been done to it
since the last vacuum.

Is this described behavior expected? If so, why?

If it is not expected, what should I be looking for which might explain why
it is taking so long?

Thanks.

-- 
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be
a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live;
not live to eat.) ~Marcus Tullius Cicero