[GENERAL] Adding more space, and a vacuum question.

2011-01-28 Thread Herouth Maoz
Hello. We have two problems (which may actually be related...)

1. We are running at over 90% capacity of the disk at one of the servers - a 
report/data warehouse system. We have ran out of disk space several times. Now 
we need to make some file-archived data available on the database to support 
our legal team. This means two huge tables to be added to the database. The 
only solution that I see is to add more space by means of another tablespace. 
The two tables are static - after loading them and creating indexes they will 
not be changed.

The machine has no additional room for internal disks. It is a recent purchase 
and not likely to be replaced any time soon. Now, my position is that the best 
solution would be to add an external hard disk, via USB/firewire, and use it 
for the archive tables. My sysadmin, on the other hand, wants to mount a 
storage machine remotely and use it for the extra tablespace, as the storage 
machine is a more reliable hardware. I think that remote mounted volumes are 
not a proper device for a database, as the network is subject to load and I've 
ran into frozen mounts in both NFS and SMB in the past. Never mind being slower.

Which solution would you advise and which one of us is right?

2. That database has a few really huge tables. I think they are not being 
automatically vacuumed properly. In the past few days I've noticed a vacuum 
process on one of them which has been running since January 14th. 
Unfortunately, it never finished, because we were informed of a scheduled power 
down in our building yesterday, and had to shut down the machine. The questions 
are:

a. Is it normal for vacuum processes to take two weeks?
b. What happens if the vacuum process is stopped? Are the tuples partially 
recovered, or are they only recovered if the process completes properly?
c. Is there anything I can do to make vacuums shorter?
d. After restarting the server, all the data in pg_stat_user_tables seem to 
have been reset. What does this mean and how does this affect vacuum scheduling?

Thank you in advance,
Herouth


Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-29 Thread Craig Ringer

On 01/29/2011 05:12 AM, Herouth Maoz wrote:


The machine has no additional room for internal disks. It is a recent
purchase and not likely to be replaced any time soon.


Newly acquired or not, it sounds like it isn't sized correctly for the 
load and needs an upgrade if it can't be shifted into a more suitable 
role and replaced.



Now, my position
is that the best solution would be to add an external hard disk, via
USB/firewire


eSATA? Via a PCI or PCIe add-in SATA controller if there's no existing 
eSATA.


FireWire is usable for a database. USB is too ... kind of. Performance 
will be poor because of the high latency, CPU-heavy non-DMA access done 
by the USB stack.


For something read-only, that might be OK.


and use it for the archive tables. My sysadmin, on the
other hand, wants to mount a storage machine remotely and use it for the
extra tablespace, as the storage machine is a more reliable hardware.


If you have iSCSI or ATA-over-Ethernet disk volumes you can mount, that 
might be a good idea. I'd personally avoid NFS or SMB.


That said, again if it's read-only you might be fine.


think that remote mounted volumes are not a proper device for a
database, as the network is subject to load and I've ran into frozen
mounts in both NFS and SMB in the past. Never mind being slower.


Remote *file* *level* shares aren't great for databases, IMO.


a. Is it normal for vacuum processes to take two weeks?


Define "really huge" and describe the hardware; without numbers it's 
hard to know. What version of Pg are you using?


Was it a standalone VACUUM or was it an autovacuum worker?


b. What happens if the vacuum process is stopped? Are the tuples
partially recovered, or are they only recovered if the process completes
properly?


I *think* tuples become available progressively, but I'm not certain of 
that.



c. Is there anything I can do to make vacuums shorter?


Do it much more often.

Use Pg 8.4 or later, with visibility map.

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-29 Thread Herouth Maoz

בתאריך 29/01/11 13:57, ציטוט Craig Ringer:

On 01/29/2011 05:12 AM, Herouth Maoz wrote:


The machine has no additional room for internal disks. It is a recent
purchase and not likely to be replaced any time soon.


Newly acquired or not, it sounds like it isn't sized correctly for the 
load and needs an upgrade if it can't be shifted into a more suitable 
role and replaced.

Sigh. Budget considerations, you know.



Now, my position
is that the best solution would be to add an external hard disk, via
USB/firewire


eSATA? Via a PCI or PCIe add-in SATA controller if there's no existing 
eSATA.
Oh, yes, I forgot about eSATA. I meant basically a real local connection 
rather than network one.




FireWire is usable for a database. USB is too ... kind of. Performance 
will be poor because of the high latency, CPU-heavy non-DMA access 
done by the USB stack.


For something read-only, that might be OK.


and use it for the archive tables. My sysadmin, on the
other hand, wants to mount a storage machine remotely and use it for the
extra tablespace, as the storage machine is a more reliable hardware.


If you have iSCSI or ATA-over-Ethernet disk volumes you can mount, 
that might be a good idea. I'd personally avoid NFS or SMB.


OK.


That said, again if it's read-only you might be fine.
Question is - if the read-only tablespace gets stuck/frozen, what 
happens to the read-write part of the database, which is absolutely 
essential to have in good responsive working order?

a. Is it normal for vacuum processes to take two weeks?


Define "really huge" and describe the hardware; without numbers it's 
hard to know. What version of Pg are you using?
Pg 8.3.11. The tables have more than 200,000,000 records. About the 
hardware, I'm not entirely in the loop, but it has two dual-core Intel 
Xeon 5130 CPUs, 4G of memory, and its system disk (111G) is separate 
from the database disk (825G). The disks are hardware RAID, but I'm not 
sure which level, and I think they are 10,000 RPM but I could be wrong.


Was it a standalone VACUUM or was it an autovacuum worker?

Autovacuum worker.

TIA,
Herouth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-30 Thread Herouth Maoz

On 30/01/2011, at 13:03, Alban Hertroys wrote:

> On 28 Jan 2011, at 22:12, Herouth Maoz wrote:
> 
>> 2. That database has a few really huge tables. I think they are not being 
>> automatically vacuumed properly. In the past few days I've noticed a vacuum 
>> process on one of them which has been running since January 14th. 
>> Unfortunately, it never finished, because we were informed of a scheduled 
>> power down in our building yesterday, and had to shut down the machine. The 
>> questions are:
>> 
>> a. Is it normal for vacuum processes to take two weeks?
> 
> For a 200M record table that's definitely on the long side. It was probably 
> waiting on a lock by another transaction. In most cases that means that some 
> transaction was kept open for that duration.
> If that transaction came into existence by accident, then vacuum should be 
> fine now that the server has restarted - that transaction is gone now. You 
> may want to keep an eye out for long-running transactions though, that's 
> usually a programming error - it's sometimes done deliberately, but it's 
> still a bad idea from the point of the database.

Unless my eyes were deceiving me, this was not the case. Sure, there have been 
heavy transactions during that time (e.g. the daily backup of the database, and 
the daily inserts into other tables, which take a long time, and a few selects 
which I haven't been able to find an optimal index for). But this is the query 
I use to see these processes (ran from a superuser):

SELECT usename, procpid, query_start, client_addr, client_port, 
current_query,waiting
FROM pg_stat_activity
WHERE query_start < now() - interval '3 seconds'
AND xact_start is not null order by xact_start

Any long transactions should be caught by it, but most of the time, all I see 
are vacuum workers.

By the way, the auto vacuum on that table has started again - but only after 
more records were deleted from it. It has now been running since yesterday at 
17:00. Here is the pg_stat_user_tables record for this table (which has also 
updated after the deletes):

relid| 17806
schemaname   | sms
relname  | billing__archive
seq_scan | 9
seq_tup_read | 2053780855
idx_scan | 2553
idx_tup_fetch| 8052678
n_tup_ins| 11437874
n_tup_upd| 0
n_tup_del| 7987450
n_tup_hot_upd| 0
n_live_tup   | 218890768
n_dead_tup   | 33710378
last_vacuum  | 
last_autovacuum  | 
last_analyze | 2011-01-29 15:29:37.059176+02
last_autoanalyze | 

> 
> In older PG versions autovacuum could get stuck like that on large tables. It 
> keeps starting over trying to vacuum that same table, but never reaches the 
> end of it. Since it's only a single worker process (in those versions), it 
> also will never vacuum any tables beyond the table it got stuck on.

How old? Mine is 8.3.11.

> 
> If you don't delete or update tuples a lot, then the tables are probably just 
> that big. If you do delete/update them regularly, try if a normal vacuum will 
> shrink them enough (probably not) and if not, schedule a VACUUM FULL and a 
> REINDEX at some time the database isn't too busy. Both are quite heavy 
> operations that take exclusive locks on things (tables, indices).

Yes, I do delete many tuples from that table. My mode of usage is like this: I 
have a small table called billing which receives new data every night. I want 
to keep that table small so that those nightly updates don't take an overly 
long time, because all data (several such tables) has to be ready in the 
database by the next morning. Therefore, once a week on the weekend, I move a 
week's worth of data to billing__archive (the table we are discussing), and 
delete a week's worth from its end. Now, the indexes on that table would make 
this impossible to do within the weekend, so what I do is drop all the indexes 
before I do the inserts, and then recreate them, and then do the deletes.

What you are saying is that in this mode of operation, there's basically no 
hope that autovacuum will ever salvage the deleted records? Does removing and 
recreating the indexes have any effect on the vacuuming process?

If a vacuum takes me several days (let alone over a week!) than a VACUUM FULL 
is out of the question. VACUUM FULL locks the table completely and that table 
is essential to our customer care. If push comes to shove, I think I'd rather 
dump that table, drop it, and restore it over the weekend, which I believe will 
be faster than a VACUUM FULL.

One other important question: a tuple marked by VACUUM as reusable (not VACUUM 
FULL which restores it to the operating system) - can its space ever be used by 
another table, or can it only be used for new inserts into the same table?

> 
>> d. After restarting the server, all the data in pg_stat_user_tables seem to 
>> have been reset. What does this mean and how does this affect vacuum 
>> scheduling?
> 
> I recall reading somewhere that that's normal; probably this

Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-30 Thread Andrew Sullivan
On Sun, Jan 30, 2011 at 04:56:29PM +0200, Herouth Maoz wrote:
> 
> Unless my eyes were deceiving me, this was not the case. Sure, there have 
> been heavy transactions during that time (e.g. the daily backup of the 
> database, and the daily inserts into other tables, which take a long time, 
> and a few selects which I haven't been able to find an optimal index for). 
> But this is the query I use to see these processes (ran from a superuser):
> 
> SELECT usename, procpid, query_start, client_addr, client_port, 
> current_query,waiting
> FROM pg_stat_activity
> WHERE query_start < now() - interval '3 seconds'
> AND xact_start is not null order by xact_start
> 
> Any long transactions should be caught by it, but most of the time, all I see 
> are vacuum workers.

Well, what's your I/O on the disk?  Have you tuned vacuum?  Maybe
you're just saturating the ability of the table to be vacuumed, or
else vacuum is being told to back off?

> Yes, I do delete many tuples from that table. My mode of usage is
> like this: I have a small table called billing which receives new
> data every night. I want to keep that table small so that those
> nightly updates don't take an overly long time, because all data
> (several such tables) has to be ready in the database by the next
> morning. Therefore, once a week on the weekend, I move a week's
> worth of data to billing__archive (the table we are discussing), and
> delete a week's worth from its end. Now, the indexes on that table
> would make this impossible to do within the weekend, so what I do is
> drop all the indexes before I do the inserts, and then recreate
> them, and then do the deletes.

Without looking at the details of your database, I have to say that
the above sounds to me like more work than letting the system handle
this itself.  I have a suspicion that what you really want to do is
trickle out the changes rather than trying to do things in big batches
this way.
 
> If a vacuum takes me several days (let alone over a week!) than a
> VACUUM FULL is out of the question. VACUUM FULL locks the table
> completely and that table is essential to our customer care. If push
> comes to shove, I think I'd rather dump that table, drop it, and
> restore it over the weekend, which I believe will be faster than a
> VACUUM FULL.

Yes, I think so too.  And I bet at the current state of affairs,
that's a good bet.  Whatever the situation, I suspect things are too
bad off to be worth trying to get through a vacuum with.
 
> One other important question: a tuple marked by VACUUM as reusable
> (not VACUUM FULL which restores it to the operating system) - can
> its space ever be used by another table, or can it only be used for
> new inserts into the same table?

It's managed by postgres, but given your churn rate on these tables
I'd be tempted to set a fillfactor with a lot of room, and let the
tables be "big" (i.e. with a lot of empty space) so that their long
term storage footprint is stable.

A

-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-30 Thread Herouth Maoz

On 30/01/2011, at 12:27, Craig Ringer wrote:
> 
> OK, so you're pre-8.4 , which means you have the max_fsm settings to play 
> with. Have you seen any messages in the logs about the free space map (fsm)? 
> If your install didn't have a big enough fsm to keep track of deleted tuples, 
> you'd face massive table bloat that a regular vacuum couldn't fix.

Ouch. You're absolutely right. There are messages about max_fsm_pages in the 
postgres log. It's currently set to 153600. According to the documentation, I 
can increase it up to 20. Will that even help? How do I find out how many I 
need to set it to?

> 
> You also don't have the visibility map, which means that (auto)vacuum can't 
> skip bits of the tables it knows don't need vacuuming. Your vacuums will be 
> slower.
> 
> Autovacuum improved significantly in both 8.4 and 9.0; consider an upgrade.


I will consider it. Thank you.

Herouth

Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-30 Thread Craig Ringer

On 01/31/2011 12:14 AM, Herouth Maoz wrote:


On 30/01/2011, at 12:27, Craig Ringer wrote:


OK, so you're pre-8.4 , which means you have the max_fsm settings to
play with. Have you seen any messages in the logs about the free space
map (fsm)? If your install didn't have a big enough fsm to keep track
of deleted tuples, you'd face massive table bloat that a regular
vacuum couldn't fix.


Ouch. You're absolutely right. There are messages about max_fsm_pages in
the postgres log. It's currently set to 153600. According to the
documentation, I can increase it up to 20. Will that even help? How
do I find out how many I need to set it to?


I think the logs suggest what to set. I haven't used 8.3 in ages and 
don't remember well.


Increasing it won't help after the fact. You almost certainly have badly 
bloated tables. Fixing that will be interesting in your current 
low-disk-space situation. VACUUM FULL would work - but will exclusively 
lock the table being vacuumed for *ages*, so nothing else can do any 
work, not even reads. CLUSTER will do the same, and while it's much 
faster, to work it requires enough free disk space to store a complete 
copy of the still-valid parts of the table while the bloated original is 
still on disk. You may have to look into some of the lockless fake 
vacuum full approaches.


I think table bloat identification and management is one of the worst 
problems PostgreSQL has remaining. It's too hard, out of the box, to 
discover bloat developing, and it's too disruptive to fix it if and when 
it does happen. The automatic free space map management in 8.4, and the 
ongoing autovacuum improvements, help reduce the chances of bloat 
happening, but it's still a pain to monitor for and a pain to fix when 
it does happen.


For approaches to possibly fixing your problem, see:

http://www.depesz.com/index.php/2010/10/17/reduce-bloat-of-table-without-longexclusive-locks/

http://blog.endpoint.com/2010/09/reducing-bloat-without-locking.html

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-31 Thread Alban Hertroys
On 30 Jan 2011, at 17:14, Herouth Maoz wrote:

> On 30/01/2011, at 12:27, Craig Ringer wrote:
>> 
>> OK, so you're pre-8.4 , which means you have the max_fsm settings to play 
>> with. Have you seen any messages in the logs about the free space map (fsm)? 
>> If your install didn't have a big enough fsm to keep track of deleted 
>> tuples, you'd face massive table bloat that a regular vacuum couldn't fix.
> 
> Ouch. You're absolutely right. There are messages about max_fsm_pages in the 
> postgres log. It's currently set to 153600. According to the documentation, I 
> can increase it up to 20. Will that even help? How do I find out how many 
> I need to set it to?
> 
>> 
>> You also don't have the visibility map, which means that (auto)vacuum can't 
>> skip bits of the tables it knows don't need vacuuming. Your vacuums will be 
>> slower.
>> 
>> Autovacuum improved significantly in both 8.4 and 9.0; consider an upgrade.
> 
> I will consider it. Thank you.

I'm not trying to push an upgrade on you, but if restoring a dump is one of 
your better options to get rid of table bloat anyway, now seems a good time.

What I'd do is:
- Find a test system with enough disk space
- Install PG 8.4 on it
- Create a dump of your database, using this 8.4 version of pg_dump
- Restore it to the test version and check everything's alright
- If it is, pull the plug on that 8.3 database - you'll probably want to create 
a fresh dump here if the system is still in production
- Install an 8.4 and use the previously made dump to restore it
- Don't forget to run analyse right after if it's going to be used right away

You will want to check the release notes. One of the things that sometimes 
bites people is that 8.4 has tighter type-conversion restrictions. Some 
implicit type-casts that used to work don't anymore, unless made explicit.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4d466a8511732033268635!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Adding more space, and a vacuum question.

2011-01-31 Thread Herouth Maoz

On 31/01/2011, at 03:49, Craig Ringer wrote:

> For approaches to possibly fixing your problem, see:
> 
> http://www.depesz.com/index.php/2010/10/17/reduce-bloat-of-table-without-longexclusive-locks/
> 
> http://blog.endpoint.com/2010/09/reducing-bloat-without-locking.html

I'm not quite sure what this will do for me. How will Postgresql know that 
there is free space in low-numbered pages, if the bloat has not been vacuumed 
off and is not in the fsm?
--
חרות מעוז
יוניסל פתרונות סלולריים מתקדמים
☎ 03-5181717 שלוחה 742