Re: [PERFORM] too many clog files

2008-09-09 Thread Matt Smiley
Hi Duan,

As others have said, you should probably attempt to run pg_dump to export your 
database.  If that doesn't work, consider restoring from backup.  If the dump 
does work, you can create a clean PGDATA directory (using initdb, just as when 
you first set up your installation), and create a fresh copy of your database 
from the dump file.  Then abandon your potentially damaged PGDATA directory.

For future reference:

 - The "autovacuum" parameter in postgresql.conf is off by default under 
Postgres 8.1.  You should probably turn it on to ensure regular vacuuming, 
unless you have your own cron job to do the vacuuming.
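If you do rely on cron instead, a minimal nightly entry might look like the 
following (the binary path and schedule are illustrative, not from the 
original post):

```shell
# Hypothetical crontab entry: vacuum and analyze every database at 03:00.
# Adjust the vacuumdb path and the timing for your installation.
0 3 * * * /usr/local/pgsql/bin/vacuumdb --all --analyze --quiet
```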

 - As for finding old transactions, there are two places you have to look.  
The usual place is pg_stat_activity.  The second is "pg_prepared_xacts", 
where prepared transactions are listed.  If there's a prepared transaction in 
your system, it might explain why your old commit-logs aren't being purged.  
The following query shows both prepared and normal transactions:

select
  l.transactionid,
  age(l.transactionid) as age,  /* measured in number of other transactions 
elapsed, not in terms of time */
  l.pid,
  case when l.pid is null then true else false end as is_prepared,  /* prepared xacts have no backend pid */
  a.backend_start,
  p.prepared as time_xact_was_prepared,
  p.gid as prepared_name
from
  pg_locks l
  left outer join pg_stat_activity a on l.pid = a.procpid
  left outer join pg_prepared_xacts p on l.transactionid = p.transaction
where
  l.locktype = 'transactionid'
  and l.mode = 'ExclusiveLock'
  and l.granted
order by age(l.transactionid) desc
;

 transactionid | age | pid  | is_prepared |         backend_start         |     time_xact_was_prepared    |      prepared_name
---------------+-----+------+-------------+-------------------------------+-------------------------------+--------------------------
        316645 |  44 |      | t           |                               | 2008-09-09 00:31:46.724178-07 | my_prepared_transaction1
        316689 |   0 | 6093 | f           | 2008-09-09 00:40:10.928287-07 |                               |
(2 rows)

Note that unless you run this query as a superuser (e.g. "postgres"), the 
columns from pg_stat_activity will only be visible for sessions that belong to 
you.  To roll back this example prepared transaction, you'd type:
  ROLLBACK PREPARED 'my_prepared_transaction1';

Hope this helps!
Matt



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] How to measure IO performance?

2008-09-09 Thread Andre Brandt
Hi out there,

I have a few questions; perhaps you can help me.

At the moment we are planning our new clustered ERP system, which
consists of a Java application server and a PostgreSQL database.  The
hardware currently running that system can't handle the workload
(2 processors and 12 GB RAM, at a load of 6-8), so it is very, very
slow, even though we have already deactivated a lot of things we
normally want to do, such as logging.
We've already chosen hardware for the new cluster (2x quad-core Xeon
with 64 GB RAM should handle it, even in the failover case when one
server has to run both the application and the database, which the
current system can no longer do), but I also have to choose the
storage hardware.  That is a problem: we know the servers will be
fast enough, but we don't know how much I/O performance is needed.
At the moment we're using SCSI-based shared storage (an HP MSA500G2
with 10 disks for the database: 8 for data in RAID 1+0 and 2 for logs
in RAID 1), and we often see a lot of I/O wait when 200 concurrent
users are working (and we expect I/O wait to increase sharply once
all the features we need are activated).
So, to get rid of I/O wait as far as possible, we have to increase
I/O performance.  Since there are a lot of storage systems out there,
we need to know how many I/Os per second we actually need, to decide
whether a given storage system can handle our load or a bigger system
is required.  Do you have suggestions for how to measure that?
Do you have experience with Postgres on something like an HP MSA2000
(10-20 disks) or RamSan systems?

Best regards,
Andre


Re: [PERFORM] How to measure IO performance?

2008-09-09 Thread Scott Marlowe
On Tue, Sep 9, 2008 at 7:59 AM, Andre Brandt <[EMAIL PROTECTED]> wrote:
> Hi out there,
>
> I have a few questions; perhaps you can help me.
>
> So, to get rid of I/O wait as far as possible, we have to increase
> I/O performance.  Since there are a lot of storage systems out there,
> we need to know how many I/Os per second we actually need, to decide
> whether a given storage system can handle our load or a bigger system
> is required.  Do you have suggestions for how to measure that?
> Do you have experience with Postgres on something like an HP MSA2000
> (10-20 disks) or RamSan systems?

Generally the best bang for the buck is a direct-attached storage
system with a high-quality RAID controller, like the 3Ware, Areca,
LSI, or HP 800-series cards.  I've heard a few good reports on
higher-end Adaptecs, but most Adaptec RAID controllers are pretty
poor db performers.

To get an idea of how much I/O you'll need, you need to see how much
you use now.  A good way to do that is to come up with a realistic
benchmark and run it at a low level of concurrency on your current
system, while running iostat and / or vmstat in the background.
pidstat can be pretty useful too.  Run a LONG benchmark so the results
average out; you don't want to rely on a 5-minute benchmark.  Once you
have some base numbers, increase the scaling factor (i.e. the number
of threads under test) and measure I/O, CPU, etc. for that test.
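As a sketch of that measurement step, here is one way to average the tps
column out of captured iostat output with awk (the device name and the
sample numbers below are made up for illustration):

```shell
# Average the tps column for one device from captured iostat output.
# The sample is fabricated; capture real numbers while the benchmark
# runs with e.g.:  iostat -d 10 > iostat.log
cat > iostat.sample <<'EOF'
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             210.00      1800.00      3400.00      18000      34000
sda             250.00      2100.00      3900.00      21000      39000
sda             230.00      1950.00      3650.00      19500      36500
EOF
# Sum the tps field (column 2) over every sample line for sda, then divide.
avg_tps=$(awk '$1 == "sda" { sum += $2; n++ } END { printf "%.0f", sum / n }' iostat.sample)
echo "average tps for sda: ${avg_tps}"
```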

Now, figure out how high a load factor you'd need to run your full
load, multiply that by your 1x benchmark's I/O numbers, and then
multiply by a fudge factor of 2 to 10x for overhead.
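That arithmetic can be sketched as follows (every number here is a
hypothetical placeholder, to be replaced with your own measurements):

```shell
# Estimate the target IOPS from a 1x benchmark measurement.
base_iops=450      # I/O per second observed during the 1x benchmark (made up)
load_factor=8      # how many times the 1x load full production represents
fudge=3            # headroom multiplier, somewhere in the 2-10x range
required_iops=$(( base_iops * load_factor * fudge ))
echo "target: ${required_iops} IOPS"
```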

The standard way to handle more I/O ops per second is to add spindles.
It might take more than one RAID controller or external RAID enclosure
to meet your needs.
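Continuing the sketch, a rough spindle count falls out of dividing the
target by a per-disk figure (the ~150 IOPS per 15k RPM drive used here is a
common rule of thumb, not a number from this thread):

```shell
# Rough spindle count for a target IOPS figure (hypothetical numbers).
required_iops=10800   # from the earlier estimate
per_disk_iops=150     # ballpark random-I/O rate for one 15k RPM drive
# Ceiling division: round up so we never under-provision.
spindles=$(( (required_iops + per_disk_iops - 1) / per_disk_iops ))
echo "roughly ${spindles} spindles before RAID overhead"
```

RAID 1+0 halves the effective write capacity, so the real number of disks
would be higher still for a write-heavy load.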



Re: [PERFORM] too many clog files

2008-09-09 Thread Matt Smiley
Alvaro Herrera wrote:
> Move the old clog files back where they were, and run VACUUM FREEZE in
> all your databases.  That should clean up all the old pg_clog files, if
> you're really that desperate.

Has anyone actually seen a CLOG file get removed under 8.2 or 8.3?  How about 
8.1?

I'm probably missing something, but looking at src/backend/commands/vacuum.c 
(under 8.2.9 and 8.3.3), it seems like vac_truncate_clog() scans through *all* 
tuples of pg_database looking for the oldest datfrozenxid.  Won't that always 
be template0, which as far as I know can never be vacuumed (or otherwise 
connected to)?

postgres=# select datname, datfrozenxid, age(datfrozenxid), datallowconn from 
pg_database order by age(datfrozenxid), datname ;
  datname  | datfrozenxid |   age    | datallowconn
-----------+--------------+----------+--------------
 template1 |     36347792 |     3859 | t
 postgres  |     36347733 |     3918 | t
 mss_test  |     36347436 |     4215 | t
 template0 |          526 | 36351125 | f
(4 rows)

I looked at several of my 8.2 databases' pg_clog directories, and they all have 
all the sequentially numbered segments (from the first segment through the 
current one).  Would it be reasonable for vac_truncate_clog() to skip databases 
where datallowconn is false (i.e. template0)?  Looking back at the 8.1.13 code, 
it does exactly that:
    if (!dbform->datallowconn)
        continue;
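The difference that skip makes can be illustrated with the pg_database rows
shown above (a toy recomputation of the cutoff choice, not the server's
actual code path):

```shell
# Columns: datname datfrozenxid datallowconn (values from the psql output above)
cat > pg_database.sample <<'EOF'
template1 36347792 t
postgres 36347733 t
mss_test 36347436 t
template0 526 f
EOF
# Minimum datfrozenxid over all rows (what 8.2/8.3 effectively use):
min_all=$(awk 'min == "" || $2 < min { min = $2 } END { print min }' pg_database.sample)
# Minimum over connectable databases only (the 8.1 behavior):
min_conn=$(awk '$3 == "t" && (min == "" || $2 < min) { min = $2 } END { print min }' pg_database.sample)
echo "cutoff including template0:       ${min_all}"
echo "cutoff skipping datallowconn=f:   ${min_conn}"
```

With template0 included, the cutoff is pinned at its ancient datfrozenxid,
so almost no clog segments are ever eligible for truncation.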

Also, Duan, if you have lots of files under pg_clog, you may be burning through 
transactions faster than necessary.  Do your applications leave autocommit 
turned on?  And since no one else mentioned it, as a work-around for a small 
filesystem you can potentially shut down your database, move the pg_clog 
directory to a separate filesystem, and create a symlink to it under your 
PGDATA directory.  That's not a solution, just a mitigation.
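A sketch of that workaround, using throwaway paths for illustration (on a
real cluster you would stop the server first and substitute your actual
PGDATA and the mount point of the larger filesystem):

```shell
# Relocate pg_clog and leave a symlink behind (throwaway paths).
PGDATA=$(mktemp -d)/pgdata
BIGFS=$(mktemp -d)/bigfs
mkdir -p "$PGDATA/pg_clog" "$BIGFS"
touch "$PGDATA/pg_clog/0000"              # stand-in for a clog segment
mv "$PGDATA/pg_clog" "$BIGFS/pg_clog"     # move the directory off the small fs
ln -s "$BIGFS/pg_clog" "$PGDATA/pg_clog"  # symlink back under PGDATA
ls -l "$PGDATA/pg_clog/"                  # segments still visible via the link
```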





[PERFORM] Effects of setting linux block device readahead size

2008-09-09 Thread Mark Wong
Hi all,

I've started to measure the effect of changing the Linux block device
readahead buffer on sequential read performance, using fio.  There is
a lot of raw data buried in the page, but this is what I've distilled
thus far.  Please have a look and let me know what you think:

http://wiki.postgresql.org/wiki/HP_ProLiant_DL380_G5_Tuning_Guide#Readahead_Buffer_Size
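For reference, the readahead setting being varied there is typically
inspected and changed with blockdev (the device name below is hypothetical,
and the commands need root):

```shell
# Readahead is expressed in 512-byte sectors.
blockdev --getra /dev/sda          # show the current readahead value
blockdev --setra 4096 /dev/sda     # set it to 4096 sectors (2 MB)
blockdev --getra /dev/sda          # confirm the new value
```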

Regards,
Mark
