Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 The problem with a system-wide no-WAL setting is it means you can't
 trust the system catalogs after a crash.  Which means you are forced to
 use initdb to recover from any crash, in return for not a lot of savings
 (for typical usages where there's not really much churn in the
 catalogs). 

What about having a catalog-only WAL setting, userset?

I'm not yet clear on the point, but it certainly seems that a per-transaction
WAL setting is impossible because of the catalogs (meaning mainly DDL
support); I can see us enforcing durability and crash safety there.

That would probably mean that setting the WAL level this low while doing any
kind of DDL would need to be either an ERROR or, better yet, a WARNING
saying that the WAL level cannot be that low and so has been raised by the
system.

Regards,
-- 
dim

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Rob Wultsch
On Fri, Jun 18, 2010 at 1:55 PM, Josh Berkus j...@agliodbs.com wrote:

 It must be a setting, not a version.

 For instance suppose you have a session table for your website and a
 users table.

 - Having ACID on the users table is of course a must ;
 - for the sessions table you can drop the D

 You're trying to solve a different use-case than the one I am.

 Your use-case will be solved by global temporary tables.  I suggest that
 you give Robert Haas some help & feedback on that.

 My use case is people using PostgreSQL as a cache, or relying entirely
 on replication for durability.

 --
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com



Is he? Wouldn't a global temporary table have content that is not
visible between db connections? A db session may not be the same as a
user session.

-- 
Rob Wultsch
wult...@gmail.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] WAL+Os on a single disk

2010-06-24 Thread Matthew Wakeling

On Wed, 23 Jun 2010, Scott Marlowe wrote:

We have a 12 x 600G hot swappable disk system (raid 10)
and 2 internal disk  ( 2x 146G)

Does it make sense to put the WAL and OS on the internal disks


So for us, the WAL and OS and logging on the same data set works well.


Generally, it is recommended that you put the WAL onto a separate disc to 
the data. However, in this case, I would be careful. It may be that the 12 
disc array is more capable. Specifically, it is likely that the 12-disc 
array has a battery backed cache, but the two internal drives (RAID 1 
presumably) do not. If this is the case, then putting the WAL on the 
internal drives will reduce performance, as you will only be able to 
commit a transaction once per revolution of the internal discs. In 
contrast, if the WAL is on a battery backed cache array, then you can 
commit much more frequently.
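To put a rough number on that (assuming, for illustration, 10k rpm internal
drives - the actual spindle speed isn't stated):

    10000 rev/min / 60 s/min ~ 167 rev/s  =>  at most ~167 commits/s
    per WAL stream without a write cache, versus thousands per second
    once a battery backed cache absorbs the flushes.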


Test it and see.

Matthew

--
I don't want the truth. I want something I can tell parliament!
 -- Rt. Hon. Jim Hacker MP
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Small Queries Really Fast, Large Queries Really Slow...

2010-06-24 Thread Tom Wilcox
Hi again!

I have finally got my Ubuntu VirtualBox VM running PostgreSQL with PL/Python
and am now looking at performance.

So here's the scenario:

We have a great big table:

cse=# \d nlpg.match_data
                  Table "nlpg.match_data"
      Column       |   Type   |                        Modifiers
-------------------+----------+----------------------------------------------------------
 premise_id        | integer  |
 usrn              | bigint   |
 org               | text     |
 sao               | text     |
 level             | text     |
 pao               | text     |
 name              | text     |
 street            | text     |
 town              | text     |
 postcode          | text     |
 match_data_id     | integer  | not null default nextval('nlpg.match_data_match_data_id_seq1'::regclass)
 addr_str          | text     |
 tssearch_name     | tsvector |
 tssearch_street   | tsvector |
 tssearch_addr_str | tsvector |
Indexes:
    "match_data_pkey1" PRIMARY KEY, btree (match_data_id)
    "index_match_data_mid" btree (match_data_id)
    "index_match_data_pid" btree (premise_id)
    "index_match_data_tssearch_addr_str" gin (tssearch_addr_str)
    "index_match_data_tssearch_name" gin (tssearch_name)
    "index_match_data_tssearch_street" gin (tssearch_street)
    "index_match_data_usrn" btree (usrn)

KEY NOTE:
nlpg.match_data has approximately 27,000,000 rows.

Running this query:

EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id <
1000000;

I get this:

Index Scan using match_data_pkey1 on match_data  (cost=0.00..1452207.14
rows=1913756 width=302) (actual time=23.448..61559.652 rows=999999 loops=1)
  Index Cond: (match_data_id < 1000000)
Total runtime: 403855.675 ms

I copied a chunk of the table like this:

CREATE TABLE nlpg.md_copy AS SELECT * FROM nlpg.match_data WHERE
match_data_id < 1000000;

Then ran the same query on the smaller copy table:

EXPLAIN ANALYZE UPDATE nlpg.md_copy SET org = org WHERE match_data_id <
1000000;

And get this:

Seq Scan on md_copy  (cost=0.00..96935.99 rows=999899 width=301) (actual
time=26.745..33944.923 rows=999999 loops=1)
  Filter: (match_data_id < 1000000)
Total runtime: 57169.419 ms

As you can see this is much faster per row with the smaller table chunk. I
then tried running the same first query with 10 times the number of rows:

EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id <
10000000;

This takes a massive amount of time (still running) and is definitely a
non-linear increase in the run time in comparison with the previous query.

EXPLAIN UPDATE nlpg.match_data SET org = org WHERE match_data_id < 10000000;
Seq Scan on match_data  (cost=0.00..3980053.11 rows=19172782 width=302)
  Filter: (match_data_id < 10000000)

Any suggestions on what I can do to speed things up? I presume if I turn off
Sequential Scan then it might default to an Index Scan... Is there anything
else?

Cheers,
Tom


Re: [PERFORM] Small Queries Really Fast, Large Queries Really Slow...

2010-06-24 Thread tv
 Any suggestions on what I can do to speed things up? I presume if I turn off
 Sequential Scan then it might default to an Index Scan... Is there anything
 else?

 Cheers,
 Tom

Well, I doubt turning off the sequential scan will improve the performance
in this case - actually the first case (running 400 sec) uses an index
scan, while the 'fast' one uses a sequential scan.

Actually I'd try exactly the opposite - disabling the index scan, i.e.
forcing it to use a sequential scan in the first case. You're selecting
about 4% of the rows, but we don't know how 'spread out' those rows are
through the table. It might be that PostgreSQL actually has to read all the
blocks of the table.
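(For experimenting, the planner's choices can be toggled per session; these
are standard planner GUCs, and the query is the one from the original post:

SET enable_indexscan = off;
SET enable_bitmapscan = off;
EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id < 1000000;
RESET enable_indexscan;
RESET enable_bitmapscan;

That steers the planner toward a sequential scan for this session only.)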

This might be improved by clustering - create an index on the
'match_data_id' column and then run

CLUSTER match_data_id_idx ON match_data;

This will sort the table according to the match_data_id column, which should
improve the performance. But it won't last forever - it degrades over
time, so you'll have to perform the clustering once in a while (and it locks
the table, so be careful).
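A minimal sketch of the whole sequence (the index name is hypothetical; on
8.3 and later the equivalent CLUSTER table USING index form also works):

CREATE INDEX match_data_id_idx ON nlpg.match_data (match_data_id);
CLUSTER nlpg.match_data USING match_data_id_idx;  -- takes an exclusive lock
ANALYZE nlpg.match_data;                          -- refresh planner statistics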

How large is the table anyway? How many rows / pages are there? Try
something like this

SELECT reltuples, relpages FROM pg_class WHERE relname = 'match_data';

Multiply the blocks by 8 kB and you'll get the occupied space. How much is
it? How much memory (shared_buffers) is there?
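For convenience, the same check with the size computed in one step (a sketch
using the standard pg_size_pretty and current_setting functions, so the block
size isn't hard-coded):

SELECT reltuples::bigint AS rows, relpages,
       pg_size_pretty(relpages::bigint * current_setting('block_size')::bigint) AS size
FROM pg_class WHERE relname = 'match_data';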

You could try partitioning according to the match_data_id column, but there
are various disadvantages related to foreign keys etc., and it's often a
major change in the application, so I'd consider other solutions first.

BTW I have no experience with running PostgreSQL inside a VirtualBox VM,
so it might be another source of problems. I do remember we had some
serious problems with I/O (network and disks) when running VMware, but that
was a long time ago and now it works fine. But maybe this is the root cause?
Can you run dstat / vmstat / iostat or something like that in the host OS
to see which of the resources is causing problems (if any)?

Tomas


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Robert Haas
On Thu, Jun 24, 2010 at 4:40 AM, Rob Wultsch wult...@gmail.com wrote:
 On Fri, Jun 18, 2010 at 1:55 PM, Josh Berkus j...@agliodbs.com wrote:

 It must be a setting, not a version.

 For instance suppose you have a session table for your website and a
 users table.

 - Having ACID on the users table is of course a must ;
 - for the sessions table you can drop the D

 You're trying to solve a different use-case than the one I am.

 Your use-case will be solved by global temporary tables.  I suggest that
 you give Robert Haas some help & feedback on that.

 My use case is people using PostgreSQL as a cache, or relying entirely
 on replication for durability.

 Is he? Wouldn't a global temporary table have content that is not
 visible between db connections? A db session may not be the same as a
 user session.


I'm planning to implement global temporary tables, which can have
different contents for each user session.

And I'm also planning to implement unlogged tables, which have the
same contents for all sessions but are not WAL-logged (and are
truncated on startup).
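(For illustration only - the feature was still being planned at this point -
a declaration might look like the following; the table definition is
hypothetical, and CREATE UNLOGGED TABLE is the syntax that eventually
shipped in PostgreSQL 9.1:

CREATE UNLOGGED TABLE session_cache (
    session_id text PRIMARY KEY,
    payload    text,
    expires_at timestamptz
);

Contents are shared by all sessions and skipped by WAL, at the cost of being
truncated after a crash.)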

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Write performance

2010-06-24 Thread Janning
Hi,

at the moment we encounter some performance problems with our database server.

We have a 12 GB RAM machine with intel i7-975 and using
3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 GB) 
One disk for the system and WAL etc. and one SW RAID-0 with two disks for  
postgresql data. Our database is about 24GB.

Our munin graph reports at 9 o'clock writes of 3000 blocks per second and
reads of about 1000 blocks per second on our disk which holds the data
directories of postgresql (WAL is on a different disk).

3000 blocks ~ about 3 MB/s write
1000 blocks ~ about 1 MB/s read

At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load
(so 4 of 8 cpu cores are in use for io wait).

We know it's a poor man's disk setup (but we cannot find a hoster with a more
advanced disk configuration at an affordable price). Anyway, we ran some tests
on it:
on it:


# time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync"
3000000+0 records in
3000000+0 records out
24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s

real4m48.658s
user0m0.580s
sys 0m51.579s

# time dd if=bigfile of=/dev/null bs=8k
3000000+0 records in
3000000+0 records out
24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s

real3m42.879s
user0m0.468s
sys 0m18.721s



Of course, writing large chunks is quite a different usage pattern. But I am
wondering why writing 3 MB/s and reading 1 MB/s seems to be the limit when I
can run a test with 89 MB/s writing and 110 MB/s reading.

Can you give some hints on whether these numbers seem reasonable?

kind regards
Janning





-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Write performance

2010-06-24 Thread Kenneth Marshall
On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote:
 Hi,
 
 at the moment we encounter some performance problems with our database server.
 
 We have a 12 GB RAM machine with intel i7-975 and using
 3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 GB) 
 One disk for the system and WAL etc. and one SW RAID-0 with two disks for  
 postgresql data. Our database is about 24GB.
 
 Our munin graph reports at 9 o'clock writes of 3000 blocks per second and
 reads of about 1000 blocks per second on our disk which holds the data
 directories of postgresql (WAL is on a different disk)
 
 3000 blocks ~ about 3 MB/s write
 1000 blocks ~ about 1 MB/s read
 
 At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load 
 (so 4 of 8 cpu cores are in use for io wait)
 
 We know it's a poor man's disk setup (but we cannot find a hoster with a more
 advanced disk configuration at an affordable price). Anyway, we ran some
 tests
 on it:
 
 
 # time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync"
 3000000+0 records in
 3000000+0 records out
 24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s
 
 real  4m48.658s
 user  0m0.580s
 sys   0m51.579s
 
 # time dd if=bigfile of=/dev/null bs=8k
 3000000+0 records in
 3000000+0 records out
 24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s
 
 real  3m42.879s
 user  0m0.468s
 sys   0m18.721s
 
 
 
 Of course, writing large chunks is quite a different usage pattern. But I am
 wondering why writing 3 MB/s and reading 1 MB/s seems to be the limit when I
 can run a test with 89 MB/s writing and 110 MB/s reading.

 Can you give some hints on whether these numbers seem reasonable?
 
 kind regards
 Janning
 

Yes, these are typical random I/O versus sequential I/O rates for
hard drives. Your I/O subsystem is extremely under-powered relative to your
CPU/memory. For DB servers, you often need much more I/O capacity
instead.

Cheers,
Ken

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Write performance

2010-06-24 Thread Matthew Wakeling

On Thu, 24 Jun 2010, Janning wrote:

We have a 12 GB RAM machine with intel i7-975 and using
3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)


Those discs are 1.5TB, not 1.5GB.


One disk for the system and WAL etc. and one SW RAID-0 with two disks for
postgresql data. Our database is about 24GB.


Beware of RAID-0 - make sure you can recover the data when (not if) a disc 
fails.



Our munin graph reports at 9 o'clock writes of 3000 blocks per second and
reads of about 1000 blocks per second on our disk which holds the data
directories of postgresql (WAL is on a different disk)

3000 blocks ~ about 3 MB/s write
1000 blocks ~ about 1 MB/s read

At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load
(so 4 of 8 cpu cores are in use for io wait)


Not quite sure what situation you are measuring these figures under. 
However, as a typical figure, let's say you are doing random access with 
8kB blocks (as in Postgres), and the access time on your drive is 8.5ms 
(as with these drives).


For each drive, you will be able to read/write approximately 8kB / 
0.0085s, giving 941kB per second. If you have multiple processes all doing 
random access, then you may be able to utilise both discs and get double 
that.



Of course, writing large chunks is quite a different usage pattern. But I am
wondering why writing 3MB/s and reading 1 MB/s seems to be the limit when I can
run a test with 89 MB/s writing and 110MB/s reading.


That's quite right, and those are typical performance figures for a drive like that.

Matthew

--
Don't criticise a man until you have walked a mile in his shoes; and if
you do at least he will be a mile behind you and bare footed.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Write performance

2010-06-24 Thread Janning
On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
 On Thu, 24 Jun 2010, Janning wrote:
  We have a 12 GB RAM machine with intel i7-975 and using
  3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)

 Those discs are 1.5TB, not 1.5GB.

sorry, my fault.

  One disk for the system and WAL etc. and one SW RAID-0 with two disks for
  postgresql data. Our database is about 24GB.

 Beware of RAID-0 - make sure you can recover the data when (not if) a disc
 fails.

oh sorry again, its a raid-1 of course. shame on me.

  Our munin graph reports at 9 o'clock writes of 3000 blocks per second
  and reads of about 1000 blocks per second on our disk which holds the
  data directories of postgresql (WAL is on a different disk)
 
  3000 blocks ~ about 3 MB/s write
  1000 blocks ~ about 1 MB/s read
 
  At the same time we have nearly 50% CPU I/O wait and only 12% user CPU
  load (so 4 of 8 cpu cores are in use for io wait)

 Not quite sure what situation you are measuring these figures under.
 However, as a typical figure, let's say you are doing random access with
 8kB blocks (as in Postgres), and the access time on your drive is 8.5ms
 (as with these drives).

 For each drive, you will be able to read/write approximately 8kB /
 0.0085s, giving 941kB per second. If you have multiple processes all doing
 random access, then you may be able to utilise both discs and get double
 that.

So with your calculation I have a maximum of 2 MB/s random access, so I really
need to upgrade my disk configuration!

  Of course, writing large chunks is quite a different usage pattern. But I
  am wondering why writing 3MB/s and reading 1 MB/s seems to be the limit when
  I can run a test with 89 MB/s writing and 110MB/s reading.

 That's quite right, and those are typical performance figures for a drive like that.

thanks for your help.

kind regards 
Janning

 Matthew

 --
  Don't criticise a man until you have walked a mile in his shoes; and if
  you do at least he will be a mile behind you and bare footed.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Write performance

2010-06-24 Thread Janning
thanks for your quick response, kenneth

On Thursday 24 June 2010 14:47:34 you wrote:
 On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote:
  Hi,
 
  at the moment we encounter some performance problems with our database
  server.
 
  We have a 12 GB RAM machine with intel i7-975 and using
  3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)
  One disk for the system and WAL etc. and one SW RAID-0 with two disks for
  postgresql data. Our database is about 24GB.
[...]
 Your I/O is extremely under-powered relative to your
 CPU/memory. For DB servers, many times you need much more I/O
 instead.

So at the moment we are using this machine as our primary database server:
http://www.hetzner.de/en/hosting/produkte_rootserver/eq9/

Sadly, our hoster does not offer an advanced disk setup, so now we have two options:

1. Buying a server of our own and renting a co-location.
I fear we do not know enough about hardware to go for this option. I think
for co-locating your own server one should know more about hardware.

2. Renting a server from a hoster with an advanced disk setup.
Can anybody recommend a good hosting solution in Germany with a good disk
setup for postgresql?


kind regards
Janning


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] WAL+Os on a single disk

2010-06-24 Thread Scott Marlowe
On Thu, Jun 24, 2010 at 5:14 AM, Matthew Wakeling matt...@flymine.org wrote:
 On Wed, 23 Jun 2010, Scott Marlowe wrote:

 We have a 12 x 600G hot swappable disk system (raid 10)
 and 2 internal disk  ( 2x 146G)

 Does it make sense to put the WAL and OS on the internal disks

 So for us, the WAL and OS and logging on the same data set works well.

 Generally, it is recommended that you put the WAL onto a separate disc to
 the data. However, in this case, I would be careful. It may be that the 12
 disc array is more capable. Specifically, it is likely that the 12-disc
 array has a battery backed cache, but the two internal drives (RAID 1
 presumably) do not. If this is the case, then putting the WAL on the
 internal drives will reduce performance, as you will only be able to commit
 a transaction once per revolution of the internal discs. In contrast, if the
 WAL is on a battery backed cache array, then you can commit much more
 frequently.

This is not strictly true of the WAL, which writes sequentially and
more than one transaction at a time.  As you said though, test it to
be sure.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Write performance

2010-06-24 Thread Greg Smith
As others have already pointed out, your disk performance here is 
completely typical of a single pair of drives doing random read/write 
activity.  So the question you should be asking is how to reduce the 
amount of reading and writing needed to run your application.  The 
suggestions at 
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server address 
that.  Increases to shared_buffers and checkpoint_segments in particular 
can dramatically reduce the amount of I/O needed to run an application.  
On the last server I tuned, random reads went from a constant stream of
1MB/s (with the default shared_buffers of 32MB) to an average of
0.1MB/s just by adjusting those two parameters upwards via those guidelines.


If you haven't already made large increases to those values, I'd suggest 
starting there before presuming you must get a different disk setup.
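A quick way to see where those two settings currently stand before editing
postgresql.conf (pg_settings is a standard view; shared_buffers needs a
restart to change, checkpoint_segments only a reload):

SELECT name, setting, unit FROM pg_settings
WHERE name IN ('shared_buffers', 'checkpoint_segments');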


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] WAL+Os on a single disk

2010-06-24 Thread Anj Adu
What would you recommend for a quick test of this (i.e. WAL on the
internal disks vs. WAL on the 12-disk raid array)?
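(One minimal sketch, assuming a scratch database you can write to: time a
run of single-row autocommitted inserts with the WAL directory on each
device in turn, and compare commits per second.

CREATE TABLE wal_commit_test (i int);
\timing
INSERT INTO wal_commit_test VALUES (1);  -- each statement commits on its own,
INSERT INTO wal_commit_test VALUES (2);  -- so every line forces a WAL flush
-- ...repeat a few hundred times via a script...
DROP TABLE wal_commit_test;

On the order of 100-250 commits/s points at a bare spinning disk; thousands
per second point at a battery-backed cache absorbing the flushes.)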

On Thu, Jun 24, 2010 at 6:31 AM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Thu, Jun 24, 2010 at 5:14 AM, Matthew Wakeling matt...@flymine.org wrote:
 On Wed, 23 Jun 2010, Scott Marlowe wrote:

 We have a 12 x 600G hot swappable disk system (raid 10)
 and 2 internal disk  ( 2x 146G)

 Does it make sense to put the WAL and OS on the internal disks

 So for us, the WAL and OS and logging on the same data set works well.

 Generally, it is recommended that you put the WAL onto a separate disc to
 the data. However, in this case, I would be careful. It may be that the 12
 disc array is more capable. Specifically, it is likely that the 12-disc
 array has a battery backed cache, but the two internal drives (RAID 1
 presumably) do not. If this is the case, then putting the WAL on the
 internal drives will reduce performance, as you will only be able to commit
 a transaction once per revolution of the internal discs. In contrast, if the
 WAL is on a battery backed cache array, then you can commit much more
 frequently.

 This is not strictly true of the WAL, which writes sequentially and
 more than one transaction at a time.  As you said though, test it to
 be sure.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] cpu bound postgresql setup.

2010-06-24 Thread Rajesh Kumar Mallah
Dear List,

1. It was found that too many stray queries were getting generated
by rogue users and bots; we controlled these using some manual methods.

2. We have made application changes, and some significant changes have been done.

3. We use xfs and our controller has a BBU. We changed barriers=1 to
barriers=0, as I learnt that with barriers=1 on xfs and fsync as the sync
method, the advantage of the BBU is lost unless barriers=0 (correct me if my
understanding is wrong).

4. We had implemented partitioning using exclusion constraints; the
parent relationship was removed from quite a lot of old partition tables.

our postgresql.conf

--
# cat postgresql.conf  | grep -v ^\s*# | grep -v ^\s*$

listen_addresses = '*'  # what IP address(es) to listen on;
port = 5432 # (change requires restart)
max_connections = 300   # (change requires restart)
shared_buffers = 10GB   # min 128kB
work_mem = 4GB  # min 64kB
fsync = on  # turns forced synchronization on or off
synchronous_commit = on # immediate fsync at commit
checkpoint_segments = 30# in logfile segments, min 1, 16MB each
archive_mode = on   # allows archiving to be done
archive_command = '/opt/scripts/archive_wal.sh %p %f '
archive_timeout = 600   # force a logfile segment switch after this
effective_cache_size = 18GB
constraint_exclusion = on   # on, off, or partition
logging_collector = on  # Enable capturing of stderr and csvlog
log_directory = '/var/log/postgresql'   # directory where log files are written,
log_filename = 'postgresql.log' # log file name pattern,
log_truncate_on_rotation = on   # If on, an existing log file of the
log_rotation_age = 1d   # Automatic rotation of logfiles will
log_error_verbosity = verbose   # terse, default, or verbose messages
log_min_duration_statement = 5000   # -1 is disabled, 0 logs all statements
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8' # locale for system
error message
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'  # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
add_missing_from = on
custom_variable_classes = 'general' # list of custom variable class names
general.report_level = ''
general.disable_audittrail2 = ''
general.employee=''


Also I would like to apologize that some of the discussions on this problem
inadvertently became private between me & Kevin.


On Thu, Jun 24, 2010 at 12:10 AM, Rajesh Kumar Mallah
mallah.raj...@gmail.com wrote:
 It was nice to go through the interesting posting guidelines.  I shall
 be analyzing the slow queries more objectively tomorrow during the
 peak hours. I really hope it should be possible to track down the
 problem.

 On 6/23/10, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Rajesh Kumar Mallah mallah.raj...@gmail.com wrote:

 did you suggest at some point that number of backend per core
 should be preferebly 3 ?

 I've found the number of *active* backends is optimal around (2 *
 cores) + spindles.  You said you had eight cores and eight or ten
 spindles, so I figure a connection pool limited to somewhere around
 24 active connections is ideal.  (Depending on how you set up your
 pool, you may need a higher total number of connections to keep 24
 active.)

 -Kevin


 --
 Sent from Gmail for mobile | mobile.google.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] ALTER Table and CLUSTER does adding a new column rewrite clustered? (8.4.3)

2010-06-24 Thread Bruce Momjian
Scott Carey wrote:
 v. 8.4.3 
 
 I have a table that has several indexes, one of which the table is
 clustered on.  If I do an ALTER TABLE Foo ADD COLUMN bar integer not
 null default -1;
 
 It re-writes the whole table.

All good questions:

 * Does it adhere to the CLUSTER property of the table and write the new
 version clustered?

The new table has the exact same heap ordering as the old table; it does
not refresh the clustering if the table has become unclustered.

 * Does it properly write it with the FILLFACTOR setting?

Yes, inserts are used to populate the new table, and inserts honor
FILLFACTOR.

 * Are all the indexes re-created too, or are they bloated and need a REINDEX?

They are recreated.

 http://www.postgresql.org/docs/8.4/static/sql-altertable.html 
   does not seem to answer the above, it mentions the conditions that
 cause a rewrite but does not say what the state is after the rewrite
 with respect to CLUSTER, FILLFACTOR, and index bloat.

I have added a documentation patch to mention the indexes are rebuilt; 
applied patch attached.

The gory details can be found in src/backend/commands/tablecmds.c.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: doc/src/sgml/ref/alter_table.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/alter_table.sgml,v
retrieving revision 1.114
diff -c -c -r1.114 alter_table.sgml
*** doc/src/sgml/ref/alter_table.sgml	9 Jun 2010 17:48:10 -0000	1.114
--- doc/src/sgml/ref/alter_table.sgml	24 Jun 2010 14:54:00 -0000
***************
*** 689,696 ****
  
     <para>
      Adding a column with a non-null default or changing the type of an
!     existing column will require the entire table to be rewritten.  This
!     might take a significant amount of time for a large table; and it will
      temporarily require double the disk space.  Adding or removing a system
      <literal>oid</literal> column likewise requires rewriting the entire table.
     </para>
--- 689,696 ----
  
     <para>
      Adding a column with a non-null default or changing the type of an
!     existing column will require the entire table and indexes to be rewritten.
!     This might take a significant amount of time for a large table; and it will
      temporarily require double the disk space.  Adding or removing a system
      <literal>oid</literal> column likewise requires rewriting the entire table.
     </para>

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] cpu bound postgresql setup.

2010-06-24 Thread Kevin Grittner
I'm not clear whether you still have a problem, or whether the
changes you mention solved your issues.  I'll comment on potential
issues that leap out at me.
 
Rajesh Kumar Mallah mallah.raj...@gmail.com wrote:
 
 3. we use xfs  and our controller has BBU , we changed barriers=1
 to barriers=0 as i learnt that having barriers=1 on xfs  and fsync
 as the sync method, the advantage of BBU is lost unless barriers
 is = 0 (correct me if my understanding is wrong)
 
We use noatime,nobarrier in /etc/fstab.  I'm not sure where you're
setting that, but if you have a controller with BBU, you want to set
it to whichever disables write barriers.
 
 max_connections = 300
 
As I've previously mentioned, I would use a connection pool, in
which case this wouldn't need to be that high.
 
 work_mem = 4GB
 
That's pretty high.  That much memory can be used by each active
connection, potentially for each of several parts of the active
query on each connection.  You should probably set this much lower
in postgresql.conf and boost it if necessary for individual queries.
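A sketch of that pattern (the figures are only examples):

-- postgresql.conf: keep the global default modest, e.g. work_mem = 64MB,
-- then raise it just for the session or query that needs big sorts/hashes:
SET work_mem = '1GB';
-- ...run the one memory-hungry query...
RESET work_mem;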
 
 effective_cache_size = 18GB
 
With 32GB RAM on the machine, I would probably set this higher --
somewhere in the 24GB to 30GB range, unless you have specific
reasons to believe otherwise.  It's not that critical, though.
 
 add_missing_from = on
 
Why?  There has been discussion of eliminating this option -- do you
have queries which rely on the non-standard syntax this enables?
 
 Also i would like to apologize that some of the discussions on
 this problem inadvertently became private between me & Kevin.
 
Oops.  I failed to notice that.  Thanks for bringing it back to the
list.  (It's definitely in your best interest to keep it in front of
all the other folks here, some of whom regularly catch things I miss
or get wrong.)
 
If you still do have slow queries, please follow up with details.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Write performance

2010-06-24 Thread Janning Vygen
On Thursday 24 June 2010 15:16:05 Janning wrote:
 On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
  On Thu, 24 Jun 2010, Janning wrote:
   We have a 12 GB RAM machine with intel i7-975 and using
   3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)
  
  For each drive, you will be able to read/write approximately 8kB /
  0.0085s, giving 941kB per second. If you have multiple processes all
  doing random access, then you may be able to utilise both discs and get
  double that.

 So with your calculation I have a maximum of 2MB/s random access. So i
 really need to upgrade my disk configuration!

I was looking at tomshardware.com and the fastest disk is

   Maxtor Atlas 15K II * 8K147S0, SAS, 147 GB, 16 MB Cache, 15000 rpm

with 5.5 ms random access time.

So even if I switch to those disks I can only reach a performance gain of
about 1.5x, right?

To achieve better disk performance by a factor of ten, I need a raid-10 setup
with 12 disks (so I have 6 raid-1 bundles). Or are there other factors with
high-end disks?

kind regards 
Janning


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Write performance

2010-06-24 Thread Jesper Krogh

On 2010-06-24 15:45, Janning Vygen wrote:
On Thursday 24 June 2010 15:16:05 Janning wrote:
On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
On Thu, 24 Jun 2010, Janning wrote:

We have a 12 GB RAM machine with intel i7-975 and using
3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)

For each drive, you will be able to read/write approximately 8kB /
0.0085s, giving 941kB per second. If you have multiple processes all
doing random access, then you may be able to utilise both discs and get
double that.

So with your calculation I have a maximum of 2MB/s random access. So I
really need to upgrade my disk configuration!

I was looking at tomshardware.com and the fastest disk is

Maxtor Atlas 15K II * 8K147S0, SAS, 147 GB, 16 MB Cache, 15000 rpm

with 5.5 ms random access time.

So even if I switch to those disks I can only reach a performance gain of
about 1.5x, right?

To achieve better disk performance by a factor of ten, I need a raid-10 setup
with 12 disks (so I have 6 raid-1 bundles). Or are there other factors with
high-end disks?


Well, on the write side you can add a RAID controller with a battery-backed
write cache so that writes don't hit the disk directly. This improves the
amount of writing you can do.

On the read side you can add more memory to your server so that a significant
part of your most active dataset is cached in memory.

It depends on the actual sizes and workload which gives the most benefit
for you.

--
Jesper

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] cpu bound postgresql setup.

2010-06-24 Thread Rajesh Kumar Mallah
On Thu, Jun 24, 2010 at 8:57 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 I'm not clear whether you still have a problem, or whether the
 changes you mention solved your issues.  I'll comment on potential
 issues that leap out at me.

It shall require more observation to know if the problem is solved.
My problem was high load average on the server. We find that
when ldavg is between 10-20, application response is acceptable;
ldavg > 40 makes things slower.

What prompted me to post to the list is that the server transitioned from
being IO bound to CPU bound, with 90% of syscalls being
lseek(XXX, 0, SEEK_END) = YYY


 Rajesh Kumar Mallah mallah.raj...@gmail.com wrote:

 3. we use xfs  and our controller has BBU , we changed barriers=1
 to barriers=0 as i learnt that having barriers=1 on xfs  and fsync
 as the sync method, the advantage of BBU is lost unless barriers
 is = 0 (correct me if my understanding is wrong)

 We use noatime,nobarrier in /etc/fstab.  I'm not sure where you're
 setting that, but if you have a controller with BBU, you want to set
 it to whichever disables write barriers.

As per suggestions in discussions on some other thread, I set it
in /etc/fstab.


 max_connections = 300

 As I've previously mentioned, I would use a connection pool, in
 which case this wouldn't need to be that high.

We do use connection pooling, provided to the mod_perl server
via Apache::DBI::Cache. If I reduce this I *get* a "too many
connections from non-superuser ..." error. Will pgpool-I/II
still be applicable in this scenario?



 work_mem = 4GB

 That's pretty high.  That much memory can be used by each active
 connection, potentially for each of several parts of the active
 query on each connection.  You should probably set this much lower
 in postgresql.conf and boost it if necessary for individual queries.

Hmmm... it was 8GB for many months!

I shall reduce it further, but will it not result in the use of too many
temp files and saturate the I/O?




 effective_cache_size = 18GB

 With 32GB RAM on the machine, I would probably set this higher --
 somewhere in the 24GB to 30GB range, unless you have specific
 reasons to believe otherwise.  It's not that critical, though.

I do not remember well, but there is a system view that (I think) shows
at what point the marginal returns of increasing it start to disappear;
I had set it a few years back.



 add_missing_from = on

 Why?  There has been discussion of eliminating this option -- do you
 have queries which rely on the non-standard syntax this enables?

Unfortunately, yes.


 Also i would like to apologize that some of the discussions on
 this problem inadvertently became private between me & Kevin.

 Oops.  I failed to notice that.  Thanks for bringing it back to the
 list.  (It's definitely in your best interest to keep it in front of
 all the other folks here, some of whom regularly catch things I miss
 or get wrong.)

 If you still do have slow queries, please follow up with details.


I have now set log_min_duration_statement = 5000
and there are few queries that come to logs.

please comment on the connection pooling aspect.

Warm  Regards
Rajesh Kumar Mallah.


 -Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] cpu bound postgresql setup.

2010-06-24 Thread Rajesh Kumar Mallah
I do not remember well, but there is a system view that (I think) shows
at what point the marginal returns of increasing it start to disappear;
I had set it a few years back.

Sorry, the above comment was regarding setting shared_buffers,
not effective_cache_size.



On Thu, Jun 24, 2010 at 10:55 PM, Rajesh Kumar Mallah
mallah.raj...@gmail.com wrote:
 On Thu, Jun 24, 2010 at 8:57 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 I'm not clear whether you still have a problem, or whether the
 changes you mention solved your issues.  I'll comment on potential
 issues that leap out at me.

 It shall require more observation to know if the problem is solved.
 My problem was high load average on the server. We find that
 when ldavg is between 10-20, application response is acceptable;
 ldavg > 40 makes things slower.

 What prompted me to post to the list is that the server transitioned from
 being IO bound to CPU bound, with 90% of syscalls being
 lseek(XXX, 0, SEEK_END) = YYY


 Rajesh Kumar Mallah mallah.raj...@gmail.com wrote:

 3. we use xfs  and our controller has BBU , we changed barriers=1
 to barriers=0 as i learnt that having barriers=1 on xfs  and fsync
 as the sync method, the advantage of BBU is lost unless barriers
 is = 0 (correct me if my understanding is wrong)

 We use noatime,nobarrier in /etc/fstab.  I'm not sure where you're
 setting that, but if you have a controller with BBU, you want to set
 it to whichever disables write barriers.

 As per suggestions in discussions on some other thread, I set it
 in /etc/fstab.


 max_connections = 300

 As I've previously mentioned, I would use a connection pool, in
 which case this wouldn't need to be that high.

 We do use connection pooling, provided to the mod_perl server
 via Apache::DBI::Cache. If I reduce this I *get* a "too many
 connections from non-superuser ..." error. Will pgpool-I/II
 still be applicable in this scenario?



 work_mem = 4GB

 That's pretty high.  That much memory can be used by each active
 connection, potentially for each of several parts of the active
 query on each connection.  You should probably set this much lower
 in postgresql.conf and boost it if necessary for individual queries.

 Hmmm... it was 8GB for many months!

 I shall reduce it further, but will it not result in the use of too many
 temp files and saturate the I/O?




 effective_cache_size = 18GB

 With 32GB RAM on the machine, I would probably set this higher --
 somewhere in the 24GB to 30GB range, unless you have specific
 reasons to believe otherwise.  It's not that critical, though.

 I do not remember well, but there is a system view that (I think) shows
 at what point the marginal returns of increasing it start to disappear;
 I had set it a few years back.



 add_missing_from = on

 Why?  There has been discussion of eliminating this option -- do you
 have queries which rely on the non-standard syntax this enables?

 Unfortunately, yes.


 Also i would like to apologize that some of the discussions on
 this problem inadvertently became private between me & Kevin.

 Oops.  I failed to notice that.  Thanks for bringing it back to the
 list.  (It's definitely in your best interest to keep it in front of
 all the other folks here, some of whom regularly catch things I miss
 or get wrong.)

 If you still do have slow queries, please follow up with details.


 I have now set log_min_duration_statement = 5000
 and there are few queries that come to logs.

 please comment on the connection pooling aspect.

 Warm  Regards
 Rajesh Kumar Mallah.


 -Kevin



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Josh Berkus

 And I'm also planning to implement unlogged tables, which have the
 same contents for all sessions but are not WAL-logged (and are
 truncated on startup).

Yep.  And it's quite possible that this will be adequate for most users.

And it's also possible that the extra CPU which Robert isn't getting rid
of (bgwriter, checkpointing, etc.) does not have a measurable impact on
performance.  At this point, my idea (which I call
RunningWithScissorsDB) is only an idea for experimentation and
performance testing.  It's pretty far off from being a TODO.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] cpu bound postgresql setup.

2010-06-24 Thread Alvaro Herrera
Excerpts from Rajesh Kumar Mallah's message of Thu Jun 24 13:25:32 -0400 2010:

 What prompted me to post to list is that the server transitioned from
 being IO bound to CPU bound and 90% of syscalls being
 lseek(XXX, 0, SEEK_END) = YYY

It could be useful to find out which file is being seeked. Correlate the
XXX with the files in /proc/<pid>/fd (at least on Linux) to find out more.
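A sketch of how to correlate the two, assuming Linux (the /proc step happens
outside SQL; the pid and fd numbers are hypothetical):

-- 8.x column names; shows each backend's OS process id and current query
SELECT procpid, current_query FROM pg_stat_activity;
-- then, in a root shell on the server:
--   ls -l /proc/12345/fd/42
-- the symlink target names the relation file being lseek()ed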

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] cpu bound postgresql setup.

2010-06-24 Thread Kevin Grittner
Rajesh Kumar Mallah mallah.raj...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 max_connections = 300

 As I've previously mentioned, I would use a connection pool, in
 which case this wouldn't need to be that high.
 
 We do use connection pooling provided to mod_perl server
 via Apache::DBI::Cache. If i reduce this i *get* too many
 connections from non-superuser ...   error. Will pgpool - I/II
 still applicable in this scenario ?
 
Yeah, you can't reduce this setting without first having a
connection pool in place which will limit how many connections are
in use.  We haven't used any of the external connection pool
products for PostgreSQL yet, because when we converted to PostgreSQL
we were already using a pool built into our application framework. 
This pool queues requests for database transactions and has one
thread per connection in the database pool to pull and service
objects which encapsulate the logic of the database transaction.
 
We're moving to new development techniques, since that framework is
over ten years old now, but the overall approach is going to stay
the same -- because it has worked so well for us.  By queuing
requests beyond the number which can keep all the server's resources
busy, we avoid wasting resources on excessive context switching and
(probably more significant) contention for locks.  At one point our
busiest server started to suffer performance problems under load,
and we were able to fix them by simply configuring the connection
pool to half its previous size -- both response time and throughput
improved.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Pavel Stehule
2010/6/24 Josh Berkus j...@agliodbs.com:

 And I'm also planning to implement unlogged tables, which have the
 same contents for all sessions but are not WAL-logged (and are
 truncated on startup).

This is similar to MySQL's memory tables. Personally, I don't see any
practical sense in doing the same work on PostgreSQL now, when memcached
exists. Much more important is smarter cache control than we have now -
maybe with priorities for some tables and some operations
(applications) - sometimes we don't need to use the cache for extra large
scans.

Regards

Pavel Stehule



 Yep.  And it's quite possible that this will be adequate for most users.

 And it's also possible that the extra CPU which Robert isn't getting rid
 of (bgwriter, checkpointing, etc.) does not have a measurable impact on
 performance.  At this point, my idea (which I call
 RunningWithScissorsDB) is only an idea for experimentation and
 performance testing.  It's pretty far off from being a TODO.


 --
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Joshua D. Drake
On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote:
 2010/6/24 Josh Berkus j...@agliodbs.com:
 
  And I'm also planning to implement unlogged tables, which have the
  same contents for all sessions but are not WAL-logged (and are
  truncated on startup).
 
  this is similar to MySQL's memory tables. Personally, I don't see any
  practical sense in doing the same work on PostgreSQL now, when memcached exists.

Because memcache is yet another layer and increases overhead for
application developers by adding yet another layer to work with. Non-logged
tables would rock.

SELECT * FROM foo;

:D

JD




-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Pavel Stehule
2010/6/24 Joshua D. Drake j...@commandprompt.com:
 On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote:
 2010/6/24 Josh Berkus j...@agliodbs.com:
 
  And I'm also planning to implement unlogged tables, which have the
  same contents for all sessions but are not WAL-logged (and are
  truncated on startup).

 this is similar to MySQL's memory tables. Personally, I don't see any
 practical sense in doing the same work on PostgreSQL now, when memcached exists.

 Because memcache is yet another layer and increases overhead for
 application developers by adding yet another layer to work with. Non-logged
 tables would rock.

I see only one positive point - it can help people with a broken
application design migrate to PostgreSQL.

There is a different interesting feature - caching procedure results,
like Oracle 11 - it's more general.

Only an idea.

For me, memory tables are nonsense, but what about memory-cached
materialised views (maybe periodically refreshed)?

Regards

Pavel


 SELECT * FROM foo;

 :D

:)

 JD




 --
 PostgreSQL.org Major Contributor
 Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
 Consulting, Training, Support, Custom Development, Engineering



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread A.M.

On Jun 24, 2010, at 4:01 PM, Pavel Stehule wrote:

 2010/6/24 Joshua D. Drake j...@commandprompt.com:
 On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote:
 2010/6/24 Josh Berkus j...@agliodbs.com:
 
 And I'm also planning to implement unlogged tables, which have the
 same contents for all sessions but are not WAL-logged (and are
 truncated on startup).
 
  this is similar to MySQL's memory tables. Personally, I don't see any
  practical sense in doing the same work on PostgreSQL now, when memcached exists.
 
  Because memcache is yet another layer and increases overhead for
  application developers by adding yet another layer to work with. Non-logged
  tables would rock.
 
 I see only one positive point - it can help people with a broken
 application design migrate to PostgreSQL.

The broken design is being required to work around PostgreSQL's lack of this 
optimization.

 
 There is a different interesting feature - caching procedure results,
 like Oracle 11 - it's more general.

 Only an idea.

 For me, memory tables are nonsense, but what about memory-cached
 materialised views (maybe periodically refreshed)?

Non-WAL-logged, non-fsynced tables are not equivalent to MySQL memory tables. 
Such tables simply contain transient information. One can already make memory 
tables in PostgreSQL by making a tablespace in a tmpfs partition.
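A minimal sketch of that tmpfs trick (mount point and names are hypothetical;
everything in the tablespace is lost on reboot):

-- assumes something like: mount -t tmpfs -o size=2G tmpfs /mnt/pgtmpfs
CREATE TABLESPACE memspace LOCATION '/mnt/pgtmpfs';
CREATE TABLE session_data (
    session_id text PRIMARY KEY,
    payload    text
) TABLESPACE memspace;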

I have been eagerly waiting for this feature for six years so that I can write 
proper queries against ever-changing session data with transactional semantics 
(which memcached cannot offer). The only restriction I see for these transient 
data tables is that they cannot be referenced by standard tables using foreign 
key constraints. Otherwise, these tables behave like any other. That's the 
benefit.

Cheers,
M
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Pavel Stehule
2010/6/24 A.M. age...@themactionfaction.com:

 On Jun 24, 2010, at 4:01 PM, Pavel Stehule wrote:

 2010/6/24 Joshua D. Drake j...@commandprompt.com:
 On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote:
 2010/6/24 Josh Berkus j...@agliodbs.com:

 And I'm also planning to implement unlogged tables, which have the
 same contents for all sessions but are not WAL-logged (and are
 truncated on startup).

 this is similar to MySQL's memory tables. Personally, I don't see any
 practical sense in doing the same work on PostgreSQL now, when memcached exists.

 Because memcache is yet another layer and increases overhead for
 application developers by adding yet another layer to work with. Non-logged
 tables would rock.

 I see only one positive point - it can help people with a broken
 application design migrate to PostgreSQL.

 The broken design is being required to work around PostgreSQL's lack of this 
 optimization.


 There is a different interesting feature - caching procedure results,
 like Oracle 11 - it's more general.

 Only an idea.

 For me, memory tables are nonsense, but what about memory-cached
 materialised views (maybe periodically refreshed)?

 Non-WAL-logged, non-fsynced tables are not equivalent to MySQL memory 
 tables. Such tables simply contain transient information. One can already 
 make memory tables in PostgreSQL by making a tablespace in a tmpfs 
 partition.

 I have been eagerly waiting for this feature for six years so that I can 
 write proper queries against ever-changing session data with transactional 
 semantics (which memcached cannot offer). The only restriction I see for 
 these transient data tables is that they cannot be referenced by standard 
 tables using foreign key constraints. Otherwise, these tables behave like any 
 other. That's the benefit.


if you remove WAL, MVCC is still there - you have to do VACUUM,
you have to do ANALYZE, you have to think about indexes ...
The processing pipeline for a simple query is long, too. Removing WAL
doesn't turn Postgres into a memory database. But you know best what
you are doing.

Regards

Pavel Stehule

p.s. maybe memcached is too simple for you - there are more NoSQL dbs

 Cheers,
 M
 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Josh Berkus

 this is similar MySQL's memory tables. Personally, I don't see any
 practical sense do same work on PostgreSQL now, when memcached exists.

Thing is, if you only have one table (say, a sessions table) which you
don't want logged, you don't necessarily want to fire up a 2nd software
application just for that.  Plus, recent testing seems to show that with
no logging, memcached isn't really faster than PG.

Also, as with asynch_commit, this is something where users are
currently turning off fsync.  Any option where we can present users with
controlled, predictable data loss instead of random corruption is a good
one.
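(For comparison, the controlled-loss knob that already exists -
synchronous_commit has been a real, per-transaction setting since 8.3:

SET synchronous_commit = off;  -- risks losing a few recent commits, never corruption

as opposed to fsync = off, which risks the whole cluster.)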

 Much more important is smarter cache control than we have now -
 maybe with priorities for some tables and some operations
 (applications) - sometimes we don't need to use the cache for extra large
 scans.

Well, that would be good *too*.  You working on it?  ;-)

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Pavel Stehule
2010/6/24 Josh Berkus j...@agliodbs.com:

 this is similar to MySQL's memory tables. Personally, I don't see any
 practical sense in doing the same work on PostgreSQL now, when memcached exists.

 Thing is, if you only have one table (say, a sessions table) which you
 don't want logged, you don't necessarily want to fire up a 2nd software
 application just for that.  Plus, recent testing seems to show that with
 no logging, memcached isn't really faster than PG.

sorry, I was thinking of something else. Not only WAL causes significant
overhead. You need a little more memory and much more processing time. With
very fast operations, the bottleneck will be interprocess communication
- but it doesn't mean pg isn't slower than memcached. I repeat
it again - there is no universal tool for all tasks.


 Also, like for asynch_commit, this is something where users are
 currently turning off fsync.  Any option where we can present users with
 controlled, predictable data loss instead of random corruption is a good
 one.


it isn't that simple. What about statistics? These are stored in system tables.

 Much more important is smarter cache control than we have now -
 maybe with priorities for some tables and some operations
 (applications) - sometimes we don't need to use the cache for extra large
 scans.

 Well, that would be good *too*.  You working on it?  ;-)


No - it's just that I know about the possible problems with memory management.

 --
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] requested shared memory size overflows size_t

2010-06-24 Thread Craig James

Can anyone tell me what's going on here?  I hope this doesn't mean my system 
tables are corrupt...

Thanks,
Craig


select relname, pg_relation_size(relname) from pg_class
where pg_get_userbyid(relowner) = 'emol_warehouse_1'
and relname not like 'pg_%'
order by pg_relation_size(relname) desc;
ERROR:  relation "rownum_temp" does not exist

emol_warehouse_1= select relname from pg_class where relname = 'rownum_temp';
   relname
-------------
 rownum_temp
(1 row)

emol_warehouse_1= \d rownum_temp
Did not find any relation named "rownum_temp".
emol_warehouse_1= create table rownum_temp(i int);
CREATE TABLE
emol_warehouse_1= drop table rownum_temp;
DROP TABLE
emol_warehouse_1= select relname, pg_relation_size(relname) from pg_class
where pg_get_userbyid(relowner) = 'emol_warehouse_1'
and relname not like 'pg_%'
order by pg_relation_size(relname) desc;
ERROR:  relation "rownum_temp" does not exist

emol_warehouse_1= select relname, pg_relation_size(relname) from pg_class;
ERROR:  relation "tables" does not exist





--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] requested shared memory size overflows size_t

2010-06-24 Thread Alvaro Herrera
Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:

 select relname, pg_relation_size(relname) from pg_class
  where pg_get_userbyid(relowner) = 'emol_warehouse_1'
  and relname not like 'pg_%'
  order by pg_relation_size(relname) desc;
 ERROR:  relation "rownum_temp" does not exist
 
 emol_warehouse_1= select relname from pg_class where relname = 'rownum_temp';
 relname
 -------------
   rownum_temp
 (1 row)

What's the full row?  I'd just add a WHERE relkind = 'r' to the above
query anyway.
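
That is, the existing query with the filter added:

select relname, pg_relation_size(relname) from pg_class
 where pg_get_userbyid(relowner) = 'emol_warehouse_1'
   and relkind = 'r'
   and relname not like 'pg_%'
 order by pg_relation_size(relname) desc;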

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] System tables screwed up? (WAS requested shared memory size overflows size_t)

2010-06-24 Thread Craig James

On 6/24/10 4:19 PM, Alvaro Herrera wrote:

Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:


select relname, pg_relation_size(relname) from pg_class
  where pg_get_userbyid(relowner) = 'emol_warehouse_1'
  and relname not like 'pg_%'
  order by pg_relation_size(relname) desc;
ERROR:  relation "rownum_temp" does not exist

emol_warehouse_1=  select relname from pg_class where relname = 'rownum_temp';
 relname
-------------
   rownum_temp
(1 row)


What's the full row?  I'd just add a WHERE relkind = 'r' to the above
query anyway.


Thanks, in fact that works.  But my concern is that these are system tables and 
system functions and yet they seem to be confused.  I've used this query dozens 
of times and never seen this behavior before.  It makes me really nervous...

Craig

P.S. Sorry I got the Subject wrong the first time by hitting the REPLY key 
mindlessly, I've changed it now.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] System tables screwed up? (WAS requested shared memory size overflows size_t)

2010-06-24 Thread Alvaro Herrera
Excerpts from Craig James's message of jue jun 24 19:24:44 -0400 2010:
 On 6/24/10 4:19 PM, Alvaro Herrera wrote:
  Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:
 
  select relname, pg_relation_size(relname) from pg_class
where pg_get_userbyid(relowner) = 'emol_warehouse_1'
and relname not like 'pg_%'
order by pg_relation_size(relname) desc;
  ERROR:  relation "rownum_temp" does not exist
 
  emol_warehouse_1=  select relname from pg_class where relname = 
  'rownum_temp';
   relname
  -------------
 rownum_temp
  (1 row)
 
  What's the full row?  I'd just add a WHERE relkind = 'r' to the above
  query anyway.
 
 Thanks, in fact that works.  But my concern is that these are system tables 
 and system functions and yet they seem to be confused.  I've used this query 
 dozens of times and never seen this behavior before.  It makes me really 
 nervous...

I think you're being bitten by lack of schema qualification.  Perhaps
you ought to pass pg_class.oid to pg_relation_size instead of relname.
What did you do to make pg_relation_size work on type name?

Why is this a -performance question anyway?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Occasional giant spikes in CPU load

2010-06-24 Thread Craig James

I'm reviving this question because I never figured it out.  To summarize: At random 
intervals anywhere from a few times per hour to once or twice a day, we see a huge spike 
in CPU load that essentially brings the system to a halt for up to a minute or two.  
Previous answers focused on "what is it doing," i.e. "is it really Postgres or 
something else?"

Now the question has narrowed down to this: what could trigger EVERY postgres backend to 
do something at the same time?  See the attached output from "top -b", which 
shows what is happening during one of the CPU spikes.

A little background about our system.  We have roughly 100 FastCGI clients 
connected at all times that are called on to generate images from data in the 
database.  Even though there are a lot of these, they don't do much.  They sit 
there most of the time, then they spew out a couple dozen GIF images in about 
one second as a user gets a new page of data.  Each GIF image requires fetching 
a single row using a single indexed column, so it's a trivial amount of work for 
Postgres.

We also have the "heavy lift" application that does the search.  Typically one 
or two of these is running at a time, and takes from a fraction of a second to a few 
minutes to complete.  In this particular instance, immediately before this spike, the CPU 
load was only at about 10% -- a couple users poking around with easy queries.

So what is it that will cause every single Postgres backend to come to life at 
the same moment, when there's no real load on the server?  Maybe if a backend 
crashes?  Some other problem?

There's nothing in the serverlog.

Thanks,
Craig


top - 12:15:09 up 81 days, 21:18,  4 users,  load average: 0.38, 0.38, 0.73
Tasks: 374 total,  95 running, 279 sleeping,   0 stopped,   0 zombie
Cpu(s): 62.5%us,  2.2%sy,  0.0%ni, 34.9%id,  0.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8194800k total,  7948928k used,   245872k free,   36k buffers
Swap:  2031608k total,   161136k used,  1870472k free,  7129744k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
22120 postgres  20   0 2514m  17m  13m R   11  0.2   0:01.02 postmaster
18497 postgres  20   0 2514m  11m 8832 R    6  0.1   0:00.62 postmaster
22962 postgres  20   0 2514m  12m 9548 R    6  0.2   0:00.22 postmaster
24002 postgres  20   0 2514m  11m 8804 R    6  0.1   0:00.15 postmaster
25900 postgres  20   0 2514m  11m 8824 R    6  0.1   0:00.55 postmaster
 8941 postgres  20   0 2324m 6172 4676 R    5  0.1   0:00.32 postmaster
10622 postgres  20   0 2514m  12m 9444 R    5  0.2   0:00.79 postmaster
14021 postgres  20   0 2514m  11m 8548 R    5  0.1   0:00.28 postmaster
14075 postgres  20   0 2514m  11m 8672 R    5  0.1   0:00.27 postmaster
14423 postgres  20   0 2514m  11m 8572 R    5  0.1   0:00.29 postmaster
18896 postgres  20   0 2324m 5644 4204 R    5  0.1   0:00.11 postmaster
18897 postgres  20   0 2514m  12m 9800 R    5  0.2   0:00.27 postmaster
18928 postgres  20   0 2514m  11m 8792 R    5  0.1   0:00.18 postmaster
18973 postgres  20   0 2514m  11m 8792 R    5  0.1   0:00.70 postmaster
22049 postgres  20   0 2514m  17m  14m R    5  0.2   0:01.11 postmaster
22050 postgres  20   0 2514m  16m  13m R    5  0.2   0:01.06 postmaster
22843 postgres  20   0 2514m  12m 9328 R    5  0.2   0:00.20 postmaster
24202 postgres  20   0 2324m 5560 4120 R    5  0.1   0:00.07 postmaster
24388 postgres  20   0 2514m  12m 9380 R    5  0.2   0:00.16 postmaster
25903 postgres  20   0 2514m  11m 8828 R    5  0.1   0:00.55 postmaster
28362 postgres  20   0 2514m  11m 8952 R    5  0.1   0:00.48 postmaster
 5667 postgres  20   0 2324m 6752 5588 R    4  0.1   0:08.93 postmaster
 7531 postgres  20   0 2324m 5452 4008 R    4  0.1   0:03.21 postmaster
 9219 postgres  20   0 2514m  11m 8476 R    4  0.1   0:00.89 postmaster
 9820 postgres  20   0 2514m  12m 9.9m R    4  0.2   0:00.92 postmaster
10050 postgres  20   0 2324m 6172 4676 R    4  0.1   0:00.31 postmaster
10645 postgres  20   0 2514m  12m 9512 R    4  0.2   0:00.72 postmaster
14582 postgres  20   0 2514m  25m  21m R    4  0.3   0:02.10 postmaster
18502 postgres  20   0 2514m  11m 9040 R    4  0.1   0:00.64 postmaster
18972 postgres  20   0 2514m  11m 8792 R    4  0.1   0:00.76 postmaster
18975 postgres  20   0 2514m  11m 8904 R    4  0.1   0:00.63 postmaster
19496 postgres  20   0 2514m  14m  11m R    4  0.2   0:00.44 postmaster
22121 postgres  20   0 2514m  16m  13m R    4  0.2   0:00.81 postmaster
24340 postgres  20   0 2514m  12m 9424 R    4  0.2   0:00.15 postmaster
24483 postgres  20   0 2324m 6008 4536 R    4  0.1   0:00.21 postmaster
25668 postgres  20   0 2514m  16m  13m R    4  0.2   0:00.91 postmaster
26382 postgres  20   0 2514m  11m 8996 R    4  0.1   0:00.50 postmaster
28363 postgres  20   0 2514m  11m 8908 R    4  0.1   0:00.34 postmaster
 9754 postgres  20   0 2514m  11m 8752 R    3  0.1   0:00.29 postmaster
16113 postgres  20   0 2514m  17m  14m R    3  0.2   0:01.10 postmaster
18498 postgres  20   0 2514m  11m 8844 R

Re: [PERFORM] Occasional giant spikes in CPU load

2010-06-24 Thread Joshua D. Drake
On Thu, 2010-06-24 at 17:50 -0700, Craig James wrote:
 I'm reviving this question because I never figured it out.  To summarize: At 
 random intervals anywhere from a few times per hour to once or twice a day, 
 we see a huge spike in CPU load that essentially brings the system to a halt 
 for up to a minute or two.  Previous answers focused on "what is it doing," 
 i.e. "is it really Postgres or something else?"
 
 Now the question has narrowed down to this: what could trigger EVERY postgres 
 backend to do something at the same time?  See the attached output from "top 
 -b", which shows what is happening during one of the CPU spikes.

A checkpoint causing I/O wait, perhaps.

What does sar say about these times?
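
Checkpoint activity can also be sampled from inside the database (a
sketch; the pg_stat_bgwriter view exists in 8.3 and later):

SELECT now(), checkpoints_timed, checkpoints_req, buffers_checkpoint
  FROM pg_stat_bgwriter;

Run it before and after a spike; if checkpoints_req or
buffers_checkpoint jumps, the spike lines up with a checkpoint.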

Joshua D. Drake


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Occasional giant spikes in CPU load

2010-06-24 Thread Greg Smith

Craig James wrote:
Now the question has narrowed down to this: what could trigger EVERY 
postgres backend to do something at the same time?  See the attached 
output from "top -b", which shows what is happening during one of the 
CPU spikes.


By the way:  you probably want "top -b -c", which will actually show you 
what each client is doing by inspecting what it has set its command line to.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] requested shared memory size overflows size_t

2010-06-24 Thread Robert Haas
On Thu, Jun 24, 2010 at 7:19 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Craig James's message of jue jun 24 19:03:00 -0400 2010:

 select relname, pg_relation_size(relname) from pg_class
          where pg_get_userbyid(relowner) = 'emol_warehouse_1'
          and relname not like 'pg_%'
          order by pg_relation_size(relname) desc;
 ERROR:  relation "rownum_temp" does not exist

 emol_warehouse_1= select relname from pg_class where relname = 
 'rownum_temp';
         relname
 -------------
   rownum_temp
 (1 row)

 What's the full row?  I'd just add a WHERE relkind = 'r' to the above
 query anyway.

Yeah - also, it would probably be good to call pg_relation_size on
pg_class.oid rather than pg_class.relname, to avoid any chance of
confusion over which objects are in which schema.
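
Something along these lines (a sketch of the same query using the oid):

SELECT c.relname, pg_relation_size(c.oid)
  FROM pg_class c
 WHERE pg_get_userbyid(c.relowner) = 'emol_warehouse_1'
   AND c.relkind = 'r'
 ORDER BY pg_relation_size(c.oid) DESC;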

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Occasional giant spikes in CPU load

2010-06-24 Thread Tom Lane
Craig James craig_ja...@emolecules.com writes:
 So what is it that will cause every single Postgres backend to come to life 
 at the same moment, when there's no real load on the server?  Maybe if a 
 backend crashes?  Some other problem?

sinval queue overflow comes to mind ... although that really shouldn't
happen if there's no real load on the server.  What PG version is
this?  Also, the pg_stat_activity view contents when this happens would
probably be more useful to look at than top output.
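
For example (assuming a pre-9.2 server, where the columns are still
procpid and current_query):

SELECT procpid, usename, waiting, current_query
  FROM pg_stat_activity
 ORDER BY procpid;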

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance