date:20111027

Ben Chobot be...@silentmedia.com writes:
 Today I tried to restore a 70GB database with the standard pg_dump -h 
 old_server  | psql -h new_server  method. I had 100GB set aside for 
 WAL files, which I figured surely would be enough, because all of the data, 
 including indices, is only 70GB. So I was a bit surprised when the restore 
 hung mis-way because my pg_xlogs directory ran out of space. 

 Is it expected that WAL files are less dense than data files?

Yes, that's not particularly surprising ... but how come they weren't
getting recycled?  Perhaps you had configured WAL archiving but it was
broken?

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] WAL file size vs. data file size

2011-10-27 Thread Ben Chobot

On Oct 27, 2011, at 8:44 AM, Tom Lane wrote:

 Ben Chobot be...@silentmedia.com writes:
 Today I tried to restore a 70GB database with the standard pg_dump -h 
 old_server ∑ | psql -h new_server ∑ method. I had 100GB set aside for 
 WAL files, which I figured surely would be enough, because all of the data, 
 including indices, is only 70GB. So I was a bit surprised when the restore 
 hung mis-way because my pg_xlogs directory ran out of space. 
 
 Is it expected that WAL files are less dense than data files?
 
 Yes, that's not particularly surprising ... but how come they weren't
 getting recycled?  Perhaps you had configured WAL archiving but it was
 broken?

It's because I'm archiving wal files into Amazon's S3, which is slooow. 
PG is recycling as fast as it can, but when a few MB of COPY rows seem to 
ballon up to a few hundred MB of WAL files, it has a lot to archive before it 
can recycle. It'll be fine for steady state but it looks like it's just going 
to be a waste for this initial load.

What's the expected density ratio? I was always under the impression it would 
be about 1:1 when doing things like COPY, and have never seen anything to the 
contrary. 
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] matching against a list of regexp?

2011-10-27 Thread Gauthier, Dave

Hi:

I need to be able to select all records with a col value that matches any of a 
list of regexp.  Sort of like...

select a,b,c from foo where d ~ ('^xyz','blah','shrug$');

Does anyone know the right syntax for this?

Thanks!

Re: [GENERAL] matching against a list of regexp?

2011-10-27 Thread Richard Broersma

On Thu, Oct 27, 2011 at 9:18 AM, Gauthier, Dave dave.gauth...@intel.com wrote:
 I need to be able to select all records with a col value that matches any of
 a list of regexp.  Sort of like...

 select a,b,c from foo where d ~ ('^xyz','blah','shrug$');

WHERE d ~ '^xyz|blah|shrug$'


-- 
Regards,
Richard Broersma Jr.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Are pg_xlog/* fiels necessary for PITR?

2011-10-27 Thread Venkat Balaji

On Thu, Oct 27, 2011 at 7:57 PM, rihad ri...@mail.ru wrote:

 Hi, I'm backing up the entire server directory from time to time. pg_xlog/
 directory containing WAL files is pretty heavy (wal_level=archive). Can I
 exclude it from the regular tar archive?


The best would be to perform pg_switch_xlog() and take a backup excluding
pg_xlog.

To recover the last moment TXNs, you might need pg_xlog (depends on when you
would be recovering). pg_switch_xlog() will reduce the dependency on pg_xlog
files to a greater extent.


 #!/bin/sh

 renice 20 $$ 2/dev/null
 pgsql -U pgsql -q -c CHECKPOINT postgres # speed up pg_start_backup()


pg_start_backup() performs a checkpoint and ensures that all the data till
that particular checkpoint and TXN id will be backed up (or marked as needed
for data consistency while restoring and recovering).


 pgsql -U pgsql -q -c select pg_start_backup('sol') postgres
 tar -cjf - /db 2/dev/null | ssh -q -i ~pgsql/.ssh/id_rsa -p 2022 -c
 blowfish dbarchive@10.0.0.1 'cat  db.tbz'
 pgsql -U pgsql -q -c select pg_stop_backup() postgres
 sleep 60 #wait for new WAL backups to appear
 echo 'ssh -q dbarchive@10.0.0.1 ./post-backup.sh' | su -m pgsql


 I want to change tar invocation to be: tar -cjf --exclude 'db/pg_xlog/*'
 ...

 Will there be enough data in case of recovery? (May God forbid... )))


But, all the WAL Archives between backup start time and end time must be
backed up. They are needed at any cost for the database to be consistent and
the recovery to be smooth.

Recovering to any point-in-time purely depends on your backup strategy.

Thanks
VB

Re: [GENERAL] PostGIS in a commercial project

2011-10-27 Thread Martijn van Oosterhout

On Tue, Oct 25, 2011 at 01:41:17PM +0200, Thomas Kellerer wrote:
 Thank you very much for the detailed explanation.
 
 I always have a hard time to understand the GPL especially the
 dividing line between using, linkin and creating a derived work.

That because the GPL does not get to define those terms. They are
defined by copyright law, the licence does not get to choose what is a
derived work and what isn't.  The FSF is of the opinion that anything
linked to a GPL library is a derived work, but that isn't true in all
cases (libedit vs libreadline is one of those borderline cases).

I note in the OPs case they are relying on the customer to install
PostGIS.  The GPL only applies to *redistribution* not usage.  So if
you're not supplying your customers with PostGIS then the fact that
it's GPL seems completely irrelevent.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature

[GENERAL] Server hitting 100% CPU usage, system comes to a crawl.


Hi all, need some help/clues on tracking down a performance issue.

PostgreSQL version: 8.3.11

I've got a system that has 32 cores and 128 gigs of ram. We have 
connection pooling set up, with about 100 - 200 persistent connections 
open to the database. Our applications then use these connections to 
query the database constantly, but when a connection isn't currently 
executing a query, it's IDLE. On average, at any given time, there are 
3 - 6 connections that are actually executing a query, while the rest 
are IDLE.


About once a day, queries that normally take just a few seconds slow way 
down, and start to pile up, to the point where instead of just having 
3-6 queries running at any given time, we get 100 - 200. The whole 
system comes to a crawl, and looking at top, the CPU usage is 99%.


Looking at top, I see no SWAP usage, very little IOWait, and there are a 
large number of postmaster processes at 100% cpu usage (makes sense, at 
this point there are 150 or so queries currently executing on the database).


 Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,  0.3%si,  
0.2%st

Mem:  134217728k total, 131229972k used,  2987756k free,   462444k buffers
Swap:  8388600k total,  296k used,  8388304k free, 119029580k cached


In the past, we noticed that autovacuum was hitting some large tables at 
the same time this happened, so we turned autovacuum off to see if that 
was the issue, and it still happened without any vacuums running.


We also ruled out checkpoints being the cause.

I'm currently digging through some statistics I've been gathering to see 
if traffic increased at all, or remained the same when the slowdown 
occurred. I'm also digging through the logs from the postgresql cluster 
(I increased verbosity yesterday), looking for any clues. Any 
suggestions or clues on where to look for this to see what can be 
causing a slowdown like this would be greatly appreciated.


Thanks,
- Brian F

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-10-27 Thread John R Pierce


On 10/27/11 11:39 AM, Brian Fehrle wrote:


I've got a system that has 32 cores and 128 gigs of ram. We have 
connection pooling set up, with about 100 - 200 persistent connections 
open to the database. Our applications then use these connections to 
query the database constantly, but when a connection isn't currently 
executing a query, it's IDLE. On average, at any given time, there 
are 3 - 6 connections that are actually executing a query, while the 
rest are IDLE. 



thats not a very effective use of pooling.   the pooling model, you'd 
have a connection pool sufficient actual database connections to satisfy 
your concurrency requirements, and your apps would grab a connection 
from the pool, do a transaction, then release the connection back to the 
pool.


now, I don't know that this has anything to do with your performance 
problem, I'm just pointing out this anomaly.  a pool doesn't do much 
good if the clients grab a connection and just sit on it.



--
john r pierceN 37, W 122
santa cruz ca mid-left coast


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

On Thu, Oct 27, 2011 at 12:39 PM, Brian Fehrle
bri...@consistentstate.com wrote:
 Looking at top, I see no SWAP usage, very little IOWait, and there are a
 large number of postmaster processes at 100% cpu usage (makes sense, at this
 point there are 150 or so queries currently executing on the database).

  Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
 Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,  0.3%si,
  0.2%st
 Mem:  134217728k total, 131229972k used,  2987756k free,   462444k buffers
 Swap:  8388600k total,      296k used,  8388304k free, 119029580k cached

OK, a few points.  1: You've got a zombie process.  Find out what's
causing that, it could be a trigger of some type for this behaviour.
2: You're 92% sys.  That's bad.  It means the OS is chewing up 92% of
your 32 cores doing something.  what tasks are at the top of the list
in top?

Try running vmstat 10 for a a minute or so then look at cs and int
columns.  If cs or int is well over 100k there could be an issue with
thrashing, where your app is making some change to the db that
requires all backends to be awoken at once and the machine just falls
over under the load.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

On Thu, Oct 27, 2011 at 1:48 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 OK, a few points.  1: You've got a zombie process.  Find out what's

To expand on the zombie thing, it's quite possible that you're
managing to make a pg backend process crashout, which would cause the
db to restart midday, which is bad (TM) since that dumps all of shared
buffers and forces all clients to reconnect.  So look through the
system logs for segmentation faults, etc.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

On Thu, Oct 27, 2011 at 1:52 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Thu, Oct 27, 2011 at 1:48 PM, Scott Marlowe scott.marl...@gmail.com 
 wrote:
 OK, a few points.  1: You've got a zombie process.  Find out what's

 To expand on the zombie thing, it's quite possible that you're
 managing to make a pg backend process crashout, which would cause the
 db to restart midday, which is bad (TM) since that dumps all of shared
 buffers and forces all clients to reconnect.  So look through the
 system logs for segmentation faults, etc.

One last thing, you should upgrade to the latest 8.3 version to see if
that helps.  There was a bug fix around 8.3.13 or so that stopped
postgresql from restarting due to a simple data corruption issue that
should have only resulted in an error message not a restart of the db.
 I know, cause I found it. :)  Thanks to the pg devs for fixing it.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.


On 10/27/2011 02:50 PM, Tom Lane wrote:

Brian Fehrlebri...@consistentstate.com  writes:

Hi all, need some help/clues on tracking down a performance issue.
PostgreSQL version: 8.3.11
I've got a system that has 32 cores and 128 gigs of ram. We have
connection pooling set up, with about 100 - 200 persistent connections
open to the database. Our applications then use these connections to
query the database constantly, but when a connection isn't currently
executing a query, it'sIDLE. On average, at any given time, there are
3 - 6 connections that are actually executing a query, while the rest
areIDLE.
About once a day, queries that normally take just a few seconds slow way
down, and start to pile up, to the point where instead of just having
3-6 queries running at any given time, we get 100 - 200. The whole
system comes to a crawl, and looking at top, the CPU usage is 99%.

This is jumping to a conclusion based on insufficient data, but what you
describe sounds a bit like the sinval queue contention problems that we
fixed in 8.4.  Some prior reports of that:
http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php
http://archives.postgresql.org/pgsql-performance/2010-06/msg00452.php

If your symptoms match those, the best fix would be to update to 8.4.x
or later, but a stopgap solution would be to cut down on the number of
idle backends.

regards, tom lane
That sounds somewhat close to the same issue I am seeing. Main 
differences being that my spike lasts for much longer than a few 
minutes, and can only be resolved when the cluster is restarted. Also, 
that second link shows TOP where much of the CPU is via the 'user', 
rather than the 'sys' like mine.


Is there anything I can look at more to get more info on this 'sinval 
que contention problem'?


Also, having my cpu usage high in 'sys' rather than 'us', could that be 
a red flag? Or is that normal?


- Brian F

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-10-27 Thread Scott Mead

On Thu, Oct 27, 2011 at 2:39 PM, Brian Fehrle bri...@consistentstate.comwrote:

Hi all, need some help/clues on tracking down a performance issue.

PostgreSQL version: 8.3.11

I've got a system that has 32 cores and 128 gigs of ram. We have connection
pooling set up, with about 100 - 200 persistent connections open to the
database. Our applications then use these connections to query the database
constantly, but when a connection isn't currently executing a query, it's
IDLE. On average, at any given time, there are 3 - 6 connections that are
actually executing a query, while the rest are IDLE.

Remember, when you read pg_stat_activity, it is showing you query activity
from that exact specific moment in time. Just because it looks like only
3-6 connections are executing, doesn't mean that 200 aren't actually
executing .1ms statements. With such a beefy box, I would see if you can
examine any stats from your connection pooler to find out how many
connections are actually getting used.

About once a day, queries that normally take just a few seconds slow way
down, and start to pile up, to the point where instead of just having 3-6
queries running at any given time, we get 100 - 200. The whole system comes
to a crawl, and looking at top, the CPU usage is 99%.

Looking at top, I see no SWAP usage, very little IOWait, and there are a
large number of postmaster processes at 100% cpu usage (makes sense, at this
point there are 150 or so queries currently executing on the database).

Tasks: 713 total, 44 running, 668 sleeping, 0 stopped, 1 zombie
Cpu(s): 4.4%us, 92.0%sy, 0.0%ni, 3.0%id, 0.0%wa, 0.0%hi, 0.3%si,
0.2%st
Mem: 134217728k total, 131229972k used, 2987756k free, 462444k buffers
Swap: 8388600k total, 296k used, 8388304k free, 119029580k cached

In the past, we noticed that autovacuum was hitting some large tables at
the same time this happened, so we turned autovacuum off to see if that was
the issue, and it still happened without any vacuums running.

That was my next question :)

We also ruled out checkpoints being the cause.

How exactly did you rule this out? Just because a checkpoint is over
doesn't mean that it hasn't had a negative effect on the OS cache. If
you're stuck going to disk, that could be hurting you (that being said, you
do point to a low I/O wait above, so you're probably correct in ruling this
out).

I'm currently digging through some statistics I've been gathering to see if
traffic increased at all, or remained the same when the slowdown occurred.
I'm also digging through the logs from the postgresql cluster (I increased
verbosity yesterday), looking for any clues. Any suggestions or clues on
where to look for this to see what can be causing a slowdown like this would
be greatly appreciated.

Are you capturing table-level stats from pg_stat_user_[tables | indexes]?
Just because a server doesn't look busy doesn't mean that you're not doing
1000 index scans per second returning 1000 tuples each time.

--Scott

Thanks,
- Brian F

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/**mailpref/pgsql-generalhttp://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Custom data type in C with one fixed and one variable attribute

2011-10-27 Thread Adrian Schreyer

Hi,

I am trying to create a custom data type in C that has a fixed size
and a variable size attribute - is that actually possible? The
documentation mentions only one or the other but a struct in the
pg_trgm extension (TRGM) seems to have that.

The data type I have is

typedef struct {
int4   length;
uint32 foo;
char   bar[1];
} oefp;

The external representation of that data type would be (1,
'hexadecimal string here'), for example.

This is my _in function to parse the external cstring.

PG_FUNCTION_INFO_V1(mydatatype_in);
Datum mydatatype_in(PG_FUNCTION_ARGS)
{
char   *rawcstring = PG_GETARG_CSTRING(0);

uint32  foo;
char   *buffer = (char *) palloc(strlen(rawcstring));

if (sscanf(rawcstring, (%u,%[^)]), foo, buffer) != 2)
{
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
 errmsg(Invalid input syntax: \%s\, rawcstring)));
}

mydatatype *dt = (mydatatype*) palloc(VARHDRSZ + sizeof(uint32) +
strlen(buffer));

SET_VARSIZE(dt, VARHDRSZ + sizeof(uint32) + strlen(buffer));
memcpy(dt-bar, buffer, strlen(buffer));
dt-foo = foo;

PG_RETURN_POINTER(dt);
}

The problem is however that dt-bar contains not only the input string
but random characters or other garbage as well, so something must go
wrong at the end of the function. Any thoughts what it could be?

Cheers,

Adrian

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.


On 10/27/2011 02:27 PM, Scott Mead wrote:



On Thu, Oct 27, 2011 at 2:39 PM, Brian Fehrle 
bri...@consistentstate.com mailto:bri...@consistentstate.com wrote:


Hi all, need some help/clues on tracking down a performance issue.

PostgreSQL version: 8.3.11

I've got a system that has 32 cores and 128 gigs of ram. We have
connection pooling set up, with about 100 - 200 persistent
connections open to the database. Our applications then use these
connections to query the database constantly, but when a
connection isn't currently executing a query, it's IDLE. On
average, at any given time, there are 3 - 6 connections that are
actually executing a query, while the rest are IDLE.


Remember, when you read pg_stat_activity, it is showing you query 
activity from that exact specific moment in time.  Just because it 
looks like only 3-6 connections are executing, doesn't mean that 200 
aren't actually executing  .1ms statements.  With such a beefy box, I 
would see if you can examine any stats from your connection pooler to 
find out how many connections are actually getting used.


Correct, we're getting a few hundred transactions per second, but under 
normal operation, polling pg_stat_activity will show the average of 3 - 
6 queries that were running at that moment, and those queries run for an 
average of 5 - 7 seconds. So my belief is that something happens to the 
system where either a) We get a ton more queries than normal from the 
application (currently hunting down data to support this), or b) the 
overall speed of the system slows down so that all queries increase in 
time so much that polling pg_stat_activity lets me actually see them.




About once a day, queries that normally take just a few seconds
slow way down, and start to pile up, to the point where instead of
just having 3-6 queries running at any given time, we get 100 -
200. The whole system comes to a crawl, and looking at top, the
CPU usage is 99%.

Looking at top, I see no SWAP usage, very little IOWait, and there
are a large number of postmaster processes at 100% cpu usage
(makes sense, at this point there are 150 or so queries currently
executing on the database).

 Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,
 0.3%si,  0.2%st
Mem:  134217728k total, 131229972k used,  2987756k free,   462444k
buffers
Swap:  8388600k total,  296k used,  8388304k free, 119029580k
cached


In the past, we noticed that autovacuum was hitting some large
tables at the same time this happened, so we turned autovacuum off
to see if that was the issue, and it still happened without any
vacuums running.

That was my next question :)


We also ruled out checkpoints being the cause.

How exactly did you rule this out?  Just because a checkpoint is over 
doesn't mean that it hasn't had a negative effect on the OS cache.  If 
you're stuck going to disk, that could be hurting you (that being 
said, you do point to a low I/O wait above, so you're probably correct 
in ruling this out).


Checkpoint settings were set to the default per install. 5 minute 
timeout, 0.5 completion target, and 30s warning. Looking at the logs, we 
were getting a checkpoint every 5 minutes on the dot.


I looked at the data in pg_stat_database and noticed that buffers 
written by checkpoints are near 4X that of the background writer. So I 
implemented some changes to get more to be written by the background 
writer, including increasing the checkpoint timeout to 30 minutes, and 
setting the frequency of the bgwriter wait time from 200ms to 50ms.


checkpoints now happen 30 mins apart on the dot, and there was not a 
checkpoint happening the last time this issue of major slowdown occured.




I'm currently digging through some statistics I've been gathering
to see if traffic increased at all, or remained the same when the
slowdown occurred. I'm also digging through the logs from the
postgresql cluster (I increased verbosity yesterday), looking for
any clues. Any suggestions or clues on where to look for this to
see what can be causing a slowdown like this would be greatly
appreciated.

Are you capturing table-level stats from pg_stat_user_[tables | 
indexes]?  Just because a server doesn't look busy doesn't mean that 
you're not doing 1000 index scans per second returning 1000 tuples 
each time.



I am not grabbing any of those at the moment, I'll look into those.

- Brian F

--Scott

Thanks,
   - Brian F

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org

mailto:pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Getting X coordinate from a point(lseg), btw i read the man page about points.

Ing.Edmundo.Robles.Lopez erob...@sensacd.com.mx writes:
 Hi in the main page about geometric operations said:
 It is possible to access the two component numbers of a  point  as though it 
 were an array with indices 0 and 1. For example, if  t.p  is a  point  column 
 then  SELECT p[0] FROM t  retrieves the X coordinate and  UPDATE t SET p[1] = 
 ...  changes the Y coordinate. In the same way, a value of type  box  or  
 lseg  can be treated as an array of two  point  values.

 [ So how to get p2.x from an lseg value? ]

   select  info[0] from table limit 1;
   (647753.125,2825633.75)

Right, that gets you a point.

   i still want to get647753.125, so i did:
 select  info[0][0] from table limit 1;

Close, but that notation only works for a 2-dimensional array, which an
lseg is not.  What you need is

regression=# select (info[0])[0] from table;
 f1 

 647753.125
(1 row)

The parenthesized object is a point, and then an entirely separate
subscripting operation has to be applied to it to get its X coordinate.

 then i did:
   select point(info[0])[0]  from table limit 1;

Well, that's unnecessary since info[0] is already a point, but the
syntactic problem is again that you have to parenthesize the thing that
the second subscript is being applied to:

select (point(info[0]))[0]  from table limit 1;

You need parentheses any time you're going to apply subscripting or
field selection to something that isn't a simple variable reference.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

Brian Fehrle bri...@consistentstate.com writes:
 Hi all, need some help/clues on tracking down a performance issue.
 PostgreSQL version: 8.3.11

 I've got a system that has 32 cores and 128 gigs of ram. We have 
 connection pooling set up, with about 100 - 200 persistent connections 
 open to the database. Our applications then use these connections to 
 query the database constantly, but when a connection isn't currently 
 executing a query, it's IDLE. On average, at any given time, there are 
 3 - 6 connections that are actually executing a query, while the rest 
 are IDLE.

 About once a day, queries that normally take just a few seconds slow way 
 down, and start to pile up, to the point where instead of just having 
 3-6 queries running at any given time, we get 100 - 200. The whole 
 system comes to a crawl, and looking at top, the CPU usage is 99%.

This is jumping to a conclusion based on insufficient data, but what you
describe sounds a bit like the sinval queue contention problems that we
fixed in 8.4.  Some prior reports of that:
http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php
http://archives.postgresql.org/pgsql-performance/2010-06/msg00452.php

If your symptoms match those, the best fix would be to update to 8.4.x
or later, but a stopgap solution would be to cut down on the number of
idle backends.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

Also, I'm not having any issue with the database restarting itself, 
simply becoming unresponsive / slow to respond, to the point where just 
sshing to the box takes about 30 seconds if not longer. Performing a 
pg_ctl restart on the cluster resolves the issue.


I looked through the logs for any segmentation faults, none found. In 
fact the only thing in my log that seems to be 'bad' are the following.


Oct 27 08:53:18 snip postgres[17517]: [28932839-1] 
user=snip,db=snip ERROR:  deadlock detected
Oct 27 11:49:22 snip postgres[608]: [19-1] user=snip,db=snip 
ERROR:  could not serialize access due to concurrent update


I don't believe these occurred too close to the slowdown.

- Brian F

On 10/27/2011 02:09 PM, Brian Fehrle wrote:

On 10/27/2011 01:48 PM, Scott Marlowe wrote:

On Thu, Oct 27, 2011 at 12:39 PM, Brian Fehrle
bri...@consistentstate.com  wrote:
Looking at top, I see no SWAP usage, very little IOWait, and there 
are a
large number of postmaster processes at 100% cpu usage (makes sense, 
at this

point there are 150 or so queries currently executing on the database).

  Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,  0.3%si,
  0.2%st
Mem:  134217728k total, 131229972k used,  2987756k free,   462444k 
buffers
Swap:  8388600k total,  296k used,  8388304k free, 119029580k 
cached

OK, a few points.  1: You've got a zombie process.  Find out what's
causing that, it could be a trigger of some type for this behaviour.
2: You're 92% sys.  That's bad.  It means the OS is chewing up 92% of
your 32 cores doing something.  what tasks are at the top of the list
in top?

Out of the top 50 processes in top, 48 of them are postmasters, one is 
syslog, and one is psql. Each of the postmasters have a high %CPU, the 
top ones being 80% and higher, the rest being anywhere between 30% - 
60%. Would postmaster 'queries' that are running attribute to the sys 
CPU usage, or should they be under the 'us' CPU usage?




Try running vmstat 10 for a a minute or so then look at cs and int
columns.  If cs or int is well over 100k there could be an issue with
thrashing, where your app is making some change to the db that
requires all backends to be awoken at once and the machine just falls
over under the load.


We've restarted the postgresql cluster, so the issue is not happening 
at this moment. but running a vmstat 10 had my 'cs' average at 3K and 
'in' averaging around 9.5K.


- Brian F



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] PostgreSQL at LISA in Boston: Dec. 7-8

2011-10-27 Thread Josh Berkus

All,

We are going to be doing a booth at Usenix LISA[1] conference in Boston
from December 7-8.  If you live in Boston and are a PostgreSQL
enthusiast, or if you plan to attend LISA, I want your help!

I need booth volunteers to help me work the booth.  All you need is some
general knowledge of how to use PostgreSQL, enthusiasm, and 3 (or more)
free hours.  We will supply the rest, including flyers, magazines, and a
t-shirt for volunteers.  If you don't already have a pass for the
conference, I will get you one for the exhibit hall.

Or you can attend the full conference if you pay a registration fee; get
$100 off with our community discount, LISA11POSTGRE.

We will also have a BOF[2] on the night of the 7th at the conference
hotel, where I will demo some 9.1 and 9.2 features and talk about the
Postgres project.

Finally, PalominoDB is organizing a Boston PUG (PostgreSQL User Group)
meeting at which I will speak. Details TBD, but expect it at MIT on one
of the nights of the 5th, 6th, or 8th.  Maybe we can start a regular
BostonPUG!

[1] Large Information Systems Administration:
http://www.usenix.org/events/lisa11/exhibition.html
[2] http://www.usenix.org/events/lisa11/bofs.html#postgres

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.


On 10/27/2011 01:48 PM, Scott Marlowe wrote:

On Thu, Oct 27, 2011 at 12:39 PM, Brian Fehrle
bri...@consistentstate.com  wrote:

Looking at top, I see no SWAP usage, very little IOWait, and there are a
large number of postmaster processes at 100% cpu usage (makes sense, at this
point there are 150 or so queries currently executing on the database).

  Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,  0.3%si,
  0.2%st
Mem:  134217728k total, 131229972k used,  2987756k free,   462444k buffers
Swap:  8388600k total,  296k used,  8388304k free, 119029580k cached

OK, a few points.  1: You've got a zombie process.  Find out what's
causing that, it could be a trigger of some type for this behaviour.
2: You're 92% sys.  That's bad.  It means the OS is chewing up 92% of
your 32 cores doing something.  what tasks are at the top of the list
in top?

Out of the top 50 processes in top, 48 of them are postmasters, one is 
syslog, and one is psql. Each of the postmasters have a high %CPU, the 
top ones being 80% and higher, the rest being anywhere between 30% - 
60%. Would postmaster 'queries' that are running attribute to the sys 
CPU usage, or should they be under the 'us' CPU usage?




Try running vmstat 10 for a a minute or so then look at cs and int
columns.  If cs or int is well over 100k there could be an issue with
thrashing, where your app is making some change to the db that
requires all backends to be awoken at once and the machine just falls
over under the load.


We've restarted the postgresql cluster, so the issue is not happening at 
this moment. but running a vmstat 10 had my 'cs' average at 3K and 'in' 
averaging around 9.5K.


- Brian F

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Custom data type in C with one fixed and one variable attribute

Adrian Schreyer ams...@cam.ac.uk writes:
 The data type I have is

 typedef struct {
 int4   length;
 uint32 foo;
 char   bar[1];
 } oefp;

Seems reasonable enough.

 mydatatype *dt = (mydatatype*) palloc(VARHDRSZ + sizeof(uint32) +
 strlen(buffer));

 SET_VARSIZE(dt, VARHDRSZ + sizeof(uint32) + strlen(buffer));
 memcpy(dt-bar, buffer, strlen(buffer));
 dt-foo = foo;

Fine, but keep in mind that what you are creating here is a
non-null-terminated string.

 The problem is however that dt-bar contains not only the input string
 but random characters or other garbage as well, so something must go
 wrong at the end of the function. Any thoughts what it could be?

It sounds to me like you are inspecting dt-bar with something that
expects to see a null-terminated string.  You could either fix your
inspection code, or expend one more byte to make the string be
null-terminated as stored.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-10-27 Thread Alan Hodgson

On October 27, 2011 01:09:51 PM Brian Fehrle wrote:
 We've restarted the postgresql cluster, so the issue is not happening at
 this moment. but running a vmstat 10 had my 'cs' average at 3K and 'in'
 averaging around 9.5K.

Random thought, is there any chance the server is physically overheating? I've 
seen CPUs throttle really low when overheating, which can make otherwise 
normal activity seem really slow.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] User feedback requested on temp tables usage for Hot Standby

2011-10-27 Thread Simon Riggs

Some people have asked for the ability to create temp tables on a Hot
Standby server.

I've got a rough implementation plan but it would have some
restrictions, so I would like to check my understanding of the use
case for this feature so I don't waste time implementing something
nobody actually finds useful.

My understanding is that the main use cases for that would be limited
to these two options only:

1. CREATE TEMP TABLE foo AS SELECT 

2. CREATE TEMP TABLE foo (..);
INSERT INTO foo ...

and sometimes a TRUNCATE foo;

In almost all cases people don't run multiple INSERTs, nor do they run
UPDATEs or DELETEs, so the above actions would cover 99% of use cases.

Can anyone give backup to that opinion, or alternate viewpoints?

Thanks,

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] PostGIS in a commercial project

2011-10-27 Thread Chris Travers

On Thu, Oct 27, 2011 at 9:44 AM, Martijn van Oosterhout
klep...@svana.org wrote:

 I note in the OPs case they are relying on the customer to install
 PostGIS.  The GPL only applies to *redistribution* not usage.  So if
 you're not supplying your customers with PostGIS then the fact that
 it's GPL seems completely irrelevent.

Also as a note here, if linking implied derivation, then all software
that ran on Windows would be illegal to distribute without Microsoft's
permission..   Yet at least here in the US, jailbreaking an iPhone
is legal in the opinion of the Copyright Office because it allows fair
use of the device, namely installing apps that Apple hasn't otherwise
authorized.  So I'd generally agree with the assessment above.

Best Wishes,
Chris Travers

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] PostGIS in a commercial project

2011-10-27 Thread Joshua D. Drake



On 10/27/2011 04:24 PM, Chris Travers wrote:


On Thu, Oct 27, 2011 at 9:44 AM, Martijn van Oosterhout
klep...@svana.org  wrote:


I note in the OPs case they are relying on the customer to install
PostGIS.  The GPL only applies to *redistribution* not usage.  So if
you're not supplying your customers with PostGIS then the fact that
it's GPL seems completely irrelevent.


Also as a note here, if linking implied derivation, then all software
that ran on Windows would be illegal to distribute without Microsoft's
permission..   Yet at least here in the US, jailbreaking an iPhone
is legal in the opinion of the Copyright Office because it allows fair
use of the device, namely installing apps that Apple hasn't otherwise
authorized.  So I'd generally agree with the assessment above.


Not to be a killjoy but unless any of us is an attorney, I suggest we 
defer to a person with a law degree. This seems more like a question for 
the SFLC than for the general community.


JD




Best Wishes,
Chris Travers




--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] User feedback requested on temp tables usage for Hot Standby

2011-10-27 Thread Ben Chobot

On Oct 27, 2011, at 5:13 PM, Simon Riggs wrote:

 Some people have asked for the ability to create temp tables on a Hot
 Standby server.
 
 I've got a rough implementation plan but it would have some
 restrictions, so I would like to check my understanding of the use
 case for this feature so I don't waste time implementing something
 nobody actually finds useful.
 
 My understanding is that the main use cases for that would be limited
 to these two options only:
 
 1. CREATE TEMP TABLE foo AS SELECT 
 
 2. CREATE TEMP TABLE foo (..);
INSERT INTO foo ...
 
 and sometimes a TRUNCATE foo;
 
 In almost all cases people don't run multiple INSERTs, nor do they run
 UPDATEs or DELETEs, so the above actions would cover 99% of use cases.
 
 Can anyone give backup to that opinion, or alternate viewpoints?

The times that we would use a temp table on a slave are times when we would 
want to materialize a large set of intermediate results while doing ad hoc 
queries. This seems to cover that….. although, just to be sure, do I understand 
you in that UDPATEs and DELETEs would not be allowed? That would be fine, but 
having multiple INSERTs would be very handy. 

Of course, even having a one-time insert temp table is better than no temp 
table at all. :)


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] Server move using rsync

2011-10-27 Thread Stephen Denne

We're intending to move a 470GB PostgreSQL 8.3.13 database using the following 
technique from http://www.postgresql.org/docs/8.3/interactive/backup-file.html
 
Another option is to use rsync to perform a file system backup. This is done 
by first running rsync while the database server is running, then shutting down 
the database server just long enough to do a second rsync. The second rsync 
will be much quicker than the first, because it has relatively little data to 
transfer, and the end result will be consistent because the server was down. 
This method allows a file system backup to be performed with minimal downtime.

Except that we plan on an initial rsync which we think might take a couple of 
days, then subsequent daily rsyncs for up to a week to keep it up to date till 
we stop the old database, rsync again, and start the new database.

A very rough approximation of our database would be half a dozen large tables 
taking up 1/3 of the disk space, and lots of indexes on those tables taking the 
other 2/3 of the space.

If we assume usage characteristics of:
Much less than 1% of indexed data changing per day, with almost all of those 
updates being within the 1% of most recently added data.
Much less than 1% of historical indexed data being deleted per day with most of 
the deletions expected to affect sets of contiguous file pages.
About 1% of new indexed data added per day

I'm curious of the impact of vacuum (automatic and manual) during that process 
on expected amount of work rsync will have to do, and time it will take, and on 
what the update pattern is on files of Btree indexes.

Is it worth making sure vacuum is not run, in order to reduce the amount of 
files that change during that period?

Do a number of additions evenly spread through the domain of an indexed field's 
values result in localized changes to the indexes files, or changes throughout 
the files?

How about for additions to the end of the domain of an indexed field's values 
(e.g. adding current dates)?

Is there any way during that week, that we can verify whether our partially 
completed database move process is going to result in a database that starts up 
ok?

Regards, Stephen Denne.
This email with any attachments is confidential and may be subject to legal 
privilege. If it is not intended for you please advise by replying immediately, 
destroy it and do not copy, disclose or use it in any way.

Please consider the environment before printing this e-mail
__
  This email has been scanned by the DMZGlobal Business Quality
  Electronic Messaging Suite.
Please see http://www.dmzglobal.com/dmzmessaging.htm for details.
__



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[GENERAL] JDBC connections very occasionally hang

2011-10-27 Thread Karl Wright

Hi folks,

I'm seeing something that on the face of it sounds very similar to the
issue reported at
http://archives.postgresql.org/pgsql-general/2011-10/msg00570.php.  I
am using Postgresql 8.4, and the problem occurs with both the 8.4 JDBC
type-3 driver and the 9.1 JDBC type-3 driver.  The test I have that
causes the failure runs for about 3 hours and is highly multithreaded.
 By the end of that time I usually see between one and three stuck
threads, all waiting inside the JDBC driver for a response from the
postgresql server.  I can provide a stack trace if requested.  The
actual queries it locks up on differ from run to run; I've seen it
hang on longer-running queries such as a REINDEX, or very basic
queries such as an update, or even on a BEGIN TRANSACTION.  Locking is
not likely to be the problem since the issue occurs with only one
thread involved with fair frequency.  The database is also running on
the same machine as the test client, so that would appear to rule out
network glitches.  Upon failure, there are no errors or warnings
recorded in the postgresql logs either.

Because of the volume of queries it will be difficult to determine by
simply turning on logging whether all the queries are in fact making
it to the server or not.  Is there any other diagnostics you could
recommend?

FWIW, this behavior seems to be new to 8.4; the same software ran
flawlessly and reliably on Postgresql 8.2 and 8.3.

Karl

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] JDBC connections very occasionally hang

Karl Wright daddy...@gmail.com writes:
 ... By the end of that time I usually see between one and three stuck
 threads, all waiting inside the JDBC driver for a response from the
 postgresql server.  I can provide a stack trace if requested.

How about a stack trace from the connected backend?  And what is its
state as shown by the pg_stat_activity and pg_locks views?  It's hard
to tell from what you say here whether the problem is on the server or
client side, which is surely the first thing to isolate.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] pglesslog for Postgres 9.1.1

2011-10-27 Thread Satoshi Nagayasu


Hi Louis,

2011/10/27 19:49, mailtolouis2020-postg...@yahoo.com wrote:

Hi,

I'm sorry I'm not good in C, anyone can help to put a patch or release a new 
version for that?



Regards
Louis


---

---

*From:* Tom Lane t...@sss.pgh.pa.us
*To:* mailtolouis2020-postg...@yahoo.com 
mailtolouis2020-postg...@yahoo.com
*Cc:* Postgres pgsql-general@postgresql.org
*Sent:* Wednesday, October 26, 2011 3:42 PM
*Subject:* Re: [GENERAL] pglesslog for Postgres 9.1.1

mailtolouis2020-postg...@yahoo.com mailto:mailtolouis2020-postg...@yahoo.com 
mailtolouis2020-postg...@yahoo.com mailto:mailtolouis2020-postg...@yahoo.com writes:
  remove.c:182: error: ‘XLOG_GIN_INSERT’ undeclared (first use in this 
function)
  remove.c:182: error: (Each undeclared identifier is reported only once
  remove.c:182: error: for each function it appears in.)
  remove.c:184: error: ‘XLOG_GIN_VACUUM_PAGE’ undeclared (first use in 
this function)
  remove.c:186: error: ‘XLOG_GIN_DELETE_PAGE’ undeclared (first use in 
this function)

That stuff got moved to gin_private.h in 9.1 ...

regards, tom lane


I'm taking part in.

Try this patch,

https://gist.github.com/1321650

and build as following.

$ make USE_PGXS=1 top_builddir=/path/to/postgresql-9.1.0

Regards,

--
NAGAYASU Satoshi satoshi.nagay...@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Server move using rsync