Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-20 Thread Simon Riggs
On Thu, 2006-01-19 at 18:38 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  This seems to lead to a super-geometric progression in the number of
  files required,
 
 But we double the number of batches at each step, so there are going to
 be at most 20 or so levels, and that's only assuming a *horridly* wrong
 initial guess by the planner.  In practice I think it's reasonable to
 assume at most a couple rounds of doubling.  If you have more than that,
 the extra data-shuffling is going to exhaust your patience anyway.

What I'm saying is that if we start from 1 batch and move dynamically
upwards we quickly get an unmanageable number of files. However, if we
start at a particular number N, then we start with N-1 files, then move
to at most 2N(N-1) files etc..

So we can only get it wrong and double the number of batches about
twice before we get swamped with files. i.e. if we start at 1 we can
only reasonably get to 8 batches.

So we should start at a number higher than 1, attempting to make an
accurate guess about number of batches (N) required. If we have R rows
to aggregate and we get N correct, then the cost of the HashAgg is
2*R*(N-1)/N I/Os, which is cheaper than a sort, for *any* value of R for
both CPU and I/O costs. If we get it wrong, we have to read and re-write
more and more rows, which could eventually surpass the sort costs,
especially if we have growing transition state data from the aggregate.
I think the cost will be to re-write half of all rows already written
when we double N. If we fail early because we got Ndistinct wrong then
this could be cheap, though if we fail later on because of a growing
aggregate then this could easily be very expensive and quickly exceed
the cost of a sort.

My thought is to collect statistics about an aggregate at CREATE
AGGREGATE time. Simply send the aggregate 100 data values and see if the
output varies in size according to the input, if it does we take much
greater care about selecting HashAgg plans with that aggregate. ...and
that way we don't need the user to define the aggregate type directly.
This would only work with aggregates that return well known datatypes
such as int or char.

So getting the number of groups correct would be critical to making this
work, but HashAgg could be effective for even very large aggregates.

Any holes in that thinking?

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] BuildFarm: Do we need another FreeBSD/amd64 member?

2006-01-20 Thread Jim C. Nasby
On Thu, Jan 19, 2006 at 07:16:03PM -0500, Tom Lane wrote:
 Larry Rosenman [EMAIL PROTECTED] writes:
  I've got a fast FreeBSD/amd64 server available to run Buildfarm on.
 
  However, I see we already have a couple of others running it.
 
  My questions are:
  1) do we need another one?
  2) if yes, what options need coverage?
 
 Looks like we're fairly well covered on freebsd already.  Are you
 willing to consider running some less-popular OS on it?

Out of curiosity, is there any benefit to improving SMP versus
single-proc coverage?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Large Scale Aggregation (HashAgg Enhancement)

2006-01-20 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Any holes in that thinking?

Only that it's about five times more complicated than is currently known
to be necessary ;-).  How about we just implement the dynamic spill to
disk first, and not bother with the other stuff until we see problems in
the field?  Saying we have to do all this is a good recipe for not
getting any of it done.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] panic on 7.3

2006-01-20 Thread Rick Gigger

I got this message:

2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/ 
pg_clog/0292 failed: File exists


In 7.3.  It caused the server to restart.

Can anyone tell me what it means?

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-20 Thread Mike Rylander
On 1/19/06, Pollard, Mike [EMAIL PROTECTED] wrote:
 Martijn van Oosterhout wrote:

  Please provides natural keys for any of the following:
 
  - A Person
  - A phone call: (from,to,date,time,duration) is not enough
  - A physical address
  - A phone line: (phone numbers arn't unique over time)
  - An internet account: (usernames not unique over time either)

 Ahh, a challenge.  Hmm, not sure about all of them, but here goes:

 A Person - well, you could use a bit map of their fingerprints, or maybe
 their retinal scan.  Of course, that could change due to serious injury.
 Maybe some kind of representation of their DNA?

Unless the person in question happens to be a chimera (yes, they do exist).

;-)

--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] makesign() broken in tsearch2

2006-01-20 Thread Tom Lane
I noticed the following code in tsearch2:

typedef uint64 TPQTGist;

static TPQTGist
makesign(QUERYTYPE * a)
{
int i;
ITEM   *ptr = GETQUERY(a);
TPQTGistsign = 0;

for (i = 0; i  a-size; i++)
{
if (ptr-type == VAL)
sign |= 1  (ptr-val % SIGLEN);
ptr++;
}

return sign;
}

This is wrong because 1 is an int constant, not an int64, and
therefore the shift will be done in int width.  Correct would be

sign |= ((TPQTGist) 1)  (ptr-val % SIGLEN);

The effect of this is at least that the high-order half of sign
remains always zero.  Depending on what the machine does with shifts
exceeding the word width (which is undefined according to ANSI C),
the bottom half might be messed up too.  So we are failing to exploit
the full intended sign space, which presumably is costing something
in index efficiency.

It looks to me like the values calculated by this routine end up on
disk, and therefore we can't fix it without forcing an initdb, or
at least REINDEX of all affected indexes.  Is that correct?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Tom Lane
Rick Gigger [EMAIL PROTECTED] writes:
 I got this message:
 2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/ 
 pg_clog/0292 failed: File exists

 In 7.3.  It caused the server to restart.

 Can anyone tell me what it means?

7.3.what?

What file names exist in the pg_clog directory?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] TODO: GRANT/REVOKE: Allow column-level privileges

2006-01-20 Thread kevin brintnall
Fellow Hackers,

I've been working on this item for a little while, and I'm starting to see
some code come together.  I wanted to solicit some feedback before I got
too far along to make sure I'm on the right track.

Here's a rough overview of what I've done so far:

-

PARSER:

 * modified parser to accept SQL column privs syntax

 * created a PrivAttr Node which holds ( priv, attr[] ) pairs.  Currently,
   it's just a list of strings.  For example, when you call...

GRANT SELECT, UPDATE (col1, col2) ON table1, table2 to grantee;

   ... the parser creates a list of Nodes:

(select, NIL), (update, (col1, col2))

SYSTEM CATALOG:

 * add attacl aclinfo[] column to pg_attribute table and Form_pg_attribute.
 * add OID column to pg_attribute.  This permits dependencies to be
   registered correctly in pg_shdepend.
 * populated attacl column in existing pg_attribute bootstrap with NULLs
 * allocated an unused oid for each of the pg_attribute rows that are
   bootstrapped
 * created an oid index on pg_attribute

 * modified ExecuteGrantStmt to handle the PrivAttr structure instead of
   the list of strings
 * modified ExecuteGrantStmt to do a nested loop over all
   (column,relation) pairs in the GRANT and find oids for all of the
   attributes.

PSQL COMMAND LINE:

 * display column privileges with  \d+ table

STILL LEFT TO DO:

 * implement ExecGrant_Attribute() to modify pg_attribute
 * verify query against column privileges in addition to table privileges
 * register dependencies
 * pg_dump column privileges

-

I'd welcome any feedback on the design changes I've made, or any other
potential snags I should watch out for.

Thanks.

-- 
 kevin brintnall =~ [EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] TODO: GRANT/REVOKE: Allow column-level privileges

2006-01-20 Thread Tom Lane
kevin brintnall [EMAIL PROTECTED] writes:
  * add OID column to pg_attribute.  This permits dependencies to be
registered correctly in pg_shdepend.

No, no ... the precedent in pg_depend is that columns are represented as
the table's OID plus a column number.  Please don't invent some random
other notation for a column, especially not one that is so expensive to
relate to the parent table.  Add a subobject ID to pg_shdepend instead.

 STILL LEFT TO DO:

My recollection is that there's quite some deal of code that assumes
pg_attribute rows are fixed-width.  You will have some issues there.
It's possible though that none of that code needs to access privileges,
in which case you'd be OK just dropping off the ACL data from the
in-memory copies of pg_attribute rows.  Another possible solution is the
pg_attrdef model, ie, keep the ACLs somewhere else.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Rick Gigger

Postgres version 7.3.4

... a whole bunch of other files
-rw---  1 postgres postgres 262144 Jan 18 22:42 027D
-rw---  1 postgres postgres 262144 Jan 19 07:38 027E
-rw---  1 postgres postgres 262144 Jan 19 08:25 027F
-rw---  1 postgres postgres 262144 Jan 19 09:07 0280
-rw---  1 postgres postgres 262144 Jan 19 09:59 0281
-rw---  1 postgres postgres 262144 Jan 19 11:07 0282
-rw---  1 postgres postgres 262144 Jan 19 12:22 0283
-rw---  1 postgres postgres 262144 Jan 19 13:29 0284
-rw---  1 postgres postgres 262144 Jan 19 14:26 0285
-rw---  1 postgres postgres 262144 Jan 19 15:58 0286
-rw---  1 postgres postgres 262144 Jan 19 19:55 0287
-rw---  1 postgres postgres 262144 Jan 19 23:47 0288
-rw---  1 postgres postgres 262144 Jan 20 02:35 0289
-rw---  1 postgres postgres 262144 Jan 20 04:21 028A
-rw---  1 postgres postgres 262144 Jan 20 06:16 028B
-rw---  1 postgres postgres 262144 Jan 20 07:20 028C
-rw---  1 postgres postgres 262144 Jan 20 08:22 028D
-rw---  1 postgres postgres 262144 Jan 20 09:24 028E
-rw---  1 postgres postgres 262144 Jan 20 10:24 028F
-rw---  1 postgres postgres 262144 Jan 20 11:04 0290
-rw---  1 postgres postgres 262144 Jan 20 11:50 0291
-rw---  1 postgres postgres 262144 Jan 20 13:27 0292
-rw---  1 postgres postgres 262144 Jan 20 14:24 0293
-rw---  1 postgres postgres 262144 Jan 20 14:53 0294
-rw---  1 postgres postgres 262144 Jan 20 17:10 0295

That is right now.  Right after it started up it went up to 0292.  
There are a lot of files before the ones listed here right now  
though.  Do you need to see their names?



On Jan 20, 2006, at 3:58 PM, Tom Lane wrote:


Rick Gigger [EMAIL PROTECTED] writes:

I got this message:
2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/
pg_clog/0292 failed: File exists



In 7.3.  It caused the server to restart.



Can anyone tell me what it means?


7.3.what?

What file names exist in the pg_clog directory?

regards, tom lane

---(end of  
broadcast)---

TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that  
your

   message can get through to the mailing list cleanly




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Bruce Momjian
Rick Gigger wrote:
 Postgres version 7.3.4
 
 ... a whole bunch of other files
 -rw---  1 postgres postgres 262144 Jan 18 22:42 027D
 -rw---  1 postgres postgres 262144 Jan 19 07:38 027E
 -rw---  1 postgres postgres 262144 Jan 19 08:25 027F
 -rw---  1 postgres postgres 262144 Jan 19 09:07 0280
 -rw---  1 postgres postgres 262144 Jan 19 09:59 0281
 -rw---  1 postgres postgres 262144 Jan 19 11:07 0282
 -rw---  1 postgres postgres 262144 Jan 19 12:22 0283
 -rw---  1 postgres postgres 262144 Jan 19 13:29 0284
 -rw---  1 postgres postgres 262144 Jan 19 14:26 0285
 -rw---  1 postgres postgres 262144 Jan 19 15:58 0286
 -rw---  1 postgres postgres 262144 Jan 19 19:55 0287
 -rw---  1 postgres postgres 262144 Jan 19 23:47 0288
 -rw---  1 postgres postgres 262144 Jan 20 02:35 0289
 -rw---  1 postgres postgres 262144 Jan 20 04:21 028A
 -rw---  1 postgres postgres 262144 Jan 20 06:16 028B
 -rw---  1 postgres postgres 262144 Jan 20 07:20 028C
 -rw---  1 postgres postgres 262144 Jan 20 08:22 028D
 -rw---  1 postgres postgres 262144 Jan 20 09:24 028E
 -rw---  1 postgres postgres 262144 Jan 20 10:24 028F
 -rw---  1 postgres postgres 262144 Jan 20 11:04 0290
 -rw---  1 postgres postgres 262144 Jan 20 11:50 0291
 -rw---  1 postgres postgres 262144 Jan 20 13:27 0292
 -rw---  1 postgres postgres 262144 Jan 20 14:24 0293
 -rw---  1 postgres postgres 262144 Jan 20 14:53 0294
 -rw---  1 postgres postgres 262144 Jan 20 17:10 0295
 
 That is right now.  Right after it started up it went up to 0292.  
 There are a lot of files before the ones listed here right now  
 though.  Do you need to see their names?

I assume you are missing one of these fixes in 7.3.X current which were
done _after_ 7.3.4 was released:

 * Fix race condition in transaction log management
   There was a narrow window in which an I/O operation could be
   initiated for the wrong page, leading to an Assert
   failure or data corruption.

 * Repair incorrect order of operations in GetNewTransactionId()
   This bug could result in failure under out-of-disk-space
   conditions, including inability to restart even after
   disk space is freed.

---


 
 
 On Jan 20, 2006, at 3:58 PM, Tom Lane wrote:
 
  Rick Gigger [EMAIL PROTECTED] writes:
  I got this message:
  2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/
  pg_clog/0292 failed: File exists
 
  In 7.3.  It caused the server to restart.
 
  Can anyone tell me what it means?
 
  7.3.what?
 
  What file names exist in the pg_clog directory?
 
  regards, tom lane
 
  ---(end of  
  broadcast)---
  TIP 1: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that  
  your
 message can get through to the mailing list cleanly
 
 
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Rick Gigger
It is the version that shipped with fedora core 1.  The version  
string from psql is (PostgreSQL) 7.3.4-RH.  I assume that it must  
have been the first bug since I had plenty of disk space.



On Jan 20, 2006, at 5:35 PM, Bruce Momjian wrote:


Rick Gigger wrote:

Postgres version 7.3.4

... a whole bunch of other files
-rw---  1 postgres postgres 262144 Jan 18 22:42 027D
-rw---  1 postgres postgres 262144 Jan 19 07:38 027E
-rw---  1 postgres postgres 262144 Jan 19 08:25 027F
-rw---  1 postgres postgres 262144 Jan 19 09:07 0280
-rw---  1 postgres postgres 262144 Jan 19 09:59 0281
-rw---  1 postgres postgres 262144 Jan 19 11:07 0282
-rw---  1 postgres postgres 262144 Jan 19 12:22 0283
-rw---  1 postgres postgres 262144 Jan 19 13:29 0284
-rw---  1 postgres postgres 262144 Jan 19 14:26 0285
-rw---  1 postgres postgres 262144 Jan 19 15:58 0286
-rw---  1 postgres postgres 262144 Jan 19 19:55 0287
-rw---  1 postgres postgres 262144 Jan 19 23:47 0288
-rw---  1 postgres postgres 262144 Jan 20 02:35 0289
-rw---  1 postgres postgres 262144 Jan 20 04:21 028A
-rw---  1 postgres postgres 262144 Jan 20 06:16 028B
-rw---  1 postgres postgres 262144 Jan 20 07:20 028C
-rw---  1 postgres postgres 262144 Jan 20 08:22 028D
-rw---  1 postgres postgres 262144 Jan 20 09:24 028E
-rw---  1 postgres postgres 262144 Jan 20 10:24 028F
-rw---  1 postgres postgres 262144 Jan 20 11:04 0290
-rw---  1 postgres postgres 262144 Jan 20 11:50 0291
-rw---  1 postgres postgres 262144 Jan 20 13:27 0292
-rw---  1 postgres postgres 262144 Jan 20 14:24 0293
-rw---  1 postgres postgres 262144 Jan 20 14:53 0294
-rw---  1 postgres postgres 262144 Jan 20 17:10 0295

That is right now.  Right after it started up it went up to 0292.
There are a lot of files before the ones listed here right now
though.  Do you need to see their names?


I assume you are missing one of these fixes in 7.3.X current which  
were

done _after_ 7.3.4 was released:

 * Fix race condition in transaction log management
   There was a narrow window in which an I/O operation could be
   initiated for the wrong page, leading to an Assert
   failure or data corruption.

 * Repair incorrect order of operations in GetNewTransactionId()
   This bug could result in failure under out-of-disk-space
   conditions, including inability to restart even after
   disk space is freed.

-- 
-






On Jan 20, 2006, at 3:58 PM, Tom Lane wrote:


Rick Gigger [EMAIL PROTECTED] writes:

I got this message:
2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/
pg_clog/0292 failed: File exists



In 7.3.  It caused the server to restart.



Can anyone tell me what it means?


7.3.what?

What file names exist in the pg_clog directory?

regards, tom lane

---(end of
broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that
your
   message can get through to the mailing list cleanly




---(end of  
broadcast)---

TIP 4: Have you searched our list archives?

   http://archives.postgresql.org



--
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square,  
Pennsylvania 19073


---(end of  
broadcast)---

TIP 6: explain analyze is your friend




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Andrew Dunstan


Updates for FC1 are available here:

http://download.fedoralegacy.org/fedora/1/updates/i386/

they have 7.3.9 dated in March last year.

Or grab the source for 7.3.13 and build it yourself.

cheers

andrew

Rick Gigger wrote:

It is the version that shipped with fedora core 1.  The version  
string from psql is (PostgreSQL) 7.3.4-RH.  I assume that it must  
have been the first bug since I had plenty of disk space.



On Jan 20, 2006, at 5:35 PM, Bruce Momjian wrote:


Rick Gigger wrote:


Postgres version 7.3.4

... a whole bunch of other files
-rw---  1 postgres postgres 262144 Jan 18 22:42 027D
-rw---  1 postgres postgres 262144 Jan 19 07:38 027E
-rw---  1 postgres postgres 262144 Jan 19 08:25 027F
-rw---  1 postgres postgres 262144 Jan 19 09:07 0280
-rw---  1 postgres postgres 262144 Jan 19 09:59 0281
-rw---  1 postgres postgres 262144 Jan 19 11:07 0282
-rw---  1 postgres postgres 262144 Jan 19 12:22 0283
-rw---  1 postgres postgres 262144 Jan 19 13:29 0284
-rw---  1 postgres postgres 262144 Jan 19 14:26 0285
-rw---  1 postgres postgres 262144 Jan 19 15:58 0286
-rw---  1 postgres postgres 262144 Jan 19 19:55 0287
-rw---  1 postgres postgres 262144 Jan 19 23:47 0288
-rw---  1 postgres postgres 262144 Jan 20 02:35 0289
-rw---  1 postgres postgres 262144 Jan 20 04:21 028A
-rw---  1 postgres postgres 262144 Jan 20 06:16 028B
-rw---  1 postgres postgres 262144 Jan 20 07:20 028C
-rw---  1 postgres postgres 262144 Jan 20 08:22 028D
-rw---  1 postgres postgres 262144 Jan 20 09:24 028E
-rw---  1 postgres postgres 262144 Jan 20 10:24 028F
-rw---  1 postgres postgres 262144 Jan 20 11:04 0290
-rw---  1 postgres postgres 262144 Jan 20 11:50 0291
-rw---  1 postgres postgres 262144 Jan 20 13:27 0292
-rw---  1 postgres postgres 262144 Jan 20 14:24 0293
-rw---  1 postgres postgres 262144 Jan 20 14:53 0294
-rw---  1 postgres postgres 262144 Jan 20 17:10 0295

That is right now.  Right after it started up it went up to 0292.
There are a lot of files before the ones listed here right now
though.  Do you need to see their names?



I assume you are missing one of these fixes in 7.3.X current which  were
done _after_ 7.3.4 was released:

 * Fix race condition in transaction log management
   There was a narrow window in which an I/O operation could be
   initiated for the wrong page, leading to an Assert
   failure or data corruption.

 * Repair incorrect order of operations in GetNewTransactionId()
   This bug could result in failure under out-of-disk-space
   conditions, including inability to restart even after
   disk space is freed.

-- 
-






On Jan 20, 2006, at 3:58 PM, Tom Lane wrote:


Rick Gigger [EMAIL PROTECTED] writes:


I got this message:
2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/
pg_clog/0292 failed: File exists




In 7.3.  It caused the server to restart.




Can anyone tell me what it means?



7.3.what?

What file names exist in the pg_clog directory?

regards, tom lane

---(end of
broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that
your
   message can get through to the mailing list cleanly




---(end of  
broadcast)---

TIP 4: Have you searched our list archives?

   http://archives.postgresql.org



--
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square,  
Pennsylvania 19073


---(end of  
broadcast)---

TIP 6: explain analyze is your friend




---(end of broadcast)---
TIP 6: explain analyze is your friend



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Tom Lane
Rick Gigger [EMAIL PROTECTED] writes:
 Postgres version 7.3.4

You realize of course that that's pretty old ...

 That is right now.  Right after it started up it went up to 0292.  

So it was the latest file eh?  I thought maybe you had some problem
with a corrupted XID leading to trying to touch a clog file
out-of-order, but that seems ruled out.

 2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/
 pg_clog/0292 failed: File exists

Digging in the 7.3 sources, it seems that error message could only have
come from here:

fd = BasicOpenFile(path, O_RDWR | PG_BINARY, S_IRUSR | S_IWUSR);
if (fd  0)
{
if (errno != ENOENT)
elog(PANIC, open of %s failed: %m, path);
fd = BasicOpenFile(path, O_RDWR | O_CREAT | O_EXCL | PG_BINARY,
   S_IRUSR | S_IWUSR);
if (fd  0)
elog(PANIC, creation of file %s failed: %m, path);
}

AFAICS, it is simply not possible for the second open() to fail with
that errno, *unless* someone else created the same file in the
microseconds between the two open calls.

The code doing this has a lock on the particular clog buffer it's trying
to write out, so no-one else could be trying to write the same buffer;
but now that I look at it, it's entirely legal for someone else to be
trying to write a different clog buffer.  This leads to the following
theory:

1. The clog page that would be first in the 0292 segment got created in
clog buffers, but there was no reason to write it out for awhile.  (In
normal operation, only a checkpoint would be cause to write out the
frontmost page of clog.)

2. More than 2K transactions elapsed, so the page that would be second
in the 0292 segment also got set up in clog buffers.  (Rick, is the load
on your machine such that several thousand transactions might have
elapsed between checkpoints?)  Perhaps there were even enough
transactions so that more than two pages were dirty and pending write
in the clog buffers, yet the file hadn't been created yet.

3. Two different backends decided to try to write different clog pages
concurrently.  Probably one was writing the frontmost page because it
was doing a checkpoint, and another needed to read in an older clog page
so it had to swap out one of the other dirty buffers.

4. Kaboom.

If this theory is correct, the bug has been there since the clog code
was first written.  But the conditions for having it happen are narrow
enough that it's not too surprising we haven't seen it before.

I think that a sufficient fix might just be to remove the O_EXCL flag
from the second open() call --- ie, if someone else creates the file
in this narrow window, it should be considered OK.  Comments?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Tom Lane
I wrote:
 If this theory is correct, the bug has been there since the clog code
 was first written.  But the conditions for having it happen are narrow
 enough that it's not too surprising we haven't seen it before.

Actually, there seem to be a couple of unresolved bug reports that look
like the same thing ...

http://archives.postgresql.org/pgsql-hackers/2006-01/msg00216.php
http://archives.postgresql.org/pgsql-bugs/2005-12/msg00130.php
http://archives.postgresql.org/pgsql-general/2005-10/msg00975.php

The spelling of the message has changed since 7.3, but the code behavior
is still fundamentally the same.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] strange behavior on locks

2006-01-20 Thread Alfranio Correia Junior
Hi, 

I running PostgreSQL 8.0.x with database that does not have foreign keys
and sometimes the following messages appears:

ERROR:  deadlock detected
DETAIL:  Process 10029 waits for ShareLock on transaction 65272; blocked
by process 32436.
Process 32436 waits for ShareLock on transaction 65117; blocked
by process 10029.

I also see (right after executing a ps) that there are lots of messages
select waiting.

1 - Does it mean that process 10029 owns transaction  65117 and 
process 32436 owns transaction 65272 as a transaction acquires an
exclusive lock in its transaction id ?
2 - How is this situation possible ?
What could cause the status (select waiting) ?

Best regards, 

Alfranio

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] strange locks

2006-01-20 Thread Alfranio Correia Junior
HI,

I am running PostgreSQL 8.0.x with a database that does not have foreign
keys. However, sometimes the following messages appear: 

deadlock detected: Process 10029 waits for ShareLock on transaction
65272; blocked by process 32436.Process 32436 waits for ShareLock on
transaction 65117; blocked by process 10029.

I also see (right after executing a ps) that there are lots of messages
select waiting.

1 - Does it mean that process 10029 owns transaction  65117 and process
32436 owns transaction 65272 as a transaction acquires an exclusive lock
in its transaction id ?

2 - How is this situation possible ?


3 - What could cause the status (select waiting) ?

Best regards, Alfranio

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Rick Gigger

Thanks.  I'm not quite sure what version I am going to upgrade to yet.

Rick

On Jan 20, 2006, at 5:59 PM, Andrew Dunstan wrote:



Updates for FC1 are available here:

http://download.fedoralegacy.org/fedora/1/updates/i386/

they have 7.3.9 dated in March last year.

Or grab the source for 7.3.13 and build it yourself.

cheers

andrew

Rick Gigger wrote:

It is the version that shipped with fedora core 1.  The version   
string from psql is (PostgreSQL) 7.3.4-RH.  I assume that it must   
have been the first bug since I had plenty of disk space.



On Jan 20, 2006, at 5:35 PM, Bruce Momjian wrote:


Rick Gigger wrote:


Postgres version 7.3.4

... a whole bunch of other files
-rw---  1 postgres postgres 262144 Jan 18 22:42 027D
-rw---  1 postgres postgres 262144 Jan 19 07:38 027E
-rw---  1 postgres postgres 262144 Jan 19 08:25 027F
-rw---  1 postgres postgres 262144 Jan 19 09:07 0280
-rw---  1 postgres postgres 262144 Jan 19 09:59 0281
-rw---  1 postgres postgres 262144 Jan 19 11:07 0282
-rw---  1 postgres postgres 262144 Jan 19 12:22 0283
-rw---  1 postgres postgres 262144 Jan 19 13:29 0284
-rw---  1 postgres postgres 262144 Jan 19 14:26 0285
-rw---  1 postgres postgres 262144 Jan 19 15:58 0286
-rw---  1 postgres postgres 262144 Jan 19 19:55 0287
-rw---  1 postgres postgres 262144 Jan 19 23:47 0288
-rw---  1 postgres postgres 262144 Jan 20 02:35 0289
-rw---  1 postgres postgres 262144 Jan 20 04:21 028A
-rw---  1 postgres postgres 262144 Jan 20 06:16 028B
-rw---  1 postgres postgres 262144 Jan 20 07:20 028C
-rw---  1 postgres postgres 262144 Jan 20 08:22 028D
-rw---  1 postgres postgres 262144 Jan 20 09:24 028E
-rw---  1 postgres postgres 262144 Jan 20 10:24 028F
-rw---  1 postgres postgres 262144 Jan 20 11:04 0290
-rw---  1 postgres postgres 262144 Jan 20 11:50 0291
-rw---  1 postgres postgres 262144 Jan 20 13:27 0292
-rw---  1 postgres postgres 262144 Jan 20 14:24 0293
-rw---  1 postgres postgres 262144 Jan 20 14:53 0294
-rw---  1 postgres postgres 262144 Jan 20 17:10 0295

That is right now.  Right after it started up it went up to 0292.
There are a lot of files before the ones listed here right now
though.  Do you need to see their names?



I assume you are missing one of these fixes in 7.3.X current  
which  were

done _after_ 7.3.4 was released:

 * Fix race condition in transaction log management
   There was a narrow window in which an I/O operation could be
   initiated for the wrong page, leading to an Assert
   failure or data corruption.

 * Repair incorrect order of operations in GetNewTransactionId()
   This bug could result in failure under out-of-disk-space
   conditions, including inability to restart even after
   disk space is freed.

 
-- -






On Jan 20, 2006, at 3:58 PM, Tom Lane wrote:


Rick Gigger [EMAIL PROTECTED] writes:


I got this message:
2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/
pg_clog/0292 failed: File exists




In 7.3.  It caused the server to restart.




Can anyone tell me what it means?



7.3.what?

What file names exist in the pg_clog directory?

regards, tom lane

---(end of
broadcast)---
TIP 1: if posting/reading through Usenet, please send an  
appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so  
that

your
   message can get through to the mailing list cleanly




---(end of   
broadcast)---

TIP 4: Have you searched our list archives?

   http://archives.postgresql.org



--
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square,   
Pennsylvania 19073


---(end of   
broadcast)---

TIP 6: explain analyze is your friend




---(end of  
broadcast)---

TIP 6: explain analyze is your friend






---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Rick Gigger

=Rick Gigger [EMAIL PROTECTED] writes:

Postgres version 7.3.4


You realize of course that that's pretty old ...



Yes.  I will be upgrading immediately.


That is right now.  Right after it started up it went up to 0292.


So it was the latest file eh?  I thought maybe you had some problem
with a corrupted XID leading to trying to touch a clog file
out-of-order, but that seems ruled out.


Well that sounds like a good thing.




2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/
pg_clog/0292 failed: File exists


Digging in the 7.3 sources, it seems that error message could only  
have

come from here:

fd = BasicOpenFile(path, O_RDWR | PG_BINARY, S_IRUSR | S_IWUSR);
if (fd  0)
{
if (errno != ENOENT)
elog(PANIC, open of %s failed: %m, path);
fd = BasicOpenFile(path, O_RDWR | O_CREAT | O_EXCL |  
PG_BINARY,

   S_IRUSR | S_IWUSR);
if (fd  0)
elog(PANIC, creation of file %s failed: %m, path);
}


Yes I found that too (on accident with google) but didn't really have  
the slightest clue what exactly would have caused it.




AFAICS, it is simply not possible for the second open() to fail with
that errno, *unless* someone else created the same file in the
microseconds between the two open calls.

The code doing this has a lock on the particular clog buffer it's  
trying

to write out, so no-one else could be trying to write the same buffer;
but now that I look at it, it's entirely legal for someone else to be
trying to write a different clog buffer.  This leads to the following
theory:

1. The clog page that would be first in the 0292 segment got  
created in

clog buffers, but there was no reason to write it out for awhile.  (In
normal operation, only a checkpoint would be cause to write out the
frontmost page of clog.)

2. More than 2K transactions elapsed, so the page that would be second
in the 0292 segment also got set up in clog buffers.  (Rick, is the  
load

on your machine such that several thousand transactions might have
elapsed between checkpoints?)  Perhaps there were even enough
transactions so that more than two pages were dirty and pending write
in the clog buffers, yet the file hadn't been created yet.


I don't know if 2K could have passed since the last checkpoint.  Part  
of the reason I haven't upgraded in so long is that it has been  
running like champ for about 3 years.   Recently though the load on  
the site just shot through the roof.  Not only am I going to upgrade  
the version of postgres but I need to do some major tuning.  I am  
still using a lot of defaults.  I am using the default checkpoint  
settings but I am not sure how often they are happening.  Actually  
now that I think about it I was getting about 400 pages requests /  
minute and each of those would have have been doing at least 2  
transactions so yes, I guess that is very likely.



3. Two different backends decided to try to write different clog pages
concurrently.  Probably one was writing the frontmost page because it
was doing a checkpoint, and another needed to read in an older clog  
page

so it had to swap out one of the other dirty buffers.

4. Kaboom.


Yeah Kaboom.  It was really bad timing too.  :)


If this theory is correct, the bug has been there since the clog code
was first written.  But the conditions for having it happen are narrow
enough that it's not too surprising we haven't seen it before.


Wow it's great to be the first.


I think that a sufficient fix might just be to remove the O_EXCL flag
from the second open() call --- ie, if someone else creates the file
in this narrow window, it should be considered OK.  Comments?


Well just a little fyi, I don't know if any of this will help but I  
was suffering from severe table bloat.  The data in my session table  
is unfortunately quite large and being updated constantly so the  
session table and it's two indexes and especially it's toast table we  
impossible to vacuum.  Also the vacuum and fsm settings were the  
defaults making the problem worse.




regards, tom lane



Thanks so much for the help.

Rick

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Surrogate keys

2006-01-20 Thread Christopher Browne
 On 1/19/06, Pollard, Mike [EMAIL PROTECTED] wrote:
 Martijn van Oosterhout wrote:

  Please provides natural keys for any of the following:
 
  - A Person
  - A phone call: (from,to,date,time,duration) is not enough
  - A physical address
  - A phone line: (phone numbers arn't unique over time)
  - An internet account: (usernames not unique over time either)

 Ahh, a challenge.  Hmm, not sure about all of them, but here goes:

 A Person - well, you could use a bit map of their fingerprints, or maybe
 their retinal scan.  Of course, that could change due to serious injury.
 Maybe some kind of representation of their DNA?

 Unless the person in question happens to be a chimera (yes, they do exist).

 ;-)

Yeah, I saw that episode of CSI!  :-)
-- 
cbbrowne,@,gmail.com
http://linuxdatabases.info/info/lsf.html
Cat's motto: No matter what you've done wrong, always try to make it
look like the dog did it. -- Unknown

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] TODO: GRANT/REVOKE: Allow column-level privileges

2006-01-20 Thread kevin brintnall
On Fri, Jan 20, 2006 at 07:09:46PM -0500, Tom Lane wrote:
 kevin brintnall [EMAIL PROTECTED] writes:
   * add OID column to pg_attribute.  This permits dependencies to be
 registered correctly in pg_shdepend.
 
 No, no ... the precedent in pg_depend is that columns are represented as
 the table's OID plus a column number.  Please don't invent some random
 other notation for a column, especially not one that is so expensive to
 relate to the parent table.  Add a subobject ID to pg_shdepend instead.

I was referring to the dependency that exists between a grantee and any
pg_attribute ACL entries that mention the grantee.  When the role is
dropped, the ACL entries that mention that role have to be removed.

Specifically, I propose creating an entry such as the following in
pg_shdepend for every grantee G, for every column C in which G is
mentioned:

classid= AttributeRelationId /* 1249 */
objid  = C.oid
refclassid = AuthIdRelationId /* 1260 */
refobjid   = G.oid
deptype= 'a'/* SHARED_DEPENDENCY_ACL */

Are you suggesting that the pair (reloid,attnum) is superior for
identifying a pg_attribute entry?  Are there any other possible uses for
pg_attribute.oid?

  STILL LEFT TO DO:
 
 My recollection is that there's quite some deal of code that assumes
 pg_attribute rows are fixed-width.  You will have some issues there.
 It's possible though that none of that code needs to access privileges,
 in which case you'd be OK just dropping off the ACL data from the
 in-memory copies of pg_attribute rows.  Another possible solution is the
 pg_attrdef model, ie, keep the ACLs somewhere else.

I'm employing the same hack^H^H^H^Hmethod that is currently used in
pg_class.

-- 
 kevin brintnall =~ [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] TODO: GRANT/REVOKE: Allow column-level privileges

2006-01-20 Thread Tom Lane
kevin brintnall [EMAIL PROTECTED] writes:
 Are you suggesting that the pair (reloid,attnum) is superior for
 identifying a pg_attribute entry?

Yes.  It's just as unique, and it makes it easy to see the relationship
between the table and its columns.  Moreover, it's what we're already
using in pg_depend.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] strange behavior on locks

2006-01-20 Thread Tom Lane
Alfranio Correia Junior [EMAIL PROTECTED] writes:
 What could cause the status (select waiting) ?

Perhaps you are using SELECT FOR UPDATE?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Surrogate keys

2006-01-20 Thread Mike Rylander
On 1/21/06, Christopher Browne [EMAIL PROTECTED] wrote:
  On 1/19/06, Pollard, Mike [EMAIL PROTECTED] wrote:
  Martijn van Oosterhout wrote:
 
   Please provides natural keys for any of the following:
  
   - A Person
   - A phone call: (from,to,date,time,duration) is not enough
   - A physical address
   - A phone line: (phone numbers arn't unique over time)
   - An internet account: (usernames not unique over time either)
 
  Ahh, a challenge.  Hmm, not sure about all of them, but here goes:
 
  A Person - well, you could use a bit map of their fingerprints, or maybe
  their retinal scan.  Of course, that could change due to serious injury.
  Maybe some kind of representation of their DNA?
 
  Unless the person in question happens to be a chimera (yes, they do exist).
 
  ;-)

 Yeah, I saw that episode of CSI!  :-)

Heh.  I didn't realize they did that already.  I was thinking of the
show I Am My Own Twin from the Discovery Health Channel.

Couldn't find a link from the official page, but:
http://www.globalspin.com/mt/archives/000547.html .

 --
 cbbrowne,@,gmail.com
 http://linuxdatabases.info/info/lsf.html
 Cat's motto: No matter what you've done wrong, always try to make it
 look like the dog did it. -- Unknown

 ---(end of broadcast)---
 TIP 6: explain analyze is your friend



--
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] strange behavior on locks

2006-01-20 Thread Michael Adler
On Fri, Jan 20, 2006 at 11:47:55PM -0500, Tom Lane wrote:
 Alfranio Correia Junior [EMAIL PROTECTED] writes:
  What could cause the status (select waiting) ?
 
 Perhaps you are using SELECT FOR UPDATE?

or SELECT func_with_side_effects() ? 

 -Mike

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Rick Gigger


On Jan 20, 2006, at 6:02 PM, Tom Lane wrote:


Rick Gigger [EMAIL PROTECTED] writes:

Postgres version 7.3.4


You realize of course that that's pretty old ...


That is right now.  Right after it started up it went up to 0292.


So it was the latest file eh?  I thought maybe you had some problem
with a corrupted XID leading to trying to touch a clog file
out-of-order, but that seems ruled out.


2006-01-20 11:50:51 PANIC:  creation of file /var/lib/pgsql/data/
pg_clog/0292 failed: File exists


Digging in the 7.3 sources, it seems that error message could only  
have

come from here:

fd = BasicOpenFile(path, O_RDWR | PG_BINARY, S_IRUSR | S_IWUSR);
if (fd  0)
{
if (errno != ENOENT)
elog(PANIC, open of %s failed: %m, path);
fd = BasicOpenFile(path, O_RDWR | O_CREAT | O_EXCL |  
PG_BINARY,

   S_IRUSR | S_IWUSR);
if (fd  0)
elog(PANIC, creation of file %s failed: %m, path);
}

AFAICS, it is simply not possible for the second open() to fail with
that errno, *unless* someone else created the same file in the
microseconds between the two open calls.

The code doing this has a lock on the particular clog buffer it's  
trying

to write out, so no-one else could be trying to write the same buffer;
but now that I look at it, it's entirely legal for someone else to be
trying to write a different clog buffer.  This leads to the following
theory:

1. The clog page that would be first in the 0292 segment got  
created in

clog buffers, but there was no reason to write it out for awhile.  (In
normal operation, only a checkpoint would be cause to write out the
frontmost page of clog.)

2. More than 2K transactions elapsed, so the page that would be second
in the 0292 segment also got set up in clog buffers.  (Rick, is the  
load

on your machine such that several thousand transactions might have
elapsed between checkpoints?)  Perhaps there were even enough
transactions so that more than two pages were dirty and pending write
in the clog buffers, yet the file hadn't been created yet.


So what I think I'm getting killed on right now are the disk writes.   
So I was thinking of changing away from the default checkpoint  
settings.  My load is going to continue to go up so 2000+  
transactions are going to start happening pretty fast.  I have lots  
of disk space so I was going to increase the time between  
checkpoints.  Will that increase the chances of this happening again  
or was this such a strange freak of nature coincidence that it can't  
really even happen again.  Also I've decided to upgrade all the way  
to the latest 8.1 code.



3. Two different backends decided to try to write different clog pages
concurrently.  Probably one was writing the frontmost page because it
was doing a checkpoint, and another needed to read in an older clog  
page

so it had to swap out one of the other dirty buffers.

4. Kaboom.

If this theory is correct, the bug has been there since the clog code
was first written.  But the conditions for having it happen are narrow
enough that it's not too surprising we haven't seen it before.

I think that a sufficient fix might just be to remove the O_EXCL flag
from the second open() call --- ie, if someone else creates the file
in this narrow window, it should be considered OK.  Comments?

regards, tom lane

---(end of  
broadcast)---

TIP 6: explain analyze is your friend




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Tom Lane
Rick Gigger [EMAIL PROTECTED] writes:
 I don't know if 2K could have passed since the last checkpoint.
 ...
 now that I think about it I was getting about 400 pages requests /  
 minute and each of those would have have been doing at least 2  
 transactions so yes, I guess that is very likely.

Good, 'cause if you didn't have a couple thousand transactions between
checkpoints then we need another theory ;-)

 You realize of course that that's pretty old ...

 Yes.  I will be upgrading immediately.

You'll want to include this patch:
http://archives.postgresql.org/pgsql-committers/2006-01/msg00289.php
(or see adjacent messages if you plan to move to something newer than
7.3.*).  We probably will not put out another set of releases until
next month, unless something really big comes along.  This one doesn't
qualify as really big IMHO, because it's not a PANIC-grade failure in
the later branches.  But having been burnt once, I'm sure you'll want
a patched copy ...

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] could not access status of transaction 0

2006-01-20 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 We seem to be getting this error (in german) once in a while on a rather
 complex database:

 FEHLER:  konnte auf den Status von Transaktion 0 nicht zugreifen
 DETAIL:  kann Datei /var/databases/postgres/data/pg_subtrans/57DA
 nicht erstellen: Die Datei existiert bereits

 which roughly translates to

 ERROR: could not access status of transaction 0
 DETAIL: could not create file
 /var/databases/postgres/data/pg_subtrans/57DA: File exists

I think we've finally identified the reason for this:
http://archives.postgresql.org/pgsql-committers/2006-01/msg00287.php

 It might be interesting to note that we use slony to replicate a few
 tables of this database to multiple slaves and according to our logs it
 always was the slony-user connected to the database that triggered this
 error.

AFAICT Slony is blameless; the triggering condition is a very high
transaction rate (2000 or more transactions between checkpoints)
plus the bad luck to hit a fairly narrow race-condition window.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] panic on 7.3

2006-01-20 Thread Rick Gigger

Thanks very much!

I've decided to go straight to 8.1 though.  There are just too many  
performance improvements at this point that I might regret not having  
and I don't want to do a dump reload again.  I am about to compile it  
now.  If it isn't a panic grade failure in the latest 8.1 code then  
I'd just assume take the stock release source code.  I don't care at  
all if this kills one connection at the ultra-low frequency with  
which it occurs but what I can't have is the whole server rebooting  
itself in the middle of processing hundreds of transactions.  Once  
that happens all of the web clients hang onto their bad connections  
and then eventually die.  Considering that I'm moving to 8.1 and am  
not too familiar with applying patches am I crazy for just going with  
the stock 8.1 code?


On Jan 20, 2006, at 10:36 PM, Tom Lane wrote:


Rick Gigger [EMAIL PROTECTED] writes:

I don't know if 2K could have passed since the last checkpoint.
...
now that I think about it I was getting about 400 pages requests /
minute and each of those would have have been doing at least 2
transactions so yes, I guess that is very likely.


Good, 'cause if you didn't have a couple thousand transactions between
checkpoints then we need another theory ;-)


You realize of course that that's pretty old ...



Yes.  I will be upgrading immediately.


You'll want to include this patch:
http://archives.postgresql.org/pgsql-committers/2006-01/msg00289.php
(or see adjacent messages if you plan to move to something newer than
7.3.*).  We probably will not put out another set of releases until
next month, unless something really big comes along.  This one doesn't
qualify as really big IMHO, because it's not a PANIC-grade failure in
the later branches.  But having been burnt once, I'm sure you'll want
a patched copy ...

regards, tom lane




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org