Re: [HACKERS] Phantom Command ID

2006-09-26 Thread Heikki Linnakangas

Tom Lane wrote:
We could rename pg_attribute as pg_userattribute, and remove all the 
system attributes from that. To stay backwards-compatible, we could have 
a pg_attribute view on top of that contained the system attributes as well.



I don't really think this is necessary.  How many client programs have
you seen that don't explicitly exclude attnum0 anyway?  The places that
will need work are inside the backend, and a view won't help them.
  


None, there probably isn't any client programs like that. It would be 
nice for programs to be able to discover what system attributes there 
is, though.


--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Phantom Command ID

2006-09-26 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 If we're going to fool with these, I'd like to renew the suggestion I
 made awhile back that none of the system columns should have explicit
 entries in pg_attribute, but rather their lookup should be special-cased
 in the parser.

 What was the original reason for the proposal? Space savings?

Partly that, and partly that it'd make it much easier to alter the set
of system attributes.

 We could rename pg_attribute as pg_userattribute, and remove all the 
 system attributes from that. To stay backwards-compatible, we could have 
 a pg_attribute view on top of that contained the system attributes as well.

I don't really think this is necessary.  How many client programs have
you seen that don't explicitly exclude attnum0 anyway?  The places that
will need work are inside the backend, and a view won't help them.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Phantom Command ID

2006-09-26 Thread Jim C. Nasby
On Tue, Sep 26, 2006 at 12:35:54PM +0100, Heikki Linnakangas wrote:
 Tom Lane wrote:
 We could rename pg_attribute as pg_userattribute, and remove all the 
 system attributes from that. To stay backwards-compatible, we could have 
 a pg_attribute view on top of that contained the system attributes as 
 well.
 
 
 I don't really think this is necessary.  How many client programs have
 you seen that don't explicitly exclude attnum0 anyway?  The places that
 will need work are inside the backend, and a view won't help them.
   
 
 None, there probably isn't any client programs like that. It would be 
 nice for programs to be able to discover what system attributes there 
 is, though.

+1; we need to have some way for users to find that info out, and I
can't think of a better way than pg_attribute.

If we want to create a set of views that are more human friendly I'm all
for it (it's why we started the newsysviews project afterall), but I
don't know if y'all want to open that can of worms back up.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Phantom Command ID

2006-09-26 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:

Another question is, what should cmin and cmax system columns return?


If we're going to fool with these, I'd like to renew the suggestion I
made awhile back that none of the system columns should have explicit
entries in pg_attribute, but rather their lookup should be special-cased
in the parser. And whatever we do with cmin/cmax, the infomask should
become exposed as well.


I just looked back at that discussion in the archives 
(http://archives.postgresql.org/pgsql-hackers/2005-02/msg00615.php). 
What was the original reason for the proposal? Space savings?


We could rename pg_attribute as pg_userattribute, and remove all the 
system attributes from that. To stay backwards-compatible, we could have 
a pg_attribute view on top of that contained the system attributes as well.


--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Phantom Command ID

2006-09-21 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:
  

A big question is, do we need to implement spilling to disk?



My thought is no, at least not in the first cut ... this is something
that can be added later if it proves critical, and right at the moment
my guess is that it never will.  The data structure design sounds fine.
  


I thought so too.

We could also limit the size of the hash table, which takes up most of 
the memory, and only keep the latest phantom cids there. Presumably, if 
current command id is 1000, you're not likely to set cmax to 500 on any 
tuple in that transaction anymore.


--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Phantom Command ID

2006-09-21 Thread Heikki Linnakangas
Another question is, what should cmin and cmax system columns return? If 
we overlay cmin and cmax, cmin and cmax on-disk will always be the same 
value. And with phantom cids, it wouldn't be meaningful outside the 
inserting/deleting transaction.


The options that I can think of are:

1. Only return cmin and cmax when they mean something, that is within 
the inserting / deleting transaction. This is not good if you want to 
use them for debugging (and what other use do they have?)


2. Cmin and cmax return the value that's stored on disk, whether or not 
they make sense.


3. Remove cmin and cmax system columns to avoid confusion, and replace 
them with cminmax, that returns what's on disk.


--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Phantom Command ID

2006-09-21 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 We could also limit the size of the hash table, which takes up most of 
 the memory, and only keep the latest phantom cids there. Presumably, if 
 current command id is 1000, you're not likely to set cmax to 500 on any 
 tuple in that transaction anymore.

The downside of that though is that if you did generate any such, you'd
assign a fresh (duplicate) phantom cid --- so you're bloating the array
in exchange for reducing the hash size.

It is quite easy to have current command counter much greater than the
CID of a still-live command: consider for example an UPDATE that is
firing triggers as it goes, and each trigger executes some queries.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Phantom Command ID

2006-09-21 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Another question is, what should cmin and cmax system columns return?

If we're going to fool with these, I'd like to renew the suggestion I
made awhile back that none of the system columns should have explicit
entries in pg_attribute, but rather their lookup should be special-cased
in the parser.  And whatever we do with cmin/cmax, the infomask should
become exposed as well.

 2. Cmin and cmax return the value that's stored on disk, whether or not 
 they make sense.

This is pretty much the approach we've been taking with the past overlay
hacks --- what is returned is not always what you might expect from the
column header.  I think this is tolerable as long as the infomask can be
examined to determine what's really being shown, but it's probably not
the cleanest way.

 3. Remove cmin and cmax system columns to avoid confusion, and replace 
 them with cminmax, that returns what's on disk.

Don't forget it could be xvac or cphantom too ;-).  I think I agree
with this approach but not that particular name exactly.  I'm inclined
to suggest that we just continue to use cmin for the field --- cmax
could be dropped or become an alias for cmin.

A fourth possibility is to abandon the rule that these columns never
read as null, and to have them show their contents when meaningful
(as determined by infomask) and null otherwise.  However, then we'd have
to support all of cmin, cmax, cphantom, and xvac in order to ensure that
we always have a column that can show the on-disk value.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Phantom Command ID

2006-09-20 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 A big question is, do we need to implement spilling to disk?

My thought is no, at least not in the first cut ... this is something
that can be added later if it proves critical, and right at the moment
my guess is that it never will.  The data structure design sounds fine.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Phantom Command ID

2006-09-20 Thread Jim C. Nasby
On Wed, Sep 20, 2006 at 04:02:00PM -0400, Tom Lane wrote:
 Heikki Linnakangas [EMAIL PROTECTED] writes:
  A big question is, do we need to implement spilling to disk?
 
 My thought is no, at least not in the first cut ... this is something
 that can be added later if it proves critical, and right at the moment
 my guess is that it never will.  The data structure design sounds fine.

What would the failure mode be? Would we just keep going until the box
ran out of memory? I think it'd be better to have some kind of hard
limit so that a single backend can't grind a production server into a
swap-storm. (Arguably, not having a limit is exposing a DoS
vulnerability).
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Phantom Command ID

2006-09-20 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 What would the failure mode be? Would we just keep going until the box
 ran out of memory? I think it'd be better to have some kind of hard
 limit so that a single backend can't grind a production server into a
 swap-storm. (Arguably, not having a limit is exposing a DoS
 vulnerability).

[ shrug... ]  If we tried to guarantee such a thing we'd be putting
arbitrary limits into hundreds if not thousands of different bits of the
backend.  I think the correct answer for an admin who is worried about
such a thing is to make sure that the process ulimit is a sufficiently
small fraction of the machine's available RAM.  Only if we can't
gracefully handle running up against ulimit is it our problem (hence,
we have a stack-size overflow check, but not any such thing for data size).

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Phantom Command ID

2006-09-20 Thread Jim C. Nasby
On Wed, Sep 20, 2006 at 04:22:47PM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  What would the failure mode be? Would we just keep going until the box
  ran out of memory? I think it'd be better to have some kind of hard
  limit so that a single backend can't grind a production server into a
  swap-storm. (Arguably, not having a limit is exposing a DoS
  vulnerability).
 
 [ shrug... ]  If we tried to guarantee such a thing we'd be putting
 arbitrary limits into hundreds if not thousands of different bits of the
 backend.  I think the correct answer for an admin who is worried about
 such a thing is to make sure that the process ulimit is a sufficiently
 small fraction of the machine's available RAM.  Only if we can't
 gracefully handle running up against ulimit is it our problem (hence,
 we have a stack-size overflow check, but not any such thing for data size).

I didn't realize we had a lot of ways a backend could run a machine out
of memory, or at least ways that didn't have some kind of limit (ie:
work_mem). Are any of them very easy to run into?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Phantom Command ID

2006-09-20 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 I didn't realize we had a lot of ways a backend could run a machine out
 of memory, or at least ways that didn't have some kind of limit (ie:
 work_mem). Are any of them very easy to run into?

work_mem has nothing to do with trying to guarantee no swapping DoS.
If it did, it wouldn't be USERSET, and it wouldn't be per query step.
The fact is that ulimit does what you want in that regard already;
why should we try to reinvent that wheel?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq