Re: [HACKERS] Phantom Command ID
Tom Lane wrote: We could rename pg_attribute as pg_userattribute, and remove all the system attributes from that. To stay backwards-compatible, we could have a pg_attribute view on top of that contained the system attributes as well. I don't really think this is necessary. How many client programs have you seen that don't explicitly exclude attnum0 anyway? The places that will need work are inside the backend, and a view won't help them. None, there probably isn't any client programs like that. It would be nice for programs to be able to discover what system attributes there is, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Phantom Command ID
Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: If we're going to fool with these, I'd like to renew the suggestion I made awhile back that none of the system columns should have explicit entries in pg_attribute, but rather their lookup should be special-cased in the parser. What was the original reason for the proposal? Space savings? Partly that, and partly that it'd make it much easier to alter the set of system attributes. We could rename pg_attribute as pg_userattribute, and remove all the system attributes from that. To stay backwards-compatible, we could have a pg_attribute view on top of that contained the system attributes as well. I don't really think this is necessary. How many client programs have you seen that don't explicitly exclude attnum0 anyway? The places that will need work are inside the backend, and a view won't help them. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Phantom Command ID
On Tue, Sep 26, 2006 at 12:35:54PM +0100, Heikki Linnakangas wrote: Tom Lane wrote: We could rename pg_attribute as pg_userattribute, and remove all the system attributes from that. To stay backwards-compatible, we could have a pg_attribute view on top of that contained the system attributes as well. I don't really think this is necessary. How many client programs have you seen that don't explicitly exclude attnum0 anyway? The places that will need work are inside the backend, and a view won't help them. None, there probably isn't any client programs like that. It would be nice for programs to be able to discover what system attributes there is, though. +1; we need to have some way for users to find that info out, and I can't think of a better way than pg_attribute. If we want to create a set of views that are more human friendly I'm all for it (it's why we started the newsysviews project afterall), but I don't know if y'all want to open that can of worms back up. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Phantom Command ID
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Another question is, what should cmin and cmax system columns return? If we're going to fool with these, I'd like to renew the suggestion I made awhile back that none of the system columns should have explicit entries in pg_attribute, but rather their lookup should be special-cased in the parser. And whatever we do with cmin/cmax, the infomask should become exposed as well. I just looked back at that discussion in the archives (http://archives.postgresql.org/pgsql-hackers/2005-02/msg00615.php). What was the original reason for the proposal? Space savings? We could rename pg_attribute as pg_userattribute, and remove all the system attributes from that. To stay backwards-compatible, we could have a pg_attribute view on top of that contained the system attributes as well. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Phantom Command ID
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: A big question is, do we need to implement spilling to disk? My thought is no, at least not in the first cut ... this is something that can be added later if it proves critical, and right at the moment my guess is that it never will. The data structure design sounds fine. I thought so too. We could also limit the size of the hash table, which takes up most of the memory, and only keep the latest phantom cids there. Presumably, if current command id is 1000, you're not likely to set cmax to 500 on any tuple in that transaction anymore. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Phantom Command ID
Another question is, what should cmin and cmax system columns return? If we overlay cmin and cmax, cmin and cmax on-disk will always be the same value. And with phantom cids, it wouldn't be meaningful outside the inserting/deleting transaction. The options that I can think of are: 1. Only return cmin and cmax when they mean something, that is within the inserting / deleting transaction. This is not good if you want to use them for debugging (and what other use do they have?) 2. Cmin and cmax return the value that's stored on disk, whether or not they make sense. 3. Remove cmin and cmax system columns to avoid confusion, and replace them with cminmax, that returns what's on disk. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Phantom Command ID
Heikki Linnakangas [EMAIL PROTECTED] writes: We could also limit the size of the hash table, which takes up most of the memory, and only keep the latest phantom cids there. Presumably, if current command id is 1000, you're not likely to set cmax to 500 on any tuple in that transaction anymore. The downside of that though is that if you did generate any such, you'd assign a fresh (duplicate) phantom cid --- so you're bloating the array in exchange for reducing the hash size. It is quite easy to have current command counter much greater than the CID of a still-live command: consider for example an UPDATE that is firing triggers as it goes, and each trigger executes some queries. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Phantom Command ID
Heikki Linnakangas [EMAIL PROTECTED] writes: Another question is, what should cmin and cmax system columns return? If we're going to fool with these, I'd like to renew the suggestion I made awhile back that none of the system columns should have explicit entries in pg_attribute, but rather their lookup should be special-cased in the parser. And whatever we do with cmin/cmax, the infomask should become exposed as well. 2. Cmin and cmax return the value that's stored on disk, whether or not they make sense. This is pretty much the approach we've been taking with the past overlay hacks --- what is returned is not always what you might expect from the column header. I think this is tolerable as long as the infomask can be examined to determine what's really being shown, but it's probably not the cleanest way. 3. Remove cmin and cmax system columns to avoid confusion, and replace them with cminmax, that returns what's on disk. Don't forget it could be xvac or cphantom too ;-). I think I agree with this approach but not that particular name exactly. I'm inclined to suggest that we just continue to use cmin for the field --- cmax could be dropped or become an alias for cmin. A fourth possibility is to abandon the rule that these columns never read as null, and to have them show their contents when meaningful (as determined by infomask) and null otherwise. However, then we'd have to support all of cmin, cmax, cphantom, and xvac in order to ensure that we always have a column that can show the on-disk value. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Phantom Command ID
Heikki Linnakangas [EMAIL PROTECTED] writes: A big question is, do we need to implement spilling to disk? My thought is no, at least not in the first cut ... this is something that can be added later if it proves critical, and right at the moment my guess is that it never will. The data structure design sounds fine. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Phantom Command ID
On Wed, Sep 20, 2006 at 04:02:00PM -0400, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: A big question is, do we need to implement spilling to disk? My thought is no, at least not in the first cut ... this is something that can be added later if it proves critical, and right at the moment my guess is that it never will. The data structure design sounds fine. What would the failure mode be? Would we just keep going until the box ran out of memory? I think it'd be better to have some kind of hard limit so that a single backend can't grind a production server into a swap-storm. (Arguably, not having a limit is exposing a DoS vulnerability). -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Phantom Command ID
Jim C. Nasby [EMAIL PROTECTED] writes: What would the failure mode be? Would we just keep going until the box ran out of memory? I think it'd be better to have some kind of hard limit so that a single backend can't grind a production server into a swap-storm. (Arguably, not having a limit is exposing a DoS vulnerability). [ shrug... ] If we tried to guarantee such a thing we'd be putting arbitrary limits into hundreds if not thousands of different bits of the backend. I think the correct answer for an admin who is worried about such a thing is to make sure that the process ulimit is a sufficiently small fraction of the machine's available RAM. Only if we can't gracefully handle running up against ulimit is it our problem (hence, we have a stack-size overflow check, but not any such thing for data size). regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Phantom Command ID
On Wed, Sep 20, 2006 at 04:22:47PM -0400, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: What would the failure mode be? Would we just keep going until the box ran out of memory? I think it'd be better to have some kind of hard limit so that a single backend can't grind a production server into a swap-storm. (Arguably, not having a limit is exposing a DoS vulnerability). [ shrug... ] If we tried to guarantee such a thing we'd be putting arbitrary limits into hundreds if not thousands of different bits of the backend. I think the correct answer for an admin who is worried about such a thing is to make sure that the process ulimit is a sufficiently small fraction of the machine's available RAM. Only if we can't gracefully handle running up against ulimit is it our problem (hence, we have a stack-size overflow check, but not any such thing for data size). I didn't realize we had a lot of ways a backend could run a machine out of memory, or at least ways that didn't have some kind of limit (ie: work_mem). Are any of them very easy to run into? -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Phantom Command ID
Jim C. Nasby [EMAIL PROTECTED] writes: I didn't realize we had a lot of ways a backend could run a machine out of memory, or at least ways that didn't have some kind of limit (ie: work_mem). Are any of them very easy to run into? work_mem has nothing to do with trying to guarantee no swapping DoS. If it did, it wouldn't be USERSET, and it wouldn't be per query step. The fact is that ulimit does what you want in that regard already; why should we try to reinvent that wheel? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq