Re: [HACKERS] Partitioning option for COPY
On Sun, 22 Nov 2009, Emmanuel Cecchet wrote:

> As I explained to Tom, if the after row trigger is called asynchronously I
> get a relcache leak on the child table at the end of the copy operation.
> If the trigger is called synchronously (like a before row trigger) it
> works fine. Also, calling the after row trigger synchronously allows me to
> detect any potential problem between the actions of the trigger and the
> routing decision. I am open to any suggestion for a more elegant solution.

Well, I think there are still some issues there that at least need to be
better documented. For example:

create or replace function fi() returns trigger as '
begin
  if (NEW.p is not null) then
    if (select count(*) from i where i.i = NEW.p) = 0 then
      raise exception ''No parent'';
    end if;
  end if;
  return NEW;
end;
' language 'plpgsql';

create or replace function fc() returns trigger as '
begin
  if (NEW.p is not null) then
    if (select count(*) from c where c.i = NEW.p) = 0 then
      raise exception ''No parent'';
    end if;
  end if;
  return NEW;
end;
' language 'plpgsql';

create or replace function fp() returns trigger as '
begin
  if (NEW.p is not null) then
    if (select count(*) from p where p.i = NEW.p) = 0 then
      raise exception ''No parent'';
    end if;
  end if;
  return NEW;
end;
' language 'plpgsql';

drop table i;
drop table c;
drop table p cascade;

create table i(i int, p int);
create trigger tri after insert on i for each row execute procedure fi();
create table c(i int, p int);
create trigger trc after insert on c for each row execute procedure fc();
create table p(i int, p int);
create table p1 (check (i > 0 and i <= 10)) inherits (p);
create table p2 (check (i > 10 and i <= 20)) inherits (p);
create table p3 (check (i > 20 and i <= 30)) inherits (p);
create trigger trp1 after insert on p1 for each row execute procedure fp();
create trigger trp2 after insert on p2 for each row execute procedure fp();
create trigger trp3 after insert on p3 for each row execute procedure fp();

insert into i values (1,3),(2,1),(3,NULL);
copy c from stdin;
1	3
2	1
3	\N
\.

copy p from stdin with (partitioning);
1	3
2	1
3	\N
\.

This gives me a successful load into i and c, but not into p with the
current patch AFAICS, while a load where the 3 row is first does load.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partitioning option for COPY
On Sun, 22 Nov 2009, Emmanuel Cecchet wrote:

> Stephan Szabo wrote:
>> On Sun, 22 Nov 2009, Emmanuel Cecchet wrote:
>>> As I explained to Tom, if the after row trigger is called asynchronously
>>> I get a relcache leak on the child table at the end of the copy
>>> operation. If the trigger is called synchronously (like a before row
>>> trigger) it works fine. Also, calling the after row trigger
>>> synchronously allows me to detect any potential problem between the
>>> actions of the trigger and the routing decision. I am open to any
>>> suggestion for a more elegant solution.
>>
>> Well, I think there are still some issues there that at least need to be
>> better documented. For example, [SQL example repeated upthread: after row
>> triggers fi(), fc(), and fp(), each checking for a parent row, attached
>> to a plain table i, a plain table c, and the partitions p1..p3 of p]
>> ... gives me a successful load into i and c, but not into p with the
>> current patch AFAICS, while a load where the 3 row is first does load.
>
> Well, if you don't insert anything in p (the table -- try to avoid using
> the same name for the table and the column in an example), copy will
> insert (1,3) into p1, and then the trigger will evaluate
>   select count(*) from p where p.i = NEW.p
> NEW.p is 3 and the only p.i available is 1. This should return 0 rows and
> raise the exception. That seems normal to me. The only reason it works
> for i is that you inserted the values before the copy. Am I missing
> something?

I believe so, unless I am. There are three separate cases being run for
comparison purposes: a multi-row insert on i where an after trigger on i
checks the parents within i; a copy on c where an after trigger on c checks
the parents within c; and a copy on p (with inheritance) where an after
trigger on p* checks the parents within the p hierarchy.

So, in the case of the multi-row insert, it inserts (1,3), but it doesn't
immediately check; it inserts (2,1) and (3,NULL) before running the checks.
The same seems to happen for the base copy. Copy with inheritance seems to
be working differently. That may or may not be okay, but if it's different
it needs to be prominently mentioned in the documentation.
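The timing difference being described can be sketched outside SQL. This is a
toy Python model (the function names and data layout are invented for
illustration; this is not PostgreSQL code): a "deferred" load runs all the
after-row checks at the end of the statement, while an "immediate" load
checks each row as it is inserted, which is why the (1,3),(2,1),(3,NULL)
batch only succeeds in the first mode.

```python
# Toy model of after-row trigger timing during a multi-row load.
# Each row (i, p) must have a parent row whose i equals its p; a NULL
# (None) parent is allowed -- mirroring the fi()/fp() triggers above.

def load_deferred(rows):
    """Insert every row first, then run the after-row checks (what the
    plain INSERT and plain COPY appear to do)."""
    table = list(rows)
    for _, p in table:
        if p is not None and not any(i == p for i, _ in table):
            raise ValueError("No parent")
    return table

def load_immediate(rows):
    """Run each after-row check as soon as its row is inserted (what COPY
    with partition routing appears to do)."""
    table = []
    for row in rows:
        table.append(row)
        _, p = row
        if p is not None and not any(i == p for i, _ in table):
            raise ValueError("No parent")
    return table

rows = [(1, 3), (2, 1), (3, None)]
load_deferred(rows)                # succeeds: by check time, row 3 exists
try:
    load_immediate(rows)           # fails on (1, 3): parent 3 not loaded yet
except ValueError as e:
    print(e)                       # -> No parent
load_immediate([(3, None), (1, 3), (2, 1)])   # succeeds with the 3 row first
```

Under this model, whether a load succeeds depends only on when the checks
fire, not on the rows themselves, which is the inconsistency being reported.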
Re: [HACKERS] Trigger with WHEN clause (WIP)
On Fri, 16 Oct 2009, Tom Lane wrote:

> I note BTW that we have some ad-hoc logic already that arranges to
> suppress queuing of AFTER events for FK triggers, if the FK column value
> has not changed. It might be interesting to look at whether that hack
> could be unified with the user-accessible feature. It's essentially a
> WHEN OLD.x IS NOT DISTINCT FROM NEW.x test.

Note that one of those cases (RI_TRIGGER_FK) is a bit special due to the
transaction id test. It might be worth seeing whether a better solution is
possible to cover the case in the comment, though, if the above becomes
possible.
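The suppression test Tom mentions is just SQL's null-safe comparison. Here
is a hedged Python sketch of the idea (the function names are invented for
illustration, not PostgreSQL internals): IS NOT DISTINCT FROM treats two
NULLs as "not distinct", unlike ordinary SQL equality where NULL = NULL is
NULL, so an unchanged FK value -- including an unchanged NULL -- suppresses
the event.

```python
# Null-safe comparison: like SQL's IS NOT DISTINCT FROM, two NULLs (None)
# compare as "not distinct"; otherwise fall back to ordinary equality.
def is_not_distinct(a, b):
    if a is None or b is None:
        return a is None and b is None
    return a == b

def should_queue_fk_event(old_fk, new_fk):
    # Skip queuing the AFTER trigger event when the FK value did not change,
    # i.e. the WHEN OLD.x IS NOT DISTINCT FROM NEW.x condition holds.
    return not is_not_distinct(old_fk, new_fk)

print(should_queue_fk_event(42, 42))      # -> False: unchanged, suppressed
print(should_queue_fk_event(None, None))  # -> False: two NULLs count as unchanged
print(should_queue_fk_event(None, 42))    # -> True: the FK value changed
```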
Re: [HACKERS] cardinality()
On Sun, 1 Mar 2009, Tom Lane wrote:

> I wrote:
>> The standard doesn't have multi-dimensional arrays, so it's entirely
>> possible that somewhere in it there is wording that makes cardinality()
>> equivalent to the length of the first dimension. But I concur with
>> Andrew that this is flat wrong when extended to m-d arrays.
>
> I poked around in the SQL:2008 draft a bit. AFAICT the most precise
> statement about cardinality() is in 6.27 <numeric value function>:
>
>   <cardinality expression> ::=
>     CARDINALITY <left paren> <collection value expression> <right paren>
>
>   7) The result of <cardinality expression> is the number of elements of
>      the result of the <collection value expression>.
>
> Now the standard is only considering 1-D arrays, but I fail to see any
> way that it could be argued that the appropriate reading of "number of
> elements" for a multi-D array is the length of the first dimension.

Does the standard allow you to make arrays of arrays, for example with
something like ARRAY[ARRAY[1,2], ARRAY[3,4]]? If so, it might be possible
that cardinality(that expression) would be returning the number of arrays
in the outer array.
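Modeling an array-of-arrays with nested Python lists shows the two readings
under debate (this is only an illustration of the counting question, not a
claim about PostgreSQL's array implementation):

```python
# ARRAY[ARRAY[1,2], ARRAY[3,4]] modeled as a nested list.
arr = [[1, 2], [3, 4]]

# Reading 1: "number of elements" of the outer collection -- for a true
# array-of-arrays, the elements are the two inner arrays.
outer_count = len(arr)

# Reading 2: total number of scalar elements, which is the natural reading
# of "number of elements" for a single multi-dimensional array.
total_count = sum(len(inner) for inner in arr)

print(outer_count, total_count)  # -> 2 4
```

The question in the thread is which of these two numbers cardinality()
should report when the argument is genuinely multi-dimensional.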
Re: [HACKERS] incoherent view of serializable transactions
On Tue, 23 Dec 2008, Kevin Grittner wrote:

> The page locking provides this because every index page or data page the
> serializable transaction looks at is locked against updates until the end
> of the transaction. If it can see all the COLUMN=0 rows through an index,
> the index locks protect the transaction. If a table scan is required, the
> entire table is locked against all modifications. (That's right, it is
> not unusual to have entire tables locked against any modification until
> the end of a database transaction.)

Well, predicate locking for serializable also should only lock the
appropriate conditions. Getting a deadlock between two serializable
transactions for conditions that can be serialized would seemingly also be
disallowed by the definition of serializable, since there would exist no
serial ordering of the transactions that has that effect.
Re: [HACKERS] [GENERAL] deadlock with truncate and foreing keys
On Mon, 18 Feb 2008, Tom Lane wrote:

> Alexey Nalbat [EMAIL PROTECTED] writes:
>> create table t1 ( id integer primary key, name text );
>> create table t2 ( id integer references t1 );
>> insert into t1 values ( 1 );
>> insert into t2 values ( 1 );
>>
>> Then two concurrent transactions start.
>>
>> /* 1 */ begin;
>> /* 1 */ truncate t2;
>> /* 2 */ begin;
>> /* 2 */ update t1 set name='foo' where id=1;
>> /* 1 */ insert into t2 values ( 1 );
>>
>> Here we have deadlock.
>
> Hmm, this happens because RI_FKey_keyequal_upd_pk does
>   fk_rel = heap_open(riinfo.fk_relid, AccessShareLock);
> but right offhand I see no reason for it to do so --- it doesn't *do*
> anything with fk_rel except close it again. Likewise
> RI_FKey_keyequal_upd_fk doesn't seem to really need to touch the pk_rel.
> Is there something I'm missing in that? Maybe this is a vestige of
> earlier coding that did need to touch both rels to perform the keys-equal
> check?

Probably something like that -- maybe ri_BuildQueryKeyFull might have
needed it open. Actually, I'm wondering if the ri_BuildQueryKeyFull call is
also unnecessary now; I don't think we ever use the qkey that comes out of
it, unless I'm missing some code.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your message can get
through to the mailing list cleanly
Re: [HACKERS] Polymorphic arguments and composite types
On Fri, 5 Oct 2007, Simon Riggs wrote:

> On Fri, 2007-10-05 at 11:24 -0700, Stephan Szabo wrote:
>> On Fri, 5 Oct 2007, Simon Riggs wrote:
>>> On Fri, 2007-10-05 at 10:59 -0700, Stephan Szabo wrote:
>>>> On Fri, 5 Oct 2007, Simon Riggs wrote:
>>>>> On Fri, 2007-10-05 at 10:32 -0700, Stephan Szabo wrote:
>>>>>> On Fri, 5 Oct 2007, Simon Riggs wrote:
>>>>>>> Because we already do exactly that here:
>>>>>>>   select 1, (select col2 from c), 3;
>>>>>>> The inner select returns a ROW, yet we treat it as a single column
>>>>>>> value.
>>>>>> The inner select does not return a row. It's not a row subquery,
>>>>>> it's a scalar subquery.
>>>>> Thanks Stephan, Tom already explained that. My comments above were in
>>>>> response to "Why would you think that?"
>>>> Right, but I guess I couldn't see why you would consider that the same
>>>> as treating a rowtype as a scalar, because when I look at that my
>>>> brain converts that to a scalar subquery, so I guess I simply see the
>>>> scalar. If we supported select 1, (select 2,3), select 4 giving
>>>> something like (1,(2,3),4), I'd also have confusion over the case, but
>>>> that should error.
>>> Well, my brain didn't... All I've said was that we should document it,
>>> to help those people that don't know their SQL standard as well as the
>>> best people on this list.
>> Where would you document this beyond 4.2, though? While I don't exactly
>> like the wording of 4.2.9, it seems like it's already trying to say
>> that.
> Yeh, it does, but you're forgetting that my original complaint was that
> you couldn't use it in an ANY clause, which 4.2 does not exclude. Bearing
> in mind you can use a scalar subquery in lots of places, I thought it
> worth reporting.

Well, but I'd argue that we're now talking about separate issues. The first
is how scalar subqueries act, as far as not being a rowtype. The second is
related to the question of ANY and scalar subqueries specifically. The
third is related to where you can use scalar subqueries.

> The ANY clause at 9.19.4 mentions a subquery, but doesn't say it can't be
> a scalar subquery; it doesn't restrict this to non-scalar subqueries.

While it's true that it isn't a scalar subquery (although it's not a
restriction on the kind of subquery -- it's the definition of what
(select ...) turns into when used there), I don't see how the text doesn't
basically say that op ANY (subquery returning a single array) works the way
it currently does. I think it'd be more applicable to mention in the array
one that using a subquery as the right-hand side turns it into the other
form. I'm not convinced it's necessary, but also I'd think that one general
mention would likely be better than separate ones in each of ANY and ALL.
It might be reasonable to try to note where subqueries are scalar
subqueries, but I think that'll be prone to being wrong or misinterpreted
as well.

> Searching in Arrays, 8.14.5 doesn't say it can't be a subquery either.

True, although I don't know if it's right to mention there, since that
section appears to link to the other section, saying that the other section
describes the method.

> Section 9.20.3 mentions ANY (array expression). The term array expression
> is not defined, nor is there a link to where it is defined, nor is the
> term indexed.

I'm not sure why we're using a separate term for that.

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
Re: [HACKERS] Polymorphic arguments and composite types
On Sat, 6 Oct 2007, Simon Riggs wrote:

> On Sat, 2007-10-06 at 10:15 -0700, Stephan Szabo wrote:
>>> Yeh, it does, but you're forgetting that my original complaint was that
>>> you couldn't use it in an ANY clause, which 4.2 does not exclude.
>>> Bearing in mind you can use a scalar subquery in lots of places, I
>>> thought it worth reporting.
>> Well, but I'd argue that we're now talking about separate issues.
> It's simpler than that. I asked a question because the manual isn't
> specific on my original point. I'll do a doc patch to make sure nobody
> makes the same mistake I did, and we record all the good points people
> have made.
>>> Section 9.20.3 mentions ANY (array expression). The term array
>>> expression is not defined, nor is there a link to where it is defined,
>>> nor is the term indexed.
>> I'm not sure why we're using a separate term for that.
> The term array expression is used in the manual, but not defined.

Right. I meant: if those are the only uses, why did we use a specific term
"array expression" rather than relying on saying that the expression given
must have array type?
Re: [HACKERS] Polymorphic arguments and composite types
On Fri, 5 Oct 2007, Simon Riggs wrote:

> Because we already do exactly that here:
>   select 1, (select col2 from c), 3;
> The inner select returns a ROW, yet we treat it as a single column value.

The inner select does not return a row. It's not a row subquery, it's a
scalar subquery.
Re: [HACKERS] Polymorphic arguments and composite types
On Fri, 5 Oct 2007, Simon Riggs wrote:

> On Fri, 2007-10-05 at 10:32 -0700, Stephan Szabo wrote:
>> On Fri, 5 Oct 2007, Simon Riggs wrote:
>>> Because we already do exactly that here:
>>>   select 1, (select col2 from c), 3;
>>> The inner select returns a ROW, yet we treat it as a single column
>>> value.
>> The inner select does not return a row. It's not a row subquery, it's a
>> scalar subquery.
> Thanks Stephan, Tom already explained that. My comments above were in
> response to "Why would you think that?"

Right, but I guess I couldn't see why you would consider that the same as
treating a rowtype as a scalar, because when I look at that my brain
converts that to a scalar subquery, so I guess I simply see the scalar. If
we supported select 1, (select 2,3), select 4 giving something like
(1,(2,3),4), I'd also have confusion over the case, but that should error.
Re: [HACKERS] Polymorphic arguments and composite types
On Fri, 5 Oct 2007, Simon Riggs wrote:

> On Fri, 2007-10-05 at 10:59 -0700, Stephan Szabo wrote:
>> On Fri, 5 Oct 2007, Simon Riggs wrote:
>>> On Fri, 2007-10-05 at 10:32 -0700, Stephan Szabo wrote:
>>>> On Fri, 5 Oct 2007, Simon Riggs wrote:
>>>>> Because we already do exactly that here:
>>>>>   select 1, (select col2 from c), 3;
>>>>> The inner select returns a ROW, yet we treat it as a single column
>>>>> value.
>>>> The inner select does not return a row. It's not a row subquery, it's
>>>> a scalar subquery.
>>> Thanks Stephan, Tom already explained that. My comments above were in
>>> response to "Why would you think that?"
>> Right, but I guess I couldn't see why you would consider that the same
>> as treating a rowtype as a scalar, because when I look at that my brain
>> converts that to a scalar subquery, so I guess I simply see the scalar.
>> If we supported select 1, (select 2,3), select 4 giving something like
>> (1,(2,3),4), I'd also have confusion over the case, but that should
>> error.
> Well, my brain didn't... All I've said was that we should document it, to
> help those people that don't know their SQL standard as well as the best
> people on this list.

Where would you document this beyond 4.2, though? While I don't exactly
like the wording of 4.2.9, it seems like it's already trying to say that.
Re: [HACKERS] regexp_matches and regexp_split are inconsistent
On Fri, 10 Aug 2007, Tom Lane wrote:

> I noticed the following behavior in CVS HEAD, using a pattern that is
> capable of matching no characters:
>
> regression=# SELECT foo FROM regexp_matches('ab cde', $re$\s*$re$, 'g') AS foo;
>  foo
> -----
>  {}
>  {}
>  { }
>  {}
>  {}
>  {}
>  {}
> (7 rows)
>
> regression=# SELECT foo FROM regexp_split_to_table('ab cde', $re$\s*$re$) AS foo;
>  foo
> -----
>  a
>  b
>  c
>  d
>  e
> (5 rows)
>
> If you count carefully, you will see that regexp_matches() reports a
> match of the pattern at the start of the string and at the end of the
> string, and also just before 'c' (after the match to the single space).
> However, regexp_split() disregards these degenerate matches of the same
> pattern.
>
> Is this what we want? Arguably regexp_split is doing the most reasonable
> thing for its intended usage, but the strict definition of regexp
> matching seems to require what regexp_matches does. I think we need to
> either change one function to match the other, or else document the
> inconsistency. Thoughts?

I'm not sure how many languages do this, but at least Perl seems to work
similarly, which makes me guess that it's probably similar in a bunch of
languages. If it is, then we should probably just document the
inconsistency. Perl seems to document the split behavior with: "Empty
leading (or trailing) fields are produced when there are positive width
matches at the beginning (or end) of the string; a zero-width match at the
beginning (or end) of the string does not produce an empty field."
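For what it's worth, Python's re module finds the same seven matches as
regexp_matches() here, which supports the guess that this behavior is
common across languages (a quick spot check, not a survey of every regex
engine):

```python
import re

# A pattern that can match the empty string finds degenerate matches at the
# start of the string, just after the real space match (before 'c'), and at
# the end -- seven matches in all, only one of them non-empty.
matches = re.findall(r'\s*', 'ab cde')
print(len(matches))  # -> 7
print(matches[2])    # the single real match, one space

# Python 3.7+ re.split, by contrast, keeps the empty fields that
# regexp_split_to_table (and Perl's split, per its documentation) drops.
pieces = re.split(r'\s*', 'ab cde')
```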
Re: [HACKERS] Can't ri_KeysEqual() consider two nulls as equal?
On Tue, 17 Apr 2007, Tom Lane wrote:

> A recent discussion led me to the idea that FK triggers are fired
> unnecessarily during an UPDATE if the foreign-key column(s) contain any
> NULLs, because ri_KeysEqual() treats two nulls as unequal, and therefore
> we conclude the row has changed when it has not. I claim that both
> ri_KeysEqual() and ri_OneKeyEqual() could consider two nulls to be equal.
> Furthermore it seems like ri_AllKeysUnequal() should do so too; the case
> can't arise at the moment because the sole caller already knows that one
> of the key sets contains no nulls, but if this weren't so, the
> optimization would be actively wrong if we concluded that two nulls were
> unequal.

Do you have any suggestions for alternate names? Keeping them using "Equal"
seems to be dangerous, since people would likely expect them to act like
normal equality (with nulls being different).
Re: [HACKERS] Can't ri_KeysEqual() consider two nulls as equal?
On Tue, 17 Apr 2007, Tom Lane wrote:

> A recent discussion led me to the idea that FK triggers are fired
> unnecessarily during an UPDATE if the foreign-key column(s) contain any
> NULLs, because ri_KeysEqual() treats two nulls as unequal, and therefore
> we conclude the row has changed when it has not. I claim that both
> ri_KeysEqual() and ri_OneKeyEqual() could consider two nulls to be equal.

For ri_KeysEqual, I think so, since we actually aren't testing equality so
much as a difference between the rows that might invalidate the constraint.
And it does seem like, with the code in trigger.c, the other checks in the
_upd functions in ri_triggers.c are redundant, but I'm vaguely afraid I've
forgotten something.

For ri_OneKeyEqual, I think that, like ri_AllKeysUnequal, we know the old
row doesn't have NULLs in the places it's currently called (although I
don't think this is commented). It seems like it should stay consistent
with ri_KeysEqual, and not putting out the "foo = NULL" or "foo = DEFAULT"
seems better for the current calling cases besides.

> Furthermore it seems like ri_AllKeysUnequal() should do so too; the case
> can't arise at the moment because the sole caller already knows that one
> of the key sets contains no nulls, but if this weren't so, the
> optimization would be actively wrong if we concluded that two nulls were
> unequal.

Hmm, probably so, although at least this one does appear to be commented at
the calling site to mention that it's depending on the fact that there are
no NULLs.
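The point that this is a "did the key change?" test rather than SQL
equality can be sketched in a few lines of Python (the function name echoes
the C routine, but the code is purely illustrative, not PostgreSQL
internals):

```python
def keys_equal(old_key, new_key):
    """ri_KeysEqual-style test: treat the row as unchanged for RI purposes,
    with a NULL (None) in the same position of both keys counting as equal,
    unlike SQL equality where NULL = NULL is not true."""
    return all(
        (o is None and n is None) or (o is not None and n is not None and o == n)
        for o, n in zip(old_key, new_key)
    )

# Proposed behavior: an UPDATE that leaves a (1, NULL) key untouched counts
# as unchanged, so the FK trigger work can be skipped; under the old
# nulls-are-unequal rule it would have looked changed and fired anyway.
print(keys_equal([1, None], [1, None]))  # -> True: skip the RI check
print(keys_equal([1, None], [1, 2]))     # -> False: the key really changed
```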
Re: [HACKERS] Eliminating unnecessary left joins
On Wed, 11 Apr 2007, Jim Nasby wrote:

> I agree with others that the way that query is constructed is a bit odd,
> but it does bring another optimization to mind: when doing an inner join
> between a parent and child table with RI defined between them, if the
> query only refers to the child table, you can drop the parent table from
> the join, because each row in the child table must have one and only one
> row in the parent.

I don't think that's quite true without qualifications. First, I think it
needs to be an immediate constraint (and I don't remember how we handle set
constraints inside functions that might be called from a statement, so it
might need to be not deferrable). Second, I think you also need to take
care of NULLs, since child rows with NULLs in the key pass the constraint
but have no rows in the parent and would get culled by the inner join.
Also, there's a possible issue that constraints do not actually guarantee
that they always hold true, merely that they hold true at particular times.
I don't know if it's possible to get a statement executed such that it
would see the table state between the action and the constraint check, or
if such is allowed by spec.
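The NULL caveat is easy to see with a toy join (plain Python standing in
for the planner's reasoning; the table contents are invented and this is
not PostgreSQL code): a child row whose FK is NULL satisfies the foreign
key but produces no output row under an inner join with the parent.

```python
# Child rows reference parent ids; a NULL (None) FK passes the constraint
# but has no join partner in the parent table.
parent = [{"id": 1}, {"id": 2}]
child = [
    {"id": 10, "parent_id": 1},
    {"id": 11, "parent_id": None},  # allowed by the FK, culled by the join
]

inner_join = [c for c in child
              if any(p["id"] == c["parent_id"] for p in parent)]

print(len(child))       # -> 2
print(len(inner_join))  # -> 1: the NULL row disappeared
# So dropping the parent from the join is only equivalent if the FK columns
# are also NOT NULL (and the constraint cannot be deferred mid-statement).
```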
Re: [HACKERS] Column storage positions
On Wed, 21 Feb 2007, Martijn van Oosterhout wrote:

> On Wed, Feb 21, 2007 at 12:06:30PM -0500, Phil Currier wrote:
>> Well, for two reasons: 1) If you have a table with one
>> very-frequently-accessed varchar() column and several
>> not-frequently-accessed int columns, it might actually make sense to put
>> the varchar column first. The system won't always be able to make the
>> most intelligent decision about table layout.
> Umm, the point of the exercise is that if you know there are int columns,
> then you can skip over them, whereas you can never skip over a varchar
> column. So there isn't really any situation where it would be better to
> put the varchar first.

IIRC, in the first message in this thread, or another recent thread of this
type, someone tried a reordering example with alternating smallints and
varchar() columns and found that the leftmost varchar was actually slower
to access after reordering, so I'm not sure that we can say there isn't a
situation where it would affect things.
Re: [HACKERS] Column storage positions
On Wed, 21 Feb 2007, Alvaro Herrera wrote:

> Stephan Szabo escribió:
>> IIRC, in the first message in this thread, or another recent thread of
>> this type, someone tried a reordering example with alternating smallints
>> and varchar() and found that the leftmost varchar was actually slower to
>> access after reordering, so I'm not sure that we can say there isn't a
>> situation where it would affect things.
> Offsets are cached in tuple accesses, but the caching is obviously
> disabled for all attributes past any variable-length attribute. So if you
> put a varlena attr in front, caching is completely disabled for all attrs
> (but that first one). The automatic reordering algorithm must put all
> fixed-len attrs at the front, so that their offsets (and that of the
> first variable-length attr) can be cached. Did I miss something in what
> you were trying to say? I assume you must already know this.

I think so. What I was mentioning was that I was pretty sure there was a
message from someone saying that they had actually tried something that did
this, and that they found leftmost-varchar access was slightly slower after
the reordering, although general access was faster. I believe the table
case was alternating smallint and varchar columns, but I don't know what
was tested for the retrieval. If that turns out to be supported by other
tests, then for some access patterns the rearranged version might be
slower.
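The caching rule Alvaro describes can be sketched as follows. This is a
simplification of attcacheoff-style behavior; the column names and widths
are made up for illustration and nothing here is actual PostgreSQL code:

```python
# Column layout: width in bytes for fixed-length columns, None for a
# variable-length (varlena) column.
columns = [("a", 4), ("b", 4), ("v", None), ("c", 4)]

def cacheable_offsets(cols):
    """Offsets are knowable in advance only up to and including the first
    variable-length column; every column after it must be found by walking
    the actual tuple, since the varlena's width differs per row."""
    offsets, pos = {}, 0
    for name, width in cols:
        offsets[name] = pos
        if width is None:
            break            # past the first varlena, offsets depend on data
        pos += width
    return offsets

print(cacheable_offsets(columns))
# 'c' is absent from the result, so putting a varlena first would leave
# every later column's offset uncacheable.
```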
Re: [HACKERS] Foreign keys for non-default datatypes, redux
On Mon, 12 Feb 2007, Tom Lane wrote:

> I wrote:
>> * Add an oid[] column to pg_constraint that stores the equality operator
>>   OIDs for a foreign-key constraint, in the same column order as
>>   conkey[] and confkey[].
> It turns out this isn't sufficient: ri_Check_Pk_Match() wants to generate
> PK = PK checks, and the PK = FK operator isn't the right one for that.

Ugh, right -- for modifications of the pk side with no action, to make sure
there isn't a new row with that key.

> The information I suggested adding to pg_constraint isn't enough to let
> it find out which operator is the right one. We could handle this in a
> couple of ways:
>
> 1. Add yet another column with PK=PK operator OIDs to pg_constraint.
>
> 2. Add a column with the underlying PK index's OID to pg_constraint, and
>    expect ri_Check_Pk_Match to dredge the info from that. This is
>    probably possible, but not exactly trivial because of
>    which-column-is-which considerations.
>
> 3. Leave pg_constraint alone and expect ri_Check_Pk_Match to look in
>    pg_depend to find out the underlying PK index, then proceed as in #2.
>
> From an efficiency standpoint #1 seems the best, and yet it seems a bit
> ugly. Not that the others aren't. Comments, other ideas?

I think #1, while ugly, is probably less ugly than the others, although I
guess it means even more work if the underlying type of the column is
changed. Is there any reason to think that in the future we might need more
such things for some constraints?
Re: [HACKERS] Foreign keys for non-default datatypes, redux
On Fri, 9 Feb 2007, Tom Lane wrote:

> Almost a year ago, we talked about the problem that referential integrity
> should be selecting comparison operators on the basis of b-tree index
> opclasses, instead of assuming that the appropriate operator is always
> named "=":
> http://archives.postgresql.org/pgsql-hackers/2006-02/msg00960.php
> http://archives.postgresql.org/pgsql-hackers/2006-03/msg00161.php
> I'm about to go off and implement that at last. To refresh folks' memory,
> what I think we agreed to was that at the time of definition of a
> foreign-key constraint, we should identify the specific equality operator
> to be used for (each column of) the constraint. The method for doing this
> is to be:
>
> * First, identify the unique index that is relied on to enforce
>   uniqueness of the PK entries (we do this already, of course).
>
> * Look to see if there is an equality operator in this index's opfamily
>   accepting exactly the PK and FK data types (ie, PK = FK). If so, use
>   that.
>
> * Else, check to see if there is an implicit promotion from the FK
>   datatype to the PK datatype. If so, use the equality operator PK = PK,
>   which must exist since the opfamily supports an index on the PK
>   datatype.
>
> * Else fail (this means that the present warning about inefficient
>   foreign keys will become a hard error).

I assume you're speaking of the version where we just change the
constraints to use statements with the OPERATOR() syntax and potential
casts, rather than the discussion at the end about changing the pk checks
to avoid planning entirely?

> My intention is that we'd record pg_depend entries making the RI
> constraint dependent on not only the index, but the specific operators to
> use. This would not have been too critical a year ago, given that
> opclasses were effectively immutable; but in the current opfamily design
> it's entirely likely that we'd select cross-type equality operators that
> are considered "loose" and potentially droppable from the opfamily. So we
> need dependencies to prevent the operators from disappearing out from
> under us. (Come to think of it, we might want to record dependencies on
> the casts too, if we're using implicit casts?)

I think we probably should, so the above seems reasonable to me.

> * Add an oid[] column to pg_constraint that stores the equality operator
>   OIDs for a foreign-key constraint, in the same column order as conkey[]
>   and confkey[].
>
> * Add an OID column to pg_trigger giving the OID of the constraint owning
>   the trigger (or 0 if none). Add this information to struct Trigger as
>   well, so that it gets passed to trigger functions. Given the
>   pg_constraint OID, the RI triggers could fetch the constraint row and
>   look at conkey[], confkey[], and the new operator oid[] array to
>   determine what they need to know.
>
> This would actually mean that they don't need pg_trigger.tgargs at all. I
> am pretty strongly tempted to stop storing anything in tgargs for RI
> triggers --- it's ugly, and updating the info during RENAME commands is a
> pain in the rear. On the other hand, removing it might break client-side
> code that expects to look at tgargs to learn about FK constraints. I'd
> personally think that pg_constraint is a lot easier to work with, but
> there might be some code out there left over from way back before
> pg_constraint existed --- anyone know of any such issue?

I'd say we probably want to keep the tgargs info for at least a version or
two after changing the implementation. Getting rid of using the args info
sounds like a good idea.

One side question: what should we do about the places in the current system
where it checks for the key sets being empty? AFAIK, we still don't
actually support letting you define a constraint that way, and I haven't
heard any complaints about that, and I'm not even sure that it actually
made it into the spec proper.
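Tom's three-step selection rule amounts to a small lookup procedure. The
Python sketch below is illustrative only: the opfamily contents and the
implicit-cast table are invented stand-ins, and the operator names merely
echo real PostgreSQL ones.

```python
# Hypothetical opfamily contents and implicit casts, for illustration only.
equality_ops = {("int8", "int8"): "int8eq", ("int8", "int4"): "int84eq"}
implicit_casts = {("int2", "int8")}  # FK type -> PK type promotions

def pick_fk_operator(pk_type, fk_type):
    # 1. Prefer an exact cross-type PK = FK operator from the PK index's
    #    opfamily.
    if (pk_type, fk_type) in equality_ops:
        return equality_ops[(pk_type, fk_type)]
    # 2. Else, if the FK type implicitly promotes to the PK type, use
    #    PK = PK, which must exist since the opfamily indexes the PK type.
    if (fk_type, pk_type) in implicit_casts:
        return equality_ops[(pk_type, pk_type)]
    # 3. Else fail: the "inefficient foreign key" warning becomes an error.
    raise LookupError("no usable equality operator for %s = %s"
                      % (pk_type, fk_type))

print(pick_fk_operator("int8", "int4"))  # -> int84eq (cross-type op exists)
print(pick_fk_operator("int8", "int2"))  # -> int8eq (via implicit promotion)
```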
Re: [HACKERS] Foreign keys for non-default datatypes, redux
On Sat, 10 Feb 2007, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: One side question is what should we do about the places in the current system where it checks for the key sets being empty? I don't see that this affects that either way. I can't quite imagine what the semantics would be, though --- there's no such thing as a unique constraint with no columns, so how can there be an RI constraint with none?

Well, the code currently has checks with comments based on SQL3 text AFAICT.

/* ----------
 * SQL3 11.9 <referential constraint definition>
 *  General rules 2) a):
 *      If Rf and Rt are empty (no columns to compare given)
 *      constraint is true if 0 < (SELECT COUNT(*) FROM T)
 *
 *  Note: The special case that no columns are given cannot
 *      occur up to now in Postgres, it's just there for
 *      future enhancements.
 * ----------
 */

The reason I was wondering is that it uses tgnargs == 4 as the check, and if we change the meanings of tgnargs, we'd need to change the check. Personally, I think we should probably just pull out the special case for now, but thought it should be brought up.
Re: [HACKERS] referential Integrity and SHARE locks
On Thu, 8 Feb 2007, Marc Munro wrote: Oops, forgot to include pgsql-hackers when I responded to this the first time. On Tue, 2007-06-02 at 20:53 -0500, Tom Lane wrote: Marc Munro [EMAIL PROTECTED] writes: The RI triggers currently fire when a record is updated. Under my proposal they would fire in the same way but before the record is locked rather than after. Or am I missing your point? IOW, some other transaction could update or delete the tuple meanwhile? Doesn't seem very promising. That other transaction, T1, would have run the same RI triggers and so would have the same parent records locked. That's not true in the case of delete, since the referencing table triggers are on insert and update. Second, the parent record locks are not exclusive which means that both can be granted, so I don't see how this stops the second from continuing before the first.
Re: [HACKERS] referential Integrity and SHARE locks
On Thu, 8 Feb 2007, Marc Munro wrote: On Thu, 2007-08-02 at 10:06 -0800, Stephan Szabo wrote: On Thu, 8 Feb 2007, Marc Munro wrote: . . . That other transaction, T1, would have run the same RI triggers and so would have the same parent records locked. That's not true in the case of delete, since the referencing table triggers are on insert and update. . . . Let me see if I have this scenario right: Transaction T1 updates child record C1, with RI causing the parent P1 to be locked before the child. In the meantime transaction T2, successfully deletes C1 as it has not yet been locked. (Please tell me if I have misunderstood what you are saying) Yes in this case, T1 must abort because the record it was going to update has disappeared from underneath it. I don't see how this is significantly different from the same race for the record if the table had no RI constraints. The only difference that I can see, is that T1 now has some locks that it must relinquish as the transaction aborts. . . . Second, the parent record locks are not exclusive which means that both can be granted, so I don't see how this stops the second from continuing before the first. I don't think this does stop the second from continuing before the first. What will stop it, is the eventual lock that is taken on the child (triggering) record. But at that point, you've already had to compose the new row in order to call the trigger for the ri check, right? So, one of those new rows will be out of date by the time it actually gets the lock. Presumably that means that you need to recalculate the new row, but you've already done a check and gotten a lock based on the old new row. Also, another big problem is the fact that SQL requires that the action already have happened before the check in cases where the constraint references the same table. The row being updated or inserted might reference a row that will be updated or inserted by a later action of the same statement. 
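The last point is easy to demonstrate in SQL: because the FK check runs as an AFTER trigger at the end of the statement, a single multi-row INSERT may legally reference a row that the same statement only inserts later, so the check cannot simply run row-by-row before a lock is taken:

```sql
CREATE TABLE node (
    id     int PRIMARY KEY,
    parent int REFERENCES node   -- self-referencing foreign key
);

-- The first row references id = 1, which only exists once the second
-- row is inserted; the check must run after the whole statement.
INSERT INTO node VALUES (2, 1), (1, NULL);
```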
Re: [HACKERS] Referential Integrity and SHARE locks
On Mon, 5 Feb 2007, Simon Riggs wrote: On Sun, 2007-02-04 at 09:38 +, Simon Riggs wrote: The TODO I was requesting you consider was this: Develop non-conflicting locking scheme to allow RI checks to co-exist peacefully with non-PK UPDATEs on the referenced table. That is, IMHO, a general statement of an important unresolved issue with our Referential Integrity implementation. That is in no way intended as any form of negative commentary on the excellent detailed work that has got us so far already. Well, if we really want to solve that completely then we really need column locking, or at least locking at the level of arbitrary (possibly overlapping) unique constraints, not just the PK because foreign keys don't necessarily reference the primary key. But the PK case is certainly the most common and it'd certainly be nice to cover that case. ... It occurs to me that if we had visibility in unique indexes, this would allow the index rows to be separately lockable to the main row. That's exactly what we need here. I've implemented a work-around using this principle, utilising RULEs and a duplicated PK column-only table. This still allows FK checks to work correctly, yet doesn't require the backend hack Csaba mentioned. My feeling is that more work in this area is required, even if we can't yet agree a TODO item. I actually like the general idea your TODO item had, although I would say non-referenced column update rather than non-PK update. Even if we put it far out due to questions about what would be acceptable implementation.
Re: [HACKERS] Referential Integrity and SHARE locks
On Sat, 3 Feb 2007, Simon Riggs wrote: On Fri, 2007-02-02 at 16:50 -0500, Tom Lane wrote: No, I don't. I think knowledge of which columns are in a PK is quite a few levels away from the semantics of row locking. To point out just one problem, what happens when you add or drop a PK? Or drop and replace with a different column set? Yes, I know dropping one requires exclusive lock on the table, but the transaction doing it could hold row locks within the table, and now it's very unclear what they mean. There are issues, yes. Dropping PKs is a very irregular occurrence nor is it likely to be part of a complex transaction. It wouldn't bother me to say that if a transaction already holds a RowExclusiveLock or a RowShareLock it cannot upgrade to an AccessExclusiveLock. The lock check seems like a strange constraint, given that it's not necessarily going to be anything that conflicts with the row locks. I'm not sure there'd be a better idea given this sort of scheme, but it still seems strange. The TODO I was requesting you consider was this: Develop non-conflicting locking scheme to allow RI checks to co-exist peacefully with non-PK UPDATEs on the referenced table. That is, IMHO, a general statement of an important unresolved issue with our Referential Integrity implementation. That is in no way intended as any form of negative commentary on the excellent detailed work that has got us so far already. Well, if we really want to solve that completely then we really need column locking, or at least locking at the level of arbitrary (possibly overlapping) unique constraints, not just the PK because foreign keys don't necessarily reference the primary key. But the PK case is certainly the most common and it'd certainly be nice to cover that case.
Re: [HACKERS] Referential Integrity and SHARE locks
On Fri, 2 Feb 2007, Simon Riggs wrote: It sounds like if we don't put a SHARE lock on the referenced table then we can end the transaction in an inconsistent state if the referenced table has concurrent UPDATEs or DELETEs. BUT those operations do impose locking rules back onto the referencing tables that would not be granted until after any changes to the referencing table complete, whereupon they would restrict or cascade. So an inconsistent state doesn't seem possible to me. What locking back to the referencing table are you thinking about? The row locks are insufficient because that doesn't prevent an insert of a new row that matches the criteria previously locked against AFAIK.
Re: [HACKERS] [SQL] Case Preservation disregarding case
On Tue, 31 Oct 2006, Chuck McDevitt wrote: We treated quoted identifiers as case-specific, as the spec requires. In the catalog, we stored TWO columns... The column name with case converted as appropriate (as PostgreSQL already does), used for looking up the attribute, And a second column, which was the column name with the case exactly as entered by the user. Wouldn't using that second column's value tend to often violate 5.2SR10 (at least that's the reference item in SQL92)? AFAICT, that rule basically says that the regular identifier is equivalent to the case-folded one for purposes of information and definition schema and similar purposes which seems like it would be intended to include things like column labeling for output. There's a little bit of flexibility there on both similar purposes and equivalence, though. 10) The identifier body of a regular identifier is equivalent to an identifier body in which every letter that is a lower-case letter is replaced by the equivalent upper-case letter or letters. This treatment includes determination of equivalence, representation in the Information and Definition Schemas, representation in the diagnostics area, and similar uses.
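The folding behavior under discussion is easy to see from psql (note that PostgreSQL folds regular identifiers to lower case where the spec's rule above says upper, a long-standing deliberate deviation):

```sql
CREATE TABLE t ("CamelCol" int, plaincol int);

-- PLAINCOL is a regular identifier, so it folds to plaincol and matches;
-- the quoted "CamelCol" is stored exactly as entered.
SELECT "CamelCol", PLAINCOL FROM t;

-- SELECT "camelcol" FROM t;   -- would fail: quoted names are never folded
```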
Re: [HACKERS] Foreign keys
On Sun, 10 Sep 2006, Gregory Stark wrote: Chris Mair [EMAIL PROTECTED] writes: What's the purpose of letting you insert 1000 records, then, at the end say: hah, all is rolled back because the 2nd record was invalid. PG justly throws the exception immediately to let you know it's futile inserting 998 more records. Well there's plenty of cases where people want that and we support it with deferred constraints. However the OP sounds like he wants something else. I think what he wants is when he inserts a record and it fails due to foreign key constraints to report all the violated constraints, not just the first one found. I never run into this problem myself because I think of foreign key constraints as more akin to C assertions. They're a backstop to make sure the application is working correctly. I never write code that expects foreign key constraint errors and tries to handle them. But there's nothing saying that's the only approach. The feature request seems pretty reasonable to me. I'm not sure how hard it would be with the ri triggers as written. I'm not sure there's anywhere for triggers to store their return values so I'm unclear this can even be done using triggers. I think if we were going to do this that all the constraint violations for unique, not null, check and foreign keys should be handled similarly, so we'd probably want something more general than just a way for the ri triggers to do this. I don't have a good idea of the right solution for that though.
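The deferred-constraint behavior mentioned above looks like this in practice: the FK check waits until commit, and the transaction still aborts on a single reported violation:

```sql
CREATE TABLE parent (i int PRIMARY KEY);
CREATE TABLE child  (i int REFERENCES parent
                           DEFERRABLE INITIALLY DEFERRED);

BEGIN;
INSERT INTO child VALUES (1), (2);  -- accepted: the FK check is deferred
COMMIT;  -- fails here; both rows violate, but only one violation is reported
```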
Re: [HACKERS] Foreign keys
On Sun, 10 Sep 2006, Gregory Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Stephan Szabo [EMAIL PROTECTED] writes: I think if we were going to do this that all the constraint violations for unique, not null, check and foreign keys should be handled similarly, so we'd probably want something more general than just a way for the ri triggers to do this. I don't have a good idea of the right solution for that though. It seems pretty unwieldy to me: it's not hard to imagine a long INSERT throwing millions of separate foreign-key errors before it's done, for instance. And then there's the cascading-errors problem, ie, bogus reports that happen because some prior step failed ... not least being your client crashing and failing to tell you anything about what happened because it ran out of memory for the error list. My advice is to rethink the client code that wants such a behavior. Well you're still talking about the case of multiple queries deferring all constraint checks to the end of the transaction. Well, or insert ... select or update or delete. Most deferred conditions can happen within one statement as well. In any case the same logic that leads to it being desirable to report all the errors to the user in a UI and not just report them one by one also applies to the database. I'm not sure it's the most important issue in the world, but it does seem like it would be a nice feature if it reported all the errors in the statement, not just the first one it finds. SQL seems to have a notion of setting the size of the diagnostics area for a transaction to hold a number of conditions. There are a few odd bits, for example it's mostly unordered, but the sqlcode returned must match the first condition and we presumably want to make sure that if there are any errors that we return an exception sql code not a completion one.
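For contrast with the multi-condition diagnostics area the spec sketches, here is what a client can get out of PostgreSQL today: exactly one condition per error. (DO blocks and the foreign_key_violation condition name are from later releases, so this illustrates current behavior, not anything in the thread's timeframe.)

```sql
CREATE TABLE parent (i int PRIMARY KEY);
CREATE TABLE child  (i int REFERENCES parent);

DO $$
BEGIN
    -- Violates the FK; only this single condition is surfaced.
    INSERT INTO child VALUES (42);
EXCEPTION WHEN foreign_key_violation THEN
    RAISE NOTICE 'caught: %', SQLERRM;
END;
$$;
```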
Re: [HACKERS] Proposal for GUID datatype
On Sat, 9 Sep 2006, Jan de Visser wrote: On Saturday 09 September 2006 01:33, [EMAIL PROTECTED] wrote: I don't think so. If it isn't 128 bits - and you want to fit it into 128 bits, it means padding. Where should the padding go? As application specific, it is up to the application to convert. I am not saying that. I am just saying that you shouldn't limit yourself to any particular input formats. I'd wonder if it'd be better to have a set of literal formats and input functions like to_guid(text, text) for more complicated cases. The broader we make the literal format, the harder it is to determine if the input actually is what was intended. For example, did the user mean to put that ipv6 address in this guid column and what about this other ipv6 address looking thing which is abbreviated, are we putting in what the user expects?
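As a historical footnote, the uuid type that later shipped in 8.3 took exactly this narrow-literal approach: a small fixed set of input forms, with anything else rejected outright rather than coerced:

```sql
-- All of these are accepted as uuid input:
SELECT 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'::uuid;   -- standard form
SELECT '{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}'::uuid; -- braces
SELECT 'a0eebc999c0b4ef8bb6d6bb9bd380a11'::uuid;       -- hyphens omitted

-- An abbreviated ipv6-ish string such as '::1' is simply an error,
-- never a guess.
```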
[HACKERS] Foreign key column reference ordering and information_schema
Now that I've got a little time again... Per the report from Clark C Evans a while back and associated discussion, it seems like recent versions of the SQL spec changed the rules for foreign key column references such that the columns of the referenced unique constraint must be named in order (this is somewhat silly since unique(a,b) really should imply unique(b,a) but...). The information_schema definition seems to require this in order for one to use the information to find out the column references. I don't think we can switch to the newer definition directly since that will break dumps, but we could provide a variable which controls whether we allow the set allowed by SQL92 (not necessarily ordered) with the default being true for compatibility. But, that still doesn't give us a path to being able to change the default, or for that matter making it safe to use information_schema (since it would still be possible to change the value, make a constraint and change it back). One way to do this would be to accept the SQL92 form and treat it as if the command had given it in the ordered form; in other words, given

create table pk(a int, b int, unique(a,b));
create table fk(c int, d int, foreign key(d,c) references pk(b,a));

the constraint is stored as if it were given foreign key(c,d) references pk(a,b). Does anyone have objections to either or both parts of this, and for the first, does anyone have a good idea of a name for the variable that would control this?
Re: [HACKERS] Foreign key column reference ordering and information_schema
On Wed, 17 May 2006, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Per the report from Clark C Evans a while back and associated discussion, it seems like recent versions of the SQL spec changed the rules for foreign key column references such that the columns of the referenced unique constraint must be named in order (this is somewhat silly since unique(a,b) really should imply unique(b,a) but...). I do not believe that that reading is correct. If the SQL committee had intended such a change, it would surely have been called out as a compatibility issue in Annex E of SQL2003. Which it isn't. where SQL2003 has If the <referenced table and columns> specifies a <reference column list>, then there shall be a one-to-one correspondence between the set of column names contained in that <reference column list> and the set of column names contained in the <unique column list> of a unique constraint of the referenced table such that corresponding column names are equivalent. Let referenced columns be the column or columns identified by that <reference column list> and let referenced column be one such column. Each referenced column shall identify a column of the referenced table and the same column shall not be identified more than once. I think SQL2003 is actually just trying to say the same thing in more precise language: you have to be able to match up the columns in the reference list with some unique constraint. I don't think the one to one bit is meant to imply a left-to-right-ordered correspondence; that's certainly not the mathematical meaning of a one-to-one function for instance. No, but the part which says corresponding column names are equivalent seems to imply it to me.
Re: [HACKERS] Foreign key column reference ordering and information_schema
ergh, hit send before finishing

On Wed, 17 May 2006, Stephan Szabo wrote: On Wed, 17 May 2006, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Per the report from Clark C Evans a while back and associated discussion, it seems like recent versions of the SQL spec changed the rules for foreign key column references such that the columns of the referenced unique constraint must be named in order (this is somewhat silly since unique(a,b) really should imply unique(b,a) but...). I do not believe that that reading is correct. If the SQL committee had intended such a change, it would surely have been called out as a compatibility issue in Annex E of SQL2003. Which it isn't. where SQL2003 has If the <referenced table and columns> specifies a <reference column list>, then there shall be a one-to-one correspondence between the set of column names contained in that <reference column list> and the set of column names contained in the <unique column list> of a unique constraint of the referenced table such that corresponding column names are equivalent. Let referenced columns be the column or columns identified by that <reference column list> and let referenced column be one such column. Each referenced column shall identify a column of the referenced table and the same column shall not be identified more than once. I think SQL2003 is actually just trying to say the same thing in more precise language: you have to be able to match up the columns in the reference list with some unique constraint. I don't think the one to one bit is meant to imply a left-to-right-ordered correspondence; that's certainly not the mathematical meaning of a one-to-one function for instance. No, but the part which says corresponding column names are equivalent seems to imply it to me. Or are you thinking that the corresponding column names are equivalent is just a description of how to make the correspondence? That seems like a very odd way to phrase that since just saying that the sets of column names are equivalent would be enough for that and all the extra words seem to only obscure the point.
Re: [HACKERS] Foreign key column reference ordering and information_schema
On Wed, 17 May 2006, Tom Lane wrote: I'm more inclined to think that we've messed up the information_schema somehow ... As usual, you're right. ;) Actually, it wasn't precisely that we messed it up as much as the '99 definition was wrong. It's pointed out in the 2003 schemata incompatibilities annex as having been incomplete and changed for 2003.
Re: [HACKERS] constraints and sql92 information_schema compliance
On Wed, 15 Mar 2006, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: The main options seem to be: When we're allowing other order access, immediately reorder the constraint information to match the primary key order. This helps out with IS since the loaded constraint should display properly, but theoretically could change the visual representation after load for people who don't care about this option. Change the representation unconditionally on dump. Basically reorder the constraint at dump time to always generate a dump in SQL03 order. This has the same downside as the above except only after another dump/restore. Change the representation on dump only if the flag is set (probably exporting this as an option to pg_dump as well). This could be a little more difficult to use, but pretty much causes the user to drive the choice. I'm missing something. On what basis do you claim that there's a SQL03 order, ie some ordering mandated by the spec? What problem is this really solving? SQL2003 seems to change the relevant piece to: If the <referenced table and columns> specifies a <reference column list>, then there shall be a one-to-one correspondence between the set of column names contained in that <reference column list> and the set of column names contained in the <unique column list> of a unique constraint of the referenced table such that corresponding column names are equivalent. Let referenced columns be the column or columns identified by that <reference column list> and let referenced column be one such column. Each referenced column shall identify a column of the referenced table and the same column shall not be identified more than once. I read the section on corresponding column names are equivalent to imply that (b,a) and (a,b) aren't equivalent for this purpose because the corresponding column names are different. That's a difference from SQL92's version, which basically just says the sets are the same.
Basically, it's a compliance point, and it's necessary to make information_schema usable for foreign key constraints: without the ordering rule there is no way to read how the columns line up, because the schema didn't put an ordering column on the side storing the referenced keys.
Re: [HACKERS] constraints and sql92 information_schema compliance
[Resurrecting an old thread] On Sat, 25 Feb 2006, Clark C. Evans wrote: On Sat, Feb 25, 2006 at 12:51:51PM -0800, Stephan Szabo wrote: | * for foreign-key and check constraints, the default names | are $1, $2, etc.; it would be great if they were upgraded | to use the default names given by primary and unique key | constraints: table_uk_1stcol, table_pk | | Err... what version are you using? I get constraint names like tt_a_fkey | from devel, and I thought at least 8.1 does the same. 7.4.8, so it's a bit old -- glad to hear this made it! | * when creating a foreign key constraint on two columns, say | from A (x, y) to B (x, y), if the unique index on B is (x,y) | you can make a foreign key from A-B using (y,x) | | I don't understand which particular case you're complaining about, but as | far as I can see, we have to allow that case by the rest of the spec. To be clear, I'm talking about... CREATE TABLE x (y text, z text, PRIMARY KEY(y,z)); CREATE TABLE a (b text, c text); ALTER TABLE a ADD FOREIGN KEY (b, c) REFERENCES x(z, y); For this case, the information schema details: 1. the foreign key constraint as a reference to the primary key constraint and lists the tuple (b,c) 2. the primary key constraint lists the keys (y,z) In particular, the column ordering (z, y) in the reference clause is *lost*. Hence, if you were to blindly reconstruct a join criteria from the information schema, you'd wrongly assume that the useful join criteria is: ON (a.b == x.y AND a.c == x.z) when the correct join criteria should be: ON (a.b == x.z AND a.c == x.y) I assert the problem here is that the FOREIGN KEY constraint construction should have *failed* since the *tuple* (z,y) does not in fact match any unique key in table x. Looking at this more, I'm not sure that making it match the unique key exactly helps information_schema.constraint_column_usage at least.
Given the following:

create table ta(a int, b int, primary key(a,b));
create table tb(a int, b int, foreign key (a,b) references ta);
create table tc(a int, b int, foreign key (b,a) references ta);

I don't see how you can differentiate the foreign keys in the last two without a position column, which doesn't seem to be in at least our current view (although I haven't checked 2003 to see if they changed it). Both of those should be valid, although the second is weird.
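The ambiguity is visible directly in the view: for both of the FK constraints above, constraint_column_usage lists the same unordered set of referenced columns, so nothing distinguishes (a,b) from (b,a):

```sql
-- Referenced-side columns for the two constraints; the rows carry no
-- position information, so both constraints produce identical output.
SELECT constraint_name, table_name, column_name
FROM information_schema.constraint_column_usage
WHERE table_name = 'ta'
ORDER BY constraint_name, column_name;
```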
Re: [HACKERS] constraints and sql92 information_schema compliance
On Wed, 15 Mar 2006, Clark C. Evans wrote: On Tue, Mar 14, 2006 at 08:14:12PM -0800, Stephan Szabo wrote: | CREATE TABLE x (y text, z text, PRIMARY KEY(y,z)); | CREATE TABLE a (b text, c text); | ALTER TABLE a ADD FOREIGN KEY (b, c) REFERENCES x(z, y); ... | I assert the problem here is that the FOREIGN KEY constraint | construction should have *failed* since the *tuple* (z,y) | does not in fact match any unique key in table x. | | Looking at this more, I'm not sure that making it match the unique key | exactly helps information_schema.constraint_column_usage at least. My problem is that the column order can be provided in the reference clause in a way that does *not* match a candidate key: in the example above, there isn't a primary key nor a unique key index on (z,y). I think that's only true if z1=z2 and y1=y2 could have a different answer than y1=y2 and z1=z2. Otherwise, you're effectively defining both uniquenesses in a single constraint. | I don't see how you can differentiate the foreign keys in the last two | without a position column, which doesn't seem to be in at least our | current view (although I haven't checked 2003 to see if they changed it). | Both of those should be valid, although the second is weird. There isn't a problem with the examples you provided, although the resulting join isn't what the user intended.

Actually, it's precisely what I intended. In the second case the labels happen to represent the other value.

I think the ability to omit the reference column list is a bad idea; but alas, it is quite different from the problem I'm reporting.

The point is that because rows in a table don't have order (unless information_schema has special rules) the two constraints above seem to look the same to me in their representation in information_schema.constraint_column_usage. If that's true then forcing the referenced columns to match exactly doesn't actually fix the problem with the representation in information schema. The same ambiguity exists.
We need to offer this for spec compliance reasons, but I don't think it actually fixes the problem you would have with information_schema.
Re: [HACKERS] constraints and sql92 information_schema compliance
On Tue, 14 Mar 2006, Stephan Szabo wrote: We need to offer this for spec compliance reasons, but I don't think it actually fixes the problem you would have with information_schema. Which of course is wrong, as I figured out when the discussion came up the first time and forgot when I came to it while trying to work something out while doing it. :( Not enough sleep apparently.
Re: [HACKERS] constraints and sql92 information_schema compliance
On Wed, 15 Mar 2006, Clark C. Evans wrote: On Tue, Mar 14, 2006 at 10:01:16PM -0800, Stephan Szabo wrote: | The point is that because rows in a table don't have order (unless | information_schema has special rules) the two constraints above seem to | look the same to me in their representation in | information_schema.constraint_column_usage. If that's true then forcing | the referenced columns to match exactly doesn't actually fix the problem | with the representation in infomration schema. The same ambiguity exists. Actually, there is no ambiguity; it's just that constraint_column_usage view is absolutely useless. You want to be using key_column_usage. Yeah, I remembered afterwards, so I had to send that later message. This came up because I was trying to figure out at what point (if ever) reordering should happen when the variable is set to allow references to the unique constraint in other orders (see below), and while looking at that, I poked at information_schema to see if I could come up with a good reason to do one way or another and queried the wrong one and then worried that I wasn't going to actually be solving the fundamental problem. --- The main options seem to be: When we're allowing other order access, immediately reorder the constraint information to match the primary key order. This helps out with IS since the loaded constraint should display properly, but theoretically could change the visual representation after load for people who don't care about this option. Change the representation unconditionally on dump. Basically reorder the constraint at dump time to always generate a dump in SQL03 order. This has the same downside as the above except only after another dump/restore. Change the representation on dump only if the flag is set (probably exporting this as an option to pg_dump as well). This could be a little more difficult to use, but pretty much causes the user to drive the choice. 
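key_column_usage is indeed the view that carries ordering: ordinal_position gives a column's place within its own key, and the SQL:2003 version of the view adds position_in_unique_constraint to line an FK column up against the referenced key, which is exactly the information constraint_column_usage loses. A sketch (assuming a version of the view that includes the 2003 column):

```sql
SELECT constraint_name,
       column_name,
       ordinal_position,                -- position within the FK itself
       position_in_unique_constraint    -- matching position in the referenced key
FROM information_schema.key_column_usage
ORDER BY constraint_name, ordinal_position;
```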
Re: [HACKERS] constraints and sql92 information_schema compliance
On Wed, 15 Mar 2006, Clark C. Evans wrote: On Tue, Mar 14, 2006 at 11:11:29PM -0800, Stephan Szabo wrote: | When we're allowing other order access, immediately reorder the | constraint information to match the primary key order. Let me try to parrot. In PostgreSQL, the pairing information between the foreign-key and unique-key constraint is available; even though it isn't represented in the information_schema. Hence, this option re-orders the foreign-key columns to match the corresponding candidate key constraint (unique _or_ foreign key). If so, I like it. Right, at create time (a,b) references t(d,c) where the key is actually t(c,d) would get treated as if the user had actually written (b,a) references t(c,d) if it's set up to accept that at all. | This helps out | with IS since the loaded constraint should display properly, but | theoretically could change the visual representation after load for people | who don't care about this option. I doubt that the actual ordering of the columns in the foreign key constraint matters to people; so I don't see a downside with this option other than the perhaps unexpected difference. The main case I could see is if an app thinks it knows what the key should look like (and looks at the catalogs or psql output or pg_dump output or the output of a function that gives back the key information in an api potentially) and now sees the key disappear and/or a new key appear after the upgrade. This option seems like the best apart from that one sticking point. | Change the representation unconditionally on dump. Basically reorder the | constraint at dump time to always generate a dump in SQL03 order. This has | the same downside as the above except only after another dump/restore. You could do both? Well, if you do the first, you're effectively doing this one as well, since it'll always dump in SQL03 order from that point forward.
It looks like we can't really meaningfully change the behavior against old versions, so this would only affect dumps of 8.2 servers or later in any case. 
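The create-time reordering described above can be sketched as follows; the table and column names here are hypothetical, and the rewritten form is the proposal under discussion, not current behavior:

```sql
-- Hypothetical tables for illustration.
CREATE TABLE t (c int, d int, PRIMARY KEY (c, d));
CREATE TABLE s (a int, b int);

-- Written against the key in the "wrong" order...
ALTER TABLE s ADD FOREIGN KEY (a, b) REFERENCES t (d, c);

-- ...the proposal would store (and later dump) the constraint as though
-- the user had written the columns in the primary key's declared order:
-- ALTER TABLE s ADD FOREIGN KEY (b, a) REFERENCES t (c, d);
```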
Re: [HACKERS] Proposal for SYNONYMS
On Thu, 9 Mar 2006, Josh Berkus wrote: Jonah, This email is a preliminary design for the implementation of synonyms in PostgreSQL. Comments and suggestions are welcomed. 1) Is there a SQL standard for this? 2) For my comprehension, what's the difference between a SYNONYM and a single-object (possibly updatable) view? I think with the plan as described, the permissions handling is slightly different from how we handle views. As I understood the synonym plan, a person with select on the synonym but not on the referenced table wouldn't be able to select through the synonym, while if the view was created by someone with select a person with select on the view could select through the view. 
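A sketch of the permissions difference being described; CREATE SYNONYM is the proposed (not existing) command, and the role and object names are made up:

```sql
CREATE TABLE secrets (x int);

-- A view created by the table's owner: a user granted SELECT on the
-- view can read through it without any privilege on "secrets".
CREATE VIEW secrets_view AS SELECT x FROM secrets;
GRANT SELECT ON secrets_view TO alice;

-- Under the synonym proposal as understood in the thread, the synonym
-- would not confer access: alice would still need SELECT on "secrets".
-- CREATE SYNONYM secrets_syn FOR secrets;   -- proposed syntax only
-- GRANT SELECT ON secrets_syn TO alice;     -- insufficient by itself
```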
Re: [HACKERS] Proposal for SYNONYMS
On Thu, 9 Mar 2006, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: On Thu, 9 Mar 2006, Josh Berkus wrote: 2) For my comprehension, what's the difference between a SYNONYM and a single-object (possibly updatable) view? I think with the plan as described, the permissions handling is slightly different from how we handle views. As I understood the synonym plan, a person with select on the synonym but not on the referenced table wouldn't be able to select through the synonym, while if the view was created by someone with select a person with select on the view could select through the view. I was under the impression that privileges on the synonym wouldn't mean anything at all, with the exception that we'd track its ownership to determine who is allowed to drop the synonym. The point about views is a good one. I don't buy the argument that we should do synonyms instead of updatable views because it's easier. We *will* do updatable views at some point because (a) the spec requires it and (b) it's clearly useful. I'm not eager to be stuck with synonyms forever because somebody thought they could implement one and not the other. Well, the permissions handling would still be different between a view and a synonym AFAICS even if we dropped separate permissions on synonyms, so I don't think they're drop in replacements for each other even after updatable views. 
Re: [HACKERS] [PATCHES] Inherited Constraints
On Wed, 8 Mar 2006, Hannu Krosing wrote: On Mon, 2006-03-06 at 17:25, Bruce Momjian wrote: Hannu Krosing wrote: On Mon, 2006-03-06 at 12:12, Bruce Momjian wrote: Added to TODO: o Prevent parent tables from altering or dropping constraints like CHECK that are inherited by child tables Dropping constraints should only be possible with CASCADE. and we already have this in TODO: o %Prevent child tables from altering or dropping constraints like CHECK that were inherited from the parent table so I think we now have all the failure cases documented. If you want to be consistent, then ALTER TABLE ONLY ADD CONSTRAINT .. should also be forbidden, so you can't create non-inherited constraints I don't have a problem with creating ONLY constraints on parents and children. We just don't want them to be removed/modified if they are shared. Well, when you delete a constraint from a child, the constraint becomes an ONLY constraint on the parent. Only if there's a single child, otherwise you have a partially-ONLY constraint unless you made it ONLY constraints on the parent and all other children (but then removing the parent constraint wouldn't remove it from the other children presumably). If you allow ONLY constraints on parents, then why disallow dropping them from children? I agree with this in any case. I think both are fairly broken. 
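The "partially-ONLY" situation in the exchange above can be sketched like this (hypothetical tables; whether each step should be allowed is exactly what the thread is debating):

```sql
CREATE TABLE parent (i int, CONSTRAINT pos CHECK (i > 0));
CREATE TABLE child1 () INHERITS (parent);  -- inherits the CHECK
CREATE TABLE child2 () INHERITS (parent);  -- inherits the CHECK

-- An ONLY constraint: applies to parent rows but not to its children.
ALTER TABLE ONLY parent ADD CONSTRAINT small CHECK (i < 100);

-- If child1 could drop its inherited copy of "pos", the constraint
-- would remain on parent and child2 but not child1 -- the
-- "partially-ONLY" state that both sides of the thread call broken.
```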
Re: [HACKERS] Foreign keys for non-default datatypes
On Fri, 3 Mar 2006, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: On Thu, 2 Mar 2006, Tom Lane wrote: 1. If the index opclass contains an exact operator for the case PKtype = FKtype, use that operator. Is this rule to read explicitly naming '=' or just the item in that position in the opclass? The operator occupying the equality position in the opclass. Okay. I think it's an acceptable idea to fail if we're going to extend the cross-type indexing support, but AFAICS we have to at the very least allow all of the standard numeric types in all combinations to work to meet the spec, and I don't think the above rules and current opclasses will give that to us (and I don't honestly understand some of the bits of this to know if there's a problem with extending the opclasses to allow that). The cases that are likely to be problematic are things like a FLOAT8 column referencing a NUMERIC primary key. However, that sort of mishmash is fraught with all kinds of risks anyway (think roundoff error) so the fact that the spec nominally allows it doesn't tell me that we ought to encourage it. There's a bit of difference between not encouraging it and disallowing it entirely, but I'm willing to buy that argument. I do think that a numeric column referencing an int needs to be allowed though, and I thought that's also currently not there (although an int column referencing a numeric should work, I think). 
Re: [HACKERS] Foreign keys for non-default datatypes
On Fri, 3 Mar 2006, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: There's a bit of difference between not encouraging it and disallowing it entirely, but I'm willing to buy that argument. I do think that a numeric column referencing an int needs to be allowed though, and I thought that's also currently not there (although an int column referencing a numeric should work, I think). Um, not sure which way you mean there. The case that would work in the proposal as I gave it is where the referencing column's type is implicitly promotable to the referenced column's type. So for example an FK int column could reference a PK numeric column but (without more btree support) not an FK numeric column referencing a PK int. Is that what you meant? Yes. If you try numeric-references-int today, you get the costly sequential scan warning, because the selected = operator is numeric_eq and that's not compatible with the PK's index. Basically, if any implicit promotion happens on the PK side, you're going to get the warning because you're no longer using an = operator that matches the index. We have not seen many complaints about getting that warning since 8.0, so I think that in practice people aren't using these cases and it'd be OK to make them a hard error instead. I would also argue that if implicit promotion does happen on the PK side, it's very questionable what semantics the FK constraint has anyway --- you can no longer be sure that the operator you are using has a notion of equality that's compatible with the PK index's notion. I'd argue that this case makes sense in a purely theoretical sense, numeric(8,0) references int is a sensible operation (when you don't take into account specific PostgreSQL limitations) and it's required by spec, but I agree that promotion on the PK side is problematic because of the issues about equality so we may not be able to do better. I'm just worried that we're moving further from compliance with the spec. 
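The promotable and non-promotable cases in this exchange, sketched with hypothetical tables:

```sql
-- FK type implicitly promotable to PK type: allowed under the proposal.
CREATE TABLE pk_numeric (id numeric PRIMARY KEY);
CREATE TABLE fk_int (id int REFERENCES pk_numeric);  -- int -> numeric

-- Promotion needed on the PK side: historically drew the "costly
-- sequential scans" warning, and under the proposal would fail outright
-- unless btree cross-type support is extended.
CREATE TABLE pk_int (id int PRIMARY KEY);
CREATE TABLE fk_numeric (id numeric REFERENCES pk_int);  -- numeric -> int
```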
Re: [HACKERS] Foreign keys for non-default datatypes
On Fri, 3 Mar 2006, Tom Lane wrote: The reason I'm hesitant to add a bunch more cross-type operators is mainly that we have too darn many operators named = already. I've seen in recent profiling tests that it's taking the parser a noticeable amount of time to decide which one is meant. So I don't want to add a lot more without a fairly credible use-case, and right now this doesn't look like one to me. Honestly, as long as we *could* reasonably add one, I don't have a complaint, because we aren't taking a step that involves a huge amount of work if we get a complaint or separately decide to be more compliant on this. 
Re: [HACKERS] Foreign keys for non-default datatypes
On Fri, 3 Mar 2006, Tom Lane wrote: BTW, I had another thought about this: if we go this way, then the plans associated with RI check queries would essentially always be trivial index lookups (for everything except RI_Initial_Check). It'd be within the realm of feasibility to throw away the current cached-plan RI infrastructure and simply do direct indexscans, in the style that we currently use for most system-catalog accesses. Would we have to do anything odd if we want to be testing only some of the index columns and possibly not in the index order (like match partial where some of the fk side is null)? I don't honestly see us doing match partial any time soon, but I'd like to have an idea of what'd be involved. If we did this then RI checks would no longer be subvertible by rules or user triggers. I don't think that it'd really help because it's the actions that are generally subvertible not the checks and since those are looking at the potentially not indexed fk side, I don't think the above would apply. 
Re: [HACKERS] Foreign keys for non-default datatypes
On Thu, 2 Mar 2006, Tom Lane wrote: [ returning to a week-old thread... ] Stephan Szabo [EMAIL PROTECTED] writes: On Thu, 23 Feb 2006, Tom Lane wrote: Any thoughts about details? My feeling is that we should tie RI semantics to btree opclasses, same as we have done for ORDER BY and some other SQL constructs, but I don't have a concrete proposal right offhand. The btree idea may not cover cross-type FKs anyway. ISTM that the btree opclass is too restrictive right now since I'm guessing we'd want to allow say int4 - numeric which I don't think is in either btree opclass, but I don't know if they're not there because it wasn't worth putting in or if there's a more fundamental reason. I thought a bit more about this and realized that there's really a fairly natural way to associate an FK constraint with a btree index opclass. To wit, we already identify a unique index that the FK constraint depends on to enforce uniqueness of the PK column --- and in the current system, only btree indexes can be unique. So we can just use the opclass(es) of that index. (If we ever add uniqueness support to GiST, it would not be unreasonable to expect that the opclasses that support uniqueness identify exactly which operator they think defines equality, so we could still make it work for that.) To handle the cross-type cases, I propose that we make two checks: 1. If the index opclass contains an exact operator for the case PKtype = FKtype, use that operator. Is this rule to read explicitly naming '=' or just the item in that position in the opclass? 2. Otherwise, if there's an implicit coercion from the FK column type to the PK column type, apply that coercion and use the opclass's native equality operator. If neither of these work, I think it'd be OK to fail (ie, reject creation of the FK constraint). This would have the interesting property that the current warning condition FK constraint will require costly sequential scans would become an error case. 
I don't really have a problem with that --- if there are any practically-useful cases where people still get that warning, it means there are gaps we ought to fill in the btree cross-type indexing support, not that we ought to go out of our way to continue supporting a very inefficient mode of FK operation. In any case, the current code is willing to try to enforce FKs that use an equality operator that we have no hard evidence works like equality at all, and that surely isn't a good idea. I think it's an acceptable idea to fail if we're going to extend the cross-type indexing support, but AFAICS we have to at the very least allow all of the standard numeric types in all combinations to work to meet the spec, and I don't think the above rules and current opclasses will give that to us (and I don't honestly understand some of the bits of this to know if there's a problem with extending the opclasses to allow that). 
Re: [HACKERS] [SQL] Interval subtracting
On Wed, 1 Mar 2006, Hannu Krosing wrote: On Wed, 2006-03-01 at 15:18, Bruce Momjian wrote: Stephan Szabo wrote: justify_days doesn't currently do anything with this result --- it thinks its charter is only to reduce day components that are >= 30 days. However, I think a good case could be made that it should normalize negative days too; that is, the invariant on its result should be 0 <= days < 30, not merely days < 30. What about cases like interval '1 month -99 days', should that turn into interval '-3 mons +21 days' or '-2 mons -9 days'? I think it should be the latter. It is best to have a single sign, and I think it is possible in all cases: '2 mons -1 days' could be adjusted to '1 mons 29 days'. But unfortunately '2 mons -1 days' != '1 mons 29 days'. If I want something to happen 1 day less than two months from some date, then the only way to say that consistently *is* '2 mons -1 days'. Right, but would you call justify_days on such an interval? '2 months -1 days' != '1 mon 29 days', but '1 mon 60 days' is also != '3 mons' in general usage. 
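The normalization question can be restated as a pair of queries (the second result is the open question in the thread, not settled behavior):

```sql
-- Reducing day components of 30 or more is the uncontroversial part:
SELECT justify_days(interval '35 days');         -- e.g. '1 mon 5 days'

-- Negative days are the debated case: should this yield
-- '-3 mons +21 days' (carry toward non-negative days) or
-- '-2 mons -9 days' (a single sign throughout)?
SELECT justify_days(interval '1 mon -99 days');
```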
Re: [HACKERS] constraints and sql92 information_schema compliance
On Sat, 25 Feb 2006, Clark C. Evans wrote: On Sat, Feb 25, 2006 at 11:51:55AM -0800, Josh Berkus wrote: | This has been discussed previously in a couple of threads. I believe the | desire is to make it work as specified in SQL-2003, but I do not remember | whether or not anyone volunteered to do the work to make it happen. | | I believe that the newsysviews follow the SQL03 permissions structure. Fantastic! The SQL92 permission structure was braindead. After some time working with the information schema, I have three suggestions: * for foreign-key and check constraints, the default names are $1, $2, etc.; it would be great if they were upgraded to use the default names given by primary and unique key constraints: table_uk_1stcol, table_pk Err... what version are you using? I get constraint names like tt_a_fkey from devel, and I thought at least 8.1 does the same. * when creating a foreign key constraint on two columns, say from A (x, y) to B (x, y), if the unique index on B is (x,y) you can make a foreign key from A->B using (y,x) I don't understand which particular case you're complaining about, but as far as I can see, we have to allow that case by the rest of the spec. If A(x) is comparable to B(x) and B(y) and A(y) is comparable to B(x) and B(y), all of A(x,y)->B(x,y), A(y,x)->B(x,y), A(x,y)->B(y,x) and A(y,x)->B(y,x) seem to be allowed by the definition in the constraint section (as only the sets must be equal, with no mention of ordering). 
Re: [HACKERS] constraints and sql92 information_schema compliance
On Sat, 25 Feb 2006, Stephan Szabo wrote: On Sat, 25 Feb 2006, Clark C. Evans wrote: On Sat, Feb 25, 2006 at 11:51:55AM -0800, Josh Berkus wrote: | This has been discussed previously in a couple of threads. I believe the | desire is to make it work as specified in SQL-2003, but I do not remember | whether or not anyone volunteered to do the work to make it happen. | | I believe that the newsysviews follow the SQL03 permissions structure. Fantastic! The SQL92 permission structure was braindead. After some time working with the information schema, I have three suggestions: * for foreign-key and check constraints, the default names are $1, $2, etc.; it would be great if they were upgraded to use the default names given by primary and unique key constraints: table_uk_1stcol, table_pk Err... what version are you using? I get constraint names like tt_a_fkey from devel, and I thought at least 8.1 does the same. * when creating a foreign key constraint on two columns, say from A (x, y) to B (x, y), if the unique index on B is (x,y) you can make a foreign key from A->B using (y,x) I don't understand which particular case you're complaining about, but as far as I can see, we have to allow that case by the rest of the spec. If A(x) is comparable to B(x) and B(y) and A(y) is comparable to B(x) and B(y), all of A(x,y)->B(x,y), A(y,x)->B(x,y), A(x,y)->B(y,x) and A(y,x)->B(y,x) seem to be allowed by the definition in the constraint section (as only the sets must be equal, with no mention of ordering). The sets in this case being the referenced columns and the unique columns in the unique constraint. 
Re: [HACKERS] constraints and sql92 information_schema compliance
On Sat, 25 Feb 2006, Clark C. Evans wrote: On Sat, Feb 25, 2006 at 12:51:51PM -0800, Stephan Szabo wrote: | * for foreign-key and check constraints, the default names | are $1, $2, etc.; it would be great if they were upgraded | to use the default names given by primary and unique key | constraints: table_uk_1stcol, table_pk | | Err... what version are you using? I get constraint names like tt_a_fkey | from devel, and I thought at least 8.1 does the same. 7.4.8, so it's a bit old -- glad to hear this made it! | * when creating a foreign key constraint on two columns, say | from A (x, y) to B (x, y), if the unique index on B is (x,y) | you can make a foreign key from A->B using (y,x) | | I don't understand which particular case you're complaining about, but as | far as I can see, we have to allow that case by the rest of the spec. To be clear, I'm talking about... CREATE TABLE x (y text, z text, PRIMARY KEY(y,z)); CREATE TABLE a (b text, c text); ALTER TABLE a ADD FOREIGN KEY (b, c) REFERENCES x(z, y); For this case, the information schema details: 1. the foreign key constraint as a reference to the primary key constraint and lists the tuple (b,c) 2. the primary key constraint lists the keys (y,z) In particular, the column ordering (z, y) in the reference clause is *lost*. Hence, if you were to blindly reconstruct a join criteria from the information schema, you'd wrongly assume that the useful join criteria is: ON (a.b == x.y AND a.c == x.z) when the correct join criteria should be: ON (a.b == x.z AND a.c == x.y) I assert the problem here is that the FOREIGN KEY constraint construction should have *failed* since the *tuple* (z,y) does not in fact match any unique key in table x. I disagree because the spec doesn't say that the columns must be equal or that the list of columns must be equal, but that the set of columns must be equal. And in the definitions section, set is defined as an unordered collection of distinct objects. 
| If A(x) is comparable to B(x) and B(y) and A(y) is comparable to B(x) | and B(y), all of A(x,y)->B(x,y), A(y,x)->B(x,y), A(x,y)->B(y,x) and | A(y,x)->B(y,x) seem to be allowed by the definition in the constraint | section (as only the sets must be equal, with no mention of ordering). Ordering of tuples (fields within a row object) is significant in SQL; and hence the two above are not comparable. You misunderstand what comparable means in the above. Comparable is the constraint on the column types (for example numeric types are comparable to other numeric types). 
Re: [HACKERS] constraints and sql92 information_schema compliance
On Sat, 25 Feb 2006, Stephan Szabo wrote: On Sat, 25 Feb 2006, Clark C. Evans wrote: On Sat, Feb 25, 2006 at 12:51:51PM -0800, Stephan Szabo wrote: | * for foreign-key and check constraints, the default names | are $1, $2, etc.; it would be great if they were upgraded | to use the default names given by primary and unique key | constraints: table_uk_1stcol, table_pk | | Err... what version are you using? I get constraint names like tt_a_fkey | from devel, and I thought at least 8.1 does the same. 7.4.8, so it's a bit old -- glad to hear this made it! | * when creating a foreign key constraint on two columns, say | from A (x, y) to B (x, y), if the unique index on B is (x,y) | you can make a foreign key from A->B using (y,x) | | I don't understand which particular case you're complaining about, but as | far as I can see, we have to allow that case by the rest of the spec. To be clear, I'm talking about... CREATE TABLE x (y text, z text, PRIMARY KEY(y,z)); CREATE TABLE a (b text, c text); ALTER TABLE a ADD FOREIGN KEY (b, c) REFERENCES x(z, y); For this case, the information schema details: 1. the foreign key constraint as a reference to the primary key constraint and lists the tuple (b,c) 2. the primary key constraint lists the keys (y,z) In particular, the column ordering (z, y) in the reference clause is *lost*. Hence, if you were to blindly reconstruct a join criteria from the information schema, you'd wrongly assume that the useful join criteria is: ON (a.b == x.y AND a.c == x.z) when the correct join criteria should be: ON (a.b == x.z AND a.c == x.y) I assert the problem here is that the FOREIGN KEY constraint construction should have *failed* since the *tuple* (z,y) does not in fact match any unique key in table x. I disagree because the spec doesn't say that the columns must be equal or that the list of columns must be equal, but that the set of columns must be equal. And in the definitions section, set is defined as an unordered collection of distinct objects. 
Okay, I'll take that back for SQL2003. They must have realized that this was broken with information schema and changed it. That's an interesting incompatibility with old versions, but it's easy to implement. 
Re: [HACKERS] constraints and sql92 information_schema compliance
On Sun, 26 Feb 2006, Clark C. Evans wrote: Stephen, So, a quick re-cap of the questions/concerns I had: * Making the default constraint names include the table - This was implemented in 8.x, thank you! * Forbidding the creation of a foreign key constraint where the column list for the referenced table doesn't *exactly* match a candidate key on that table. - I think you've agreed to something like this, or am I mis-understanding? Well, SQL03 requires it to match exactly (I haven't checked 99). SQL92 explicitly requires us to support not matching exactly and we can't really remove it for some amount of time due to compatibility. About the best we're likely to be able to do is change pg_dump to dump it in the 03 order and possibly give an optional way to turn on an exact check (default off) for the next version, probably changing the default 1 or 2 versions after that. Personally, I disagree with the 03 requirement and think that it's more an example of them misdesigning the information schema, but we should probably move in that direction for compatibility with more recent versions of the spec. * Issue a warning when creating a constraint whose name is not unique within its (the constraint's) schema. - This request seems to have gotten lost in the vigorous discussion ;) I don't have a problem with it (once, I argued for following the spec constraint on this way back when), however I think this was proposed and rejected before as excess noise. You might want to look back through the archives. 
Re: [HACKERS] constraints and sql92 information_schema compliance
On Sun, 26 Feb 2006, Clark C. Evans wrote: On Sat, Feb 25, 2006 at 04:50:41PM -0500, Rod Taylor wrote: On Sat, Feb 25, 2006 at 02:01:00PM -0800, Stephan Szabo wrote: | On Sat, 25 Feb 2006, Clark C. Evans wrote: | | CREATE TABLE x (y text, z text, PRIMARY KEY(y,z)); | CREATE TABLE a (b text, c text); | ALTER TABLE a ADD FOREIGN KEY (b, c) REFERENCES x(z, y); | | I assert the problem here is that the FOREIGN KEY constraint | construction should have *failed* since the *tuple* (z,y) | does not in fact match any unique key in table x. | | I disagree because the spec doesn't say that the columns must be equal | or that the list of columns must be equal, but that the set of columns must | be equal. And in the definitions section, set is defined as an unordered | collection of distinct objects. Let's use the example Rod gave us above. If the comparison for foreign key constraints should be done as an unordered set, then why does the following fail? | k=# insert into x values ('foo', 'bar'); | INSERT 0 1 | k=# insert into a values ('foo', 'bar'); | ERROR: insert or update on table a violates foreign key constraint | a_b_fkey | DETAIL: Key (b,c)=(foo,bar) is not present in table x. While the SQL1992 specification may be horribly incorrect; the current behavior is not compliant with it... so this isn't a great defense. If PostgreSQL didn't store the order of the columns referenced, it couldn't provide the error above (which makes sense, given the extension). No, because you're misunderstanding what the SQL92 spec says. The spec says that the comparison between the (z,y) in the references and the key definition (y,z) is unordered, not that the comparisons between (b,c) and (z,y) are unordered. 
Re: [HACKERS] Request: set opclass for generated unique and primary
On Thu, 23 Feb 2006, Pavel Stehule wrote: Right, but does the pattern_ops one have to be unique? Sorry, I don't understand Are you trying to guarantee uniqueness on the pattern_ops rules? My understanding is that pattern_ops is not just a special index that allows like comparisons using the index while necessarily giving the same rules throughout; it uses a different set of comparison rules from the default index type. I think this means that you could have a locale in which the insertion of sequence a and then b would fail for a normal unique constraint and succeed for one based on pattern_ops. If the rule you are trying to guarantee is that the two values are unique using the standard definition of unique, the latter index is insufficient for doing so and having a unique index on (foo) and a non-unique index on (foo whatever_pattern_ops) is not duplication. If the rule you are trying to guarantee is that the two values are unique using the rules as defined by pattern ops, then a unique index on (foo whatever_pattern_ops) and an index on (foo) is not duplication, and it's not really a unique constraint by a strict definition, and as such I'm not convinced it should use the unique constraint syntax. If the rule you are trying to guarantee is that it's unique by both rules (ie if either fails it's disallowed), you need both unique indexes in order to guarantee that. I looked at the source code and I propose this syntax: CREATE TABLE name .. colname type PRIMARY KEY | UNIQUE [USING INDEX [TABLESPACE ...] [OPERATOR CLASS opclass] CREATE CONSTRAINT name PRIMARY KEY | UNIQUE '(' colname opt_class, I don't think this needs a big patch. What do you think? I don't think it's that easy. In general I don't think there's a guarantee that the rules the opclass uses are directly comparable to those of the equality operator. As such, the constraint doesn't necessarily have the same guarantees as a standard unique constraint as given by spec. 
That means we need to be careful whenever we're making assumptions about a unique constraint. 
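The "unique under both rules" case at the end of the message above can be sketched directly (hypothetical table; varchar_pattern_ops is the stock opclass):

```sql
CREATE TABLE t (foo varchar);

-- Unique under the default, locale-aware comparison rules:
CREATE UNIQUE INDEX t_foo_uq ON t (foo);

-- Unique under the pattern_ops (bytewise) comparison rules -- a
-- different notion of "same value", so this is not a duplicate index:
CREATE UNIQUE INDEX t_foo_pattern_uq ON t (foo varchar_pattern_ops);
```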
Re: [HACKERS] Foreign keys for non-default datatypes
On Thu, 23 Feb 2006, Tom Lane wrote: I looked into the problem reported here: http://archives.postgresql.org/pgsql-admin/2006-02/msg00261.php To wit, a pg_restore of a foreign key constraint involving user-defined types produces pg_restore: WARNING: foreign key constraint luuid_fkey will require costly sequential scans DETAIL: Key columns luuid and luuid are of different types: public.uniqueidentifier and public.uniqueidentifier. A small variation of the example (I tried it with the contrib isbn type instead, just to see if it was uniqueidentifier's fault) produces complete failure: isbn=# ALTER TABLE ONLY beta ADD CONSTRAINT beta_luuid_fkey FOREIGN KEY (luuid) REFERENCES alpha(luuid); ERROR: foreign key constraint beta_luuid_fkey cannot be implemented DETAIL: Key columns luuid and luuid are of incompatible types: public.isbn and public.isbn. The problem is that pg_dump likes to set a restrictive search path: the above follows isbn# SET search_path = delta, pg_catalog; while the data type and its operators all are defined in the public schema. So when ATAddForeignKeyConstraint checks to see if there's a suitable = operator, it doesn't find the intended operator. In the isbn case it doesn't find anything at all; in the uniqueidentifier case there's an implicit cast to text and so the texteq operator is found, but it's not a member of the index's opclass and thus the warning appears. Even if ATAddForeignKeyConstraint weren't trying to be helpful by checking the operator, we'd be facing the exact same risks at runtime --- the RI triggers blithely assume that foo = bar will do the right thing. This has been a hazard in the RI code since day one, of course, but I think it's time to face up to it and do something about it. The RI code ought not be assuming that = will find an appropriate operator --- it should be doing something based on semantics, not a pure name search, and definitely not a search-path-dependent search. 
This ties into Stephan's nearby concerns about whether unique indexes using nondefault opclasses make sense as infrastructure for foreign key checks. The answer of course is that they make sense if and only if the = operator used for the RI check is a member of the index opclass. Right, when I'd said or we allow you to specify a different equality operator for that case which matches the one in the opclass I sort of was thinking that direction, although I was considering either matching the index one or allowing you to specify the operator if we went that way. Those still bug me a little (especially matching the index one) because writing out the constraint's check by hand by looking at the spec would give different results. Any thoughts about details? My feeling is that we should tie RI semantics to btree opclasses, same as we have done for ORDER BY and some other SQL constructs, but I don't have a concrete proposal right offhand. The btree idea may not cover cross-type FKs anyway. ISTM that the btree opclass is too restrictive right now since I'm guessing we'd want to allow say int4 - numeric which I don't think is in either btree opclass, but I don't know if they're not there because it wasn't worth putting in or if there's a more fundamental reason. 
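The failing restore sequence described in this thread, laid out as a script (object names as in the original report; the isbn type is from contrib):

```sql
-- pg_dump emits a restrictive search path before restoring objects:
SET search_path = delta, pg_catalog;

-- The isbn type and its = operator live in public, which is no longer
-- on the path, so no suitable equality operator is found:
ALTER TABLE ONLY beta
    ADD CONSTRAINT beta_luuid_fkey FOREIGN KEY (luuid) REFERENCES alpha (luuid);
-- ERROR:  foreign key constraint "beta_luuid_fkey" cannot be implemented
-- DETAIL: Key columns "luuid" and "luuid" are of incompatible types:
--         public.isbn and public.isbn.
```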
Re: [HACKERS] pg_config, pg_service.conf, postgresql.conf ....
On Wed, 22 Feb 2006, Mark Woodward wrote: Mark Woodward wrote: I'm not sure that I agree. At least in my experience, I wouldn't have more than one installation of PostgreSQL in a production machine. It is potentially problematic. I agree with you for production environments, but for development, test, support (and pre-sales) machines there are reasonable requirements for several. Oh, sure, for dev, marketing, etc. It doesn't matter. When you have to manage a thousand systems, standards save tons of work. Even if you have only one installation - something to tell you *where* the binaries are installed is convenient - as there are quite a few common locations (e.g. packages installing in /usr or /usr/local, source builds in /usr/local/pgsql or /opt/pgsql). I've seen many *uncommon* variants: (e.g. /usr/local/postgresql, /usr/local/postgresql-version, /usr/local/pgsql/version, ...). Admittedly, given that the binaries are likely to be in the cluster-owner's default PATH, it is not as hard to find them as the data directory. However, this is all about convenience it would seem, since (for many *nix platforms) two simple searches will give you most of what is needed: $ locate postmaster $ locate pg_hba.conf That's not the issue. I find it frustrating sometimes because when I describe one scenario, people debate it using other scenarios. Maybe I lack the communications skills to convey the problem accurately. I don't think it is that. I think it's to some extent that you are starting from a basis that hasn't yet been agreed upon. First, you should show that your scenario is reasonable. I haven't seen a reason to assume that the configuration file will be more up to date than other documentation of the setup. Without that, the theoretical benefit of the configuration is not fully realized, and in fact could be negative (for example, what if in your second scenario it is the important db that's not in the config). 
Second, you should show that it belongs in the main package. I think you could write this without touching the main package. There's then a question of whether having it in the main package has any negative effect on people that aren't using it (which includes opportunity cost of features that might be lost because they don't fit the scenario -- for example, if someone does have multiple versions of postgresql, does this preclude a feature to make their administration better) and a question of whether there are any pieces that must be in the main package. I think this is a reasonable idea because it can help suggest a way of doing this to people that might otherwise be doing it by the seat of their pants, but that's a somewhat different argument. I don't think that I'd trust the configuration file to be correct if I couldn't trust the admin to be doing a good job, and to me that includes at least marginally reasonable documentation. I think that having some system for doing this (whether it's in the main package or not) is better than multiple people writing their own. I am not sure whether having it in the main package doesn't have a small negative effect on people that need for other reasons to do their own thing but I haven't looked at it seriously. But since I don't have time to do it, I'm not going to expect someone else to do it if they disagree. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Request: set opclass for generated unique and primary
On Wed, 22 Feb 2006, Pavel Stehule wrote: Why would you need it? I can't use a unique index for like_op without setting the opclass, because I have to use the Czech locale. I can create a second index, but then I have two equal indexes. Example:

 number | description
--------+-------------
 000102 | blabla bla
 000103 | fooo

number: varchar primary key. Sometimes I need to search all values with one prefix ~ like '0001%'. That's all. USING INDEX [TABLESPACE ..] [OPCLASS ..] This is unworkable --- consider a table with more than one unique constraint and/or multiple-column constraints. I forgot (full syntax is): CREATE TABLE number varchar PRIMARY KEY USING OPCLASS varchar_pattern_ops, My problem with this is that the above isn't a primary key by a strict definition and may not have the normal semantics for primary keys. For example, is said key a valid target for foreign keys? I don't think it necessarily is, unless we can always guarantee that it's unique for normal equality as well or we allow you to specify a different equality operator for that case which matches the one in the opclass. I seem to recall someone proposing extending the syntax of the UNIQUE constraints themselves, but there really isn't enough use-case to justify it AFAICS. Especially not when you can always use CREATE UNIQUE INDEX. I can always use a second unique index. But it's redundant. This problem is related to using a non-C locale. Why do you need both the unique index with varchar_pattern_ops and one with the default ops? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Request: set opclass for generated unique and primary
On Wed, 22 Feb 2006, Pavel Stehule wrote: I seem to recall someone proposing extending the syntax of the UNIQUE constraints themselves, but there really isn't enough use-case to justify it AFAICS. Especially not when you can always use CREATE UNIQUE INDEX. I can always use a second unique index. But it's redundant. This problem is related to using a non-C locale. Why do you need both the unique index with varchar_pattern_ops and one with the default ops? Because the LIKE operator doesn't use an index with the default ops under a non-C locale. I found this on tables of Czech communities. The primary key is NUTS - 4..6 digits. I have to search values with some prefix - the LIKE operator - and on the primary key it can't use the standard index. Right, but does the pattern_ops one have to be unique? And if it does, do you need the normal unique constraint as well? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
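The reason a LIKE prefix search needs byte-wise ordering (which is what varchar_pattern_ops provides) can be sketched in Python. This is an illustrative model, not PostgreSQL internals: a `LIKE '0001%'` query can become a range scan on a sorted index only when the index order compares strings byte by byte; a locale-aware collation (e.g. Czech) does not order strings prefix-compatibly, so the planner cannot use the default-opclass index.

```python
import bisect

def prefix_range_scan(sorted_keys, prefix):
    """Return all keys starting with prefix via a range scan on a
    byte-ordered (C-collation-style) sorted key list."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    # Successor of the prefix under byte ordering: bump the last byte,
    # so the scan covers [prefix, successor).
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect.bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]

keys = sorted(["000102", "000103", "000201", "0001", "0002"])
print(prefix_range_scan(keys, "0001"))  # ['0001', '000102', '000103']
```

This is exactly the transformation PostgreSQL applies for LIKE under the C locale or with a pattern_ops index: the prefix match becomes `>= prefix AND < successor`.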
Re: [HACKERS] pg_config, pg_service.conf, postgresql.conf ....
On Tue, 21 Feb 2006, Mark Woodward wrote: Mark Woodward wrote: The pg_config program needs to display more information, specifically where the location of pg_service.conf would reside. pg_config --sysconfdir Hmm, that doesn't show up with pg_config --help. [EMAIL PROTECTED]:~$ pg_config --sysconfdir pg_config: invalid argument: --sysconfdir Try pg_config --help for more information Also, I know I've been harping on this for years (literally), but since the PostgreSQL programs already have the notion that there is some static directory for which to locate files (pg_service.conf), couldn't we also use this directory to include pg_hba.conf, pg_ident.conf, and perhaps even postgresql.conf? Considering that pg_service.conf is a client configuration file and the others are server configuration files, I don't think there is a causal relationship between putting them one place or another, independent of the individual merit of placing them in a particular spot. I'm not sure that sentence makes sense. Well, here's the thinking involved: pg_service.conf may currently be considered a client side utility, but it need not only be considered that. I think it should. The meaning of what a client side configuration needs and what a server side configuration needs are different, and intermixing them only either leaks information (server side information visible on clients) or involves working around that with multiple configuration files, which pretty much defeats the purpose of sharing the configuration. In addition, the service on the client side is not the same as a cluster on the server side AFAIK (and if I'm right, that's pretty necessary), so trying to equate them seems like a bad idea for things like pg_ctl. I think you'd need a different concept. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [SQL] Interval subtracting
On Sat, 18 Feb 2006, Tom Lane wrote: Milen A. Radev [EMAIL PROTECTED] writes: Milorad Poluga : SELECT '10 years 1 mons 1 days'::interval - '9 years 10 mons 15 days'::interval ?column? --- 3 mons -14 days Why not '2 mons 16 days' ? Please read the last paragraph in section 8.5.1.4 of the manual (http://www.postgresql.org/docs/8.1/static/datatype-datetime.html#AEN4775). It mentions the functions named justify_days and justify_hours that could do what you need. justify_days doesn't currently do anything with this result --- it thinks its charter is only to reduce day components that are >= 30 days. However, I think a good case could be made that it should normalize negative days too; that is, the invariant on its result should be 0 <= days < 30, not merely days < 30. What about cases like interval '1 month -99 days', should that turn into interval '-3 mons +21 days' or '-2 mons -9 days'? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
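The proposed invariant (0 <= days < 30, normalizing negative day counts too) can be sketched in a few lines of Python. This is an illustrative model of the behavior under discussion, not the PostgreSQL justify_days source; note that flooring division answers Tom's closing question in favor of the '-3 mons +21 days' interpretation.

```python
def justify_days(months, days):
    """Fold the day component into months so 0 <= days < 30.
    Python's divmod floors toward negative infinity, which also
    normalizes negative day counts."""
    extra_months, days = divmod(days, 30)
    return months + extra_months, days

# '3 mons -14 days' (the surprising subtraction result above):
print(justify_days(3, -14))  # (2, 16) -> '2 mons 16 days', as hoped
# Tom's edge case, '1 month -99 days':
print(justify_days(1, -99))  # (-3, 21) -> '-3 mons +21 days'
```

The current implementation only reduces day components that are >= 30; the change debated here would apply the same fold to negative values.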
Re: [HACKERS] how is that possible
On Fri, 10 Feb 2006 ohp@pyrenet.fr wrote: After a typo, I've just noticed the following : ~ 14:58:33: createdb test CREATE DATABASE ~ 14:58:42: psql test Welcome to psql 8.1.2, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit test=# create table t1 (i int primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index t1_pkey for table t1 CREATE TABLE test=# create table t2 (i int references t1 on delete cascade on update cascade ); CREATE TABLE test=# insert into t2 values (default); INSERT 0 1 test=# select * from t1; i --- (0 rows) test=# select * from t2; i --- (1 row) test=# \q shouldn't the insert fail or have I missed something? Why do you think it should have failed? It looks okay to me. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] how is that possible
On Fri, 10 Feb 2006 ohp@pyrenet.fr wrote: My understanding is that null or not, there should have been a foreign key violation. Not as far as I can tell. MATCH (without partial or full) returns true if any column in the row value constructor is NULL. MATCH FULL returns true if all columns in the row value constructor are NULL and returns false if it's a mix of NULLs and non-NULLs. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
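The MATCH rules just described can be sketched in Python (an illustrative model of the SQL semantics, not PostgreSQL code). It shows why the `insert into t2 values (default)` above passes: a NULL referencing column satisfies the default (MATCH SIMPLE) foreign-key check without consulting the parent table at all.

```python
def fk_check_passes(row, match, parent_keys):
    """Model the SQL foreign-key MATCH rules for a (possibly
    multi-column) referencing row."""
    nulls = [v is None for v in row]
    if match == "SIMPLE":      # the default, i.e. no MATCH clause
        if any(nulls):
            return True        # any NULL column -> check passes
    elif match == "FULL":
        if all(nulls):
            return True        # all columns NULL -> passes
        if any(nulls):
            return False       # mixed NULL/non-NULL -> fails
    return tuple(row) in parent_keys

parents = {(1, 1)}
print(fk_check_passes((None, 5), "SIMPLE", parents))   # True
print(fk_check_passes((None, 5), "FULL", parents))     # False
print(fk_check_passes((None, None), "FULL", parents))  # True
```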
Re: [HACKERS] Copy From Insert UNLESS
On Mon, 6 Feb 2006, Josh Berkus wrote: Are you sure that a new type of constraint is the way to go for this? It doesn't solve our issues in the data warehousing space. The spec we started with for Error-tolerant COPY is: 1) It must be able to handle parsing errors (i.e. bad char set); 2) It must be able to handle constraint violations; 3) It must output all row errors to a log or errors table which makes it possible to determine which input row failed and why; 4) It must not slow significantly (like, not more than 15%) the speed of bulk loading. On that basis, Alon started working on a low-level error trapper for COPY. It seems like your idea, which would involve a second constraint check, would achieve neither #1 nor #4. I think in his system it wouldn't check the constraints twice, it'd just potentially check them at a different time than the normal constraint timing, so I think it'd cover #4. I'd wonder if there'd be any possibility of having violations go unnoticed in that case, but I'm not coming up with an obvious way that could happen. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Copy From Insert UNLESS
On Mon, 6 Feb 2006, James William Pye wrote: On Sun, Feb 05, 2006 at 07:14:49PM -0800, Stephan Szabo wrote: On Sun, 5 Feb 2006, James William Pye wrote: However, constraints referenced in an UNLESS clause that are deferred, in any fashion, should probably be immediated within the context of the command. Perhaps a WARNING or NOTICE would be appropriately informative if UNLESS were to actually alter the timing of a given constraint. The problem is that even immediate constraints are supposed to be checked at end of statement, not at row time. I see. Immediated is not the word that I am actually looking for then. :( Perhaps Postgres should specify our current immediate as a new constraint mode. instant, maybe? Sadly, I think it will be difficult to get away from using that or some other synonym if such an idea were to be implemented. [Getting the feeling that this has been discussed before. ;] Only parts of it. :) Our implementation of UNIQUE is particularly bad for this. Yes. Changing how UNIQUE constraints are implemented will likely be the first step in this patch. Any facility that can alter the tuple before it being inserted into the heap should probably be exercised prior to the application of the tuple against UNLESS's behavior. The problem is that you can un-violate a unique constraint by changing some other row that's already in the table. And I think that it might even be legal to do so in an after trigger (and in fact, some other row's after trigger). [join] Basically a violation at the time the row is created is irrelevant if the violation is gone by the end of statement. Okay. I can't help but think such a trigger as being questionable at best. However, per spec, it should be possible. =\ Yeah, it's pretty odd in the insert case. It's easy in the update case to make a case where it matters, definitely less so for insert. 
Once the UNIQUE constraint code is relocated, I think implementing more standards compliant constraint timing might be substantially easier. However, I don't think this should affect UNLESS. Rather, I think UNLESS should, more or less, demand that specified constraints be checked at the same time as they are currently. This is meant to be an optimization at multiple levels; reduce code redundancy (rewriting constraint checks for use prior to the actual insertion), computational redundancy (potentially, running the rewritten checks more than once), and reduce unnecessary I/O (avoiding heap_insert()'ing an evil tuple into the target table despite the fact that the statement may later un-violate it). Although, perhaps, it could be configurable with an option; INSERT INTO t UNLESS [DEFERRED] CONSTRAINT VIOLATION. =) I'd say that if we were going to check the constraints at a different time, we'd want a better name/description than UNLESS CONSTRAINT VIOLATION since the unadorned INSERT or COPY might run with no constraint violations. ---(end of broadcast)--- TIP 6: explain analyze is your friend
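Why row-at-a-time UNIQUE checking (the current implementation) differs from the spec's end-of-statement timing can be shown with a small Python model (illustrative only, not PostgreSQL code): an `UPDATE t SET i = i + 1` over the keys {1, 2, 3} is perfectly unique at end of statement, but a row-at-a-time check trips over a transient duplicate.

```python
def update_check_each_row(rows, f):
    """Check uniqueness as each row is updated (row-at-a-time timing)."""
    table = list(rows)
    for idx, v in enumerate(table):
        new_val = f(v)
        others = table[:idx] + table[idx + 1:]
        if new_val in others:          # collides with a not-yet-updated row
            raise ValueError(f"duplicate key {new_val}")
        table[idx] = new_val
    return table

def update_check_at_end(rows, f):
    """Check uniqueness once, at end of statement (spec timing)."""
    table = [f(v) for v in rows]
    if len(set(table)) != len(table):
        raise ValueError("duplicate key")
    return table

print(update_check_at_end([1, 2, 3], lambda v: v + 1))  # [2, 3, 4]: fine
try:
    update_check_each_row([1, 2, 3], lambda v: v + 1)
except ValueError as e:
    print(e)  # updating 1 -> 2 collides with the untouched 2
```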
Re: [HACKERS] Copy From Insert UNLESS
On Fri, 3 Feb 2006, James William Pye wrote: Despite the fact that my experimental patch uses error trapping, that is *not* what I have in mind for the implementation. I do not want to trap errors upon insert or copy from. Rather, I wish to implement functionality that would allow alternate destinations for tuples that violate user specified constraints on the table, which, by default, will be to simply drop the tuple. My proposed syntax is along the lines of: INSERT INTO table [ ( column [, ...] ) ] * [UNLESS CONSTRAINT VIOLATION [ON (constraint_name [, ...]) [THEN INSERT INTO table]] [, OR ...]] { DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) | query } and COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN } * [UNLESS CONSTRAINT VIOLATION [ON (constraint_name [, ...]) [THEN INSERT INTO table]] [, OR ...]] ... The purpose of the arguably unintuitive THEN INSERT INTO x portion is to provide the mechanism in which a user can specify the destination table for tuples that violated the associated set of constraints. Using the OR portion allows the user to specify additional sets of constraints for different destinations. A tuple will be withheld from the target table if ANY of the constraints listed in any of the constraint_name sets is violated. Constraint sets should not [may not?] reference the same constraint multiple times, even among different sets. Example:

\d dest_table
    Table "public.dest_table"
 Column |  Type   | Modifiers
--------+---------+-----------
 i      | integer | not null
 j      | integer |
Indexes: "dest_table_pkey" PRIMARY KEY, btree (i)
Check constraints: "dest_table_j_check" CHECK (j > 0)

CREATE TEMP TABLE pkey_failures (i int, j int);
CREATE TEMP TABLE check_failures (i int, j int);
COPY dest_table FROM STDIN UNLESS CONSTRAINT VIOLATION ON (dest_table_pkey) THEN INSERT INTO pkey_failures OR (dest_table_j_check) THEN INSERT INTO check_failures;

For most constraints, this proposed implementation should be fairly easy to implement. 
Have you considered how this might work with spec-compliant constraint timing? I think even in inserting cases, a later trigger before statement end could in some cases un-violate a constraint, so checking before insert won't actually be the same behavior as the normal constraint handling which seems bad for this kind of system. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
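The row routing that the proposed UNLESS syntax describes can be modeled in a few lines of Python (an illustrative sketch, not the proposed implementation), using the dest_table example from the original message: rows violating dest_table_pkey or dest_table_j_check land in their failure tables instead of aborting the load. Note the caveat raised in the reply: checking at insert time like this is not the same as spec-compliant end-of-statement constraint timing.

```python
dest_table, pkey_failures, check_failures = [], [], []
seen = set()  # stands in for the dest_table_pkey unique index

for i, j in [(1, 5), (2, -1), (1, 7), (3, 2)]:
    if i in seen:                     # dest_table_pkey would be violated
        pkey_failures.append((i, j))
    elif j <= 0:                      # dest_table_j_check (j > 0) violated
        check_failures.append((i, j))
    else:                             # tuple passes: insert and index it
        dest_table.append((i, j))
        seen.add(i)

print(dest_table)      # [(1, 5), (3, 2)]
print(pkey_failures)   # [(1, 7)]
print(check_failures)  # [(2, -1)]
```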
Re: [HACKERS] Copy From Insert UNLESS
On Sun, 5 Feb 2006, James William Pye wrote: On Sun, Feb 05, 2006 at 02:08:12PM -0800, Stephan Szabo wrote: Have you considered how this might work with spec-compliant constraint timing? I haven't gone so far as to look into the spec, yet. [Noise of rustling papers] However, constraints referenced in an UNLESS clause that are deferred, in any fashion, should probably be immediated within the context of the command. Perhaps a WARNING or NOTICE would be appropriately informative if UNLESS were to actually alter the timing of a given constraint. The problem is that even immediate constraints are supposed to be checked at end of statement, not at row time. Our implementation of UNIQUE is particularly bad for this. Basically a violation at the time the row is created is irrelevant if the violation is gone by the end of statement. I think even in inserting cases, a later trigger before statement end could in some cases un-violate a constraint, so checking before insert won't actually be the same behavior as the normal constraint handling which seems bad for this kind of system. Any facility that can alter the tuple before it being inserted into the heap should probably be exercised prior to the application of the tuple against UNLESS's behavior. The problem is that you can un-violate a unique constraint by changing some other row that's already in the table. And I think that it might even be legal to do so in an after trigger (and in fact, some other row's after trigger). This isn't necessarily a killer to the idea though, it probably just means the semantics are harder to nail down. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Multiple logical databases
On Thu, 2 Feb 2006, Mark Woodward wrote: Now, the answer, obviously, is to create multiple postgresql database clusters and run postmaster for each logical group of databases, right? That really is a fine idea, but Say, in pgsql, I do this: \c newdb It will only find the database that I have in that logical group. If another postmaster is running, obviously, psql doesn't know anything about it. From the DB admin perspective, maybe there should be some hierarchical structure to this. What if there were a program, maybe a special parent postmaster process, I don't know, that started a list of child postmasters based on some site config? The parent postmaster would hold all the configuration parameters of the child postmaster processes, so there would only be one postgresql.conf. This also answers how do we get postgresql options in a database, because the parent postmaster only needs to bootstrap the others, it can be configured to run lean and mean, and the real settings can be inspected and changed at will. A trigger will send a HUP to child postmasters when their settings change. The parent postmaster only needs one connection for each child and one admin, right? Does anyone see this as useful? Not as described above, no. Perhaps with a more concrete plan that actually talks about these things in more details. For example, you posit the \c thing as an issue, I don't personally agree, but you also don't address it with a solution. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Foreign key trigger timing bug?
On 12/9/2005 8:27 PM, Stephan Szabo wrote: On Fri, 9 Dec 2005, Jan Wieck wrote: On 12/8/2005 8:53 PM, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Yeah. I really don't understand it, but it appears to me to be explicitly different in the spec for on delete cascade even compared to the rest of the referential actions. One problem I see is, what do we do if the BEFORE trigger then returns NULL (to skip the delete). The cascaded operations are already done. Do we have to execute the cascaded deletes in a subtransaction or do we disallow the skip in this case? I think we'd have to disallow skipping. Especially since skipping would probably end up with a violated constraint. That seems to me to be a sufficient reason to not follow the spec in this respect. A BEFORE trigger should be run BEFORE anything happens, full stop. I can't think of any good reason why the spec's semantics are better. (It's not like our triggers are exactly spec-compatible anyway.) It doesn't lead to a violated constraint. bar references foo on delete cascade, now delete from foo will first delete from bar, then the before trigger on foo skips the delete. That's not the right case I think. Pseudo example: create table a create table b references a on delete cascade create before trigger on b that sometimes skips a delete to b insert into a and b. delete from a -- AFAICS, you can end up with a row in b that no longer has its associated row in a since the a row will be deleted but there's no guarantee its referencing rows in b will have successfully been deleted. Yes, you can deliberately break referential integrity with that. But you know what? I think the overall waste of performance and developer time, required to fix this rather exotic (and idiotic) problem, is too high to seriously consider it. Well, the case that brought up the original question was one where the before trigger updated rows that were going to be affected by the cascaded delete. 
Before, this worked by accident; now it gives an error (even though the key wasn't changed, due to some other possibilities of violation forcing the check). The problem is that if we're not consistent about what violation cases are acceptable, it's hard to diagnose if something is an actual bug or merely an acceptable side effect. :) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Foreign key trigger timing bug?
On Fri, 9 Dec 2005, Jan Wieck wrote: On 12/8/2005 8:53 PM, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Yeah. I really don't understand it, but it appears to me to be explicitly different in the spec for on delete cascade even compared to the rest of the referential actions. One problem I see is, what do we do if the BEFORE trigger then returns NULL (to skip the delete). The cascaded operations are already done. Do we have to execute the cascaded deletes in a subtransaction or do we disallow the skip in this case? I think we'd have to disallow skipping. Especially since skipping would probably end up with a violated constraint. That seems to me to be a sufficient reason to not follow the spec in this respect. A BEFORE trigger should be run BEFORE anything happens, full stop. I can't think of any good reason why the spec's semantics are better. (It's not like our triggers are exactly spec-compatible anyway.) It doesn't lead to a violated constraint. bar references foo on delete cascade, now delete from foo will first delete from bar, then the before trigger on foo skips the delete. That's not the right case I think. Pseudo example: create table a create table b references a on delete cascade create before trigger on b that sometimes skips a delete to b insert into a and b. delete from a -- AFAICS, you can end up with a row in b that no longer has its associated row in a since the a row will be deleted but there's no guarantee its referencing rows in b will have successfully been deleted. And besides, as the other post (Trigger preventing delete causes circumvention of FK) in GENERAL shows, triggers can break RI anyway. Yeah, although fixing the cases where the trigger interacted badly with before triggers was the point of the posts that started this. The original problem was with a case where it acted differently, but it's fairly related. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Foreign key trigger timing bug?
On Thu, 8 Dec 2005, Jan Wieck wrote: On 12/7/2005 4:50 PM, Stephan Szabo wrote: On Wed, 7 Dec 2005, Bruce Momjian wrote: I had an open 8.1 item that was: o fix foreign trigger timing issue Would someone supply text for a TODO entry on this, as I don't think we fixed it in 8.1. I'd split this into two separate items now. Fix before delete triggers on cascaded deletes to run after the cascaded delete is done. This is odd, but seems to be what the spec requires. Ugh, that sounds ugly. Yeah. I really don't understand it, but it appears to me to be explicitly different in the spec for on delete cascade even compared to the rest of the referential actions. One problem I see is, what do we do if the BEFORE trigger then returns NULL (to skip the delete). The cascaded operations are already done. Do we have to execute the cascaded deletes in a subtransaction or do we disallow the skip in this case? I think we'd have to disallow skipping. Especially since skipping would probably end up with a violated constraint. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
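The hazard being debated in this thread can be shown with a toy Python model (plain Python, not PostgreSQL): if a BEFORE DELETE trigger on the referencing table is allowed to skip its row's delete while the parent's delete proceeds, a cascaded delete leaves an orphan behind.

```python
a = {1}          # parent table: set of keys
b = [(10, 1)]    # referencing table: (id, fk) rows, fk references a

def cascade_delete(key, trigger_skips):
    """Delete `key` from a, cascading to b. The before trigger on b
    may skip the cascaded delete (the 'return NULL' case)."""
    global b
    if not trigger_skips:
        b = [row for row in b if row[1] != key]  # cascade removes children
    # if trigger_skips, the child rows survive
    a.discard(key)                               # parent delete happens anyway

cascade_delete(1, trigger_skips=True)
orphans = [row for row in b if row[1] not in a]
print(orphans)   # [(10, 1)]: a row in b whose parent is gone -- RI broken
```

Disallowing the skip (as suggested above) removes this path, at the cost of making the trigger's "return NULL" ineffective for cascaded deletes.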
Re: [HACKERS] Foreign key trigger timing bug?
On Wed, 7 Dec 2005, Bruce Momjian wrote: I had an open 8.1 item that was: o fix foreign trigger timing issue Would someone supply text for a TODO entry on this, as I don't think we fixed it in 8.1. I'd split this into two separate items now. Fix before delete triggers on cascaded deletes to run after the cascaded delete is done. This is odd, but seems to be what the spec requires. Fix problems with referential actions caused by before triggers that modify rows that would also be modified by the referential action. Right now, this has a few symptoms: either you can get spurious-seeming errors from the constraint or you can end up with invalid data in the referencing table. As far as I can see, the spec doesn't have much to say about this because the spec doesn't seem to allow before triggers to modify tables. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: Réf. : RE: [HACKERS] Running PostGre on DVD
On Tue, 15 Nov 2005 [EMAIL PROTECTED] wrote: I don't understand why a user can't WILLINGLY (by EXPLICITLY setting an OPTION) allow a privileged administrator to run PostGre. Well, to start with, it increases the support costs of the product as a whole to the community. Adding an option with severe security implications is not free, at least not if you want to be reasonably diligent about minimizing and documenting the risks. Generally the community tries to take that seriously, so IMHO just assuming that anyone who sets it knows the risks isn't acceptable. Why don't we actually start looking at the actual implications and see what we can do about them, rather than either assuming they're too great or too minimal. Maybe we'll come up with solutions to current problems as well. I'm deeply disappointed to be forced to compile my own PostGre and I will not. Well, given that such an option isn't likely to go in before 8.2 given the policy on dot version changes, I don't think you can get out of compiling a copy unless you have a year before shipping. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 8.1 substring bug?
On Fri, 11 Nov 2005, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: It's even sillier than that:

test=# SELECT substring ('1234567890' FOR 4::bigint);
 substring
-----------

(1 row)

test=# SELECT substring ('1234567890' FOR 4::int);
 substring
-----------
 1234
(1 row)

This has been complained of before. The problem is that there is no implicit cast from bigint to int, but there is one from bigint to text, so the only acceptable mapping the parser can find is to convert bigint to text and apply the pattern-match version of substring(). (There are some other things happening here because of the weird SQL99 syntax, but that's the bottom line.) It looks to me like we should be supporting any exact numeric with scale 0 there (at least AFAICS from SQL92 and SQL03), so I don't think the current behavior is compliant. It doesn't look like adding a numeric overload of the function works, and the function also becomes ambiguous for int2 inputs. :( ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 8.1 substring bug?
On Fri, 11 Nov 2005, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: It looks to me like we should be supporting any exact numeric with scale 0 there (at least AFAICS from SQL92 and SQL03), so I don't think the current behavior is compliant. It doesn't look like adding a numeric overload of the function works, and the function also becomes ambiguous for int2 inputs. :( Currently (see gram.y, about line 7600) the grammar converts SUBSTRING(foo FOR bar) into pg_catalog.substring(foo, 1, bar) and then leaves the normal function-call-analysis code to make the best of it with that. If bar isn't implicitly castable to integer then you've got trouble. Right, I was thinking we could get around it with another substring that took two numerics, but then I think FROM int2 FOR int2 would be ambiguous. I was toying with the idea of making it translate instead to pg_catalog.substring(foo, 1, (bar)::int4) since AFAICS there isn't any other reasonable mapping once you have committed to having the 1 in there. This does not solve the general problem, but it'd address the particular case anyway ... And, it's fairly reasonable to assume at least right now that any reasonable string offset or length fits in an int4. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing the overhead of NUMERIC data
On Thu, 3 Nov 2005, Simon Riggs wrote: On Wed, 2005-11-02 at 19:12 -0500, Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Could someone please quantify how much bang we might get for what seems like quite a lot of bucks? I appreciate the need for speed, but the saving here strikes me as marginal at best, unless my instincts are all wrong (quite possible) Two bytes per numeric value is not a lot, agreed. I'm optimising for Data Warehousing. If you have a very large table with a higher proportion of numerics on it, then your saving can be 5% of tablesize which could be very useful. For the general user, it might produce less benefit, I accept. At the moment we've established we can do this fairly much for free. i.e. nobody cares about the drop in digits (to 255) and the other coding I don't believe the above is safe to say, yet. AFAICS, this has been discussed only on hackers (and patches) in this discussion, whereas this sort of change should probably be brought up on general as well to get a greater understanding of whether there are people who care. I expect that there won't be, but given that I'm still not sure what the plan to support applications upward is for this change, I think it's probably a good idea to query a larger segment of the population. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
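The "5% of tablesize" figure quoted above is easy to sanity-check with back-of-envelope arithmetic. All the numbers below are illustrative assumptions (per-tuple overhead, column counts, and average NUMERIC widths are not taken from the thread), chosen only to show the kind of row shape for which a 2-byte-per-value saving reaches about 5%.

```python
# Assumed, illustrative figures for a warehouse-style row:
row_overhead = 28      # per-tuple header plus item pointer, bytes
numeric_cols = 6       # NUMERIC columns per row
numeric_size = 10      # assumed average stored NUMERIC size, bytes
other_data = 150       # assumed size of the remaining columns, bytes

old_row = row_overhead + numeric_cols * numeric_size + other_data
new_row = old_row - numeric_cols * 2   # 2 bytes saved per NUMERIC value
saving = (old_row - new_row) / old_row
print(f"{saving:.1%}")  # 5.0% per row under these assumptions
```

With fewer or narrower non-numeric columns the fraction grows, which is why the benefit is pitched at data warehousing rather than the general user.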
Re: [HACKERS] pl/pgsql breakage in 8.1b4?
On Fri, 28 Oct 2005, Philip Yarra wrote: Just testing pl/pgsql functions in 8.1beta4, I see failures for syntax that works in 8.0.3. The simplest test case for this is: The function below fails for me similarly in 8.0.3 on execution. 8.1 merely tells you at creation time. Using bar and foo as both parameter names and the field names doesn't really work. create table ptest(foo int, bar varchar(10)); create or replace function modify_ptest( foo int, bar varchar) returns numeric as $$ declare res numeric; begin update ptest set bar = modify_ptest.bar where foo = modify_ptest.foo; res := 0; return res; end; $$ LANGUAGE plpgsql; The error message from psql is: ERROR: syntax error at or near $1 at character 19 QUERY: update ptest set $1 = modify_ptest.bar where $2 = modify_ptest.foo CONTEXT: SQL statement in PL/PgSQL function modify_ptest near line 7 LINE 1: update ptest set $1 = modify_ptest.bar where $2 = modify... ^ I assume what I'm trying should still work, though I couldn't find comparable examples in the doco. Looks to me like a problem with parameter aliasing. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle
On Wed, 19 Oct 2005, Dann Corbit wrote: -Original Message- From: Terry Fielder [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 19, 2005 2:05 PM To: Dann Corbit Cc: Tino Wildenhain; Marc G. Fournier; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org; pgsql-general@postgresql.org Subject: Re: 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buys Innobase) Dann Corbit wrote: Try this query in Oracle, SQL*Server, DB/2, Informix, etc.: connxdatasync=# select 1 where cast('a' as varchar(30)) = cast('a ' as varchar(30)); ?column? -- (0 rows) I see how you can interpret the SQL Standard to make the above response a correct one. But is it the response that you would like? When the compared datatypes are VARCHAR: YES What is the value of doing that? I can see plenty of harm and absolutely no return. We are talking about blank padding before comparison. Do you really want 'Danniel ' considered distinct from 'Danniel ' in a comparison? In real life, what does that buy you? It buys you the ability to store things where trailing spaces are significant (for example passwords) within the existing limitations of not having a full set of the collation behavior. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
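The two comparison semantics at issue can be sketched in Python (an illustrative model, not database code): blank-padded comparison pads the shorter value with spaces before comparing (SQL CHAR-style, PAD SPACE), while VARCHAR comparison with NO PAD behavior treats trailing spaces as significant.

```python
def char_eq(x, y):
    """Blank-padded (PAD SPACE) comparison: pad the shorter
    value with spaces before comparing."""
    width = max(len(x), len(y))
    return x.ljust(width) == y.ljust(width)

def varchar_eq(x, y):
    """NO PAD comparison: trailing spaces are significant."""
    return x == y

print(char_eq("a", "a "))     # True: padding erases the difference
print(varchar_eq("a", "a "))  # False: useful when trailing spaces
                              # matter, e.g. stored passwords
```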
Re: [HACKERS] [PATCHES] Work-in-progress referential action trigger
On Fri, 2 Sep 2005, Stephan Szabo wrote: [Hackers now seems more appropriate] On Thu, 1 Sep 2005, Stephan Szabo wrote: On Tue, 23 Aug 2005, Stephan Szabo wrote: Here's my current work in progress for 8.1 devel related to fixing the timing issues with referential actions having their checks run on intermediate states. I've only put in a simple test that failed against 8.0 in the regression patch, and regression still passes for me. There's still an outstanding question of whether looping gives the correct result in the presence of explicit inserts and set constraints immediate in before triggers. As Darcy noticed, the patch as given definitely still has problems with before triggers. I was able to construct a case that violates the constraint with an update in a before delete trigger. I think this might be why the spec has the weird timing rules for before triggers on cascaded deletes, such that the deletions happen before the before triggers. We have a similar problem for before triggers that update the rows that are being cascade updated. The following seems to violate the constraint for me on 8.0.3:

drop table pk cascade;
drop table fk cascade;
drop function fk_move();
create table pk(a int primary key);
create table fk(a int references pk on delete cascade on update cascade, b int);
create function fk_move() returns trigger as '
begin
    raise notice '' about to move for % '', old.b;
    update fk set b = b - 1 where b > old.b;
    return new;
end;' language 'plpgsql';
create trigger fkmovetrig before update on fk for each row execute procedure fk_move();
insert into pk values(1);
insert into pk values(2);
insert into fk values(1,1);
insert into fk values(1,2);
insert into fk values(2,3);
select * from pk;
select * from fk;
update pk set a = 3 where a = 1;
select * from pk;
select * from fk;

This gives me (3,1), (1,1) and (2,2) as the rows in fk, where the (1,1) row is invalid. This is obviously wrong, but the question is, what is the correct answer?
Should the update in the before trigger trying to change b on a row that no longer has a reference have errored? We can't do that for compatibility reasons, but it would allow us to say that modifying a row in a before trigger that is also a row selected in the outer statement is an error for this update case. It'd presumably be an error for a normal delete as well, although I think it might be relaxable for cascaded deletes because the spec seems to say that the before triggers for deletions caused by the cascade are actually run after the removals. I'm not sure yet whether we could easily differentiate this case from other cases where the row was modified twice, either. Is there a case other than a before trigger updating a row we will want to act upon later in the statement where we'll get a row with xmax of our transaction and cmax greater than the current command?
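The anomaly reported above can be replayed as a toy Python model, assuming (as the executor does) that rows already modified earlier in the same statement are skipped by the cascade. The helper name and list-of-tuples representation are inventions for illustration, not backend code:

```python
def cascade_update(fk_rows, old_key, new_key):
    """Toy replay of the 8.0.3 case: cascading pk.a old_key -> new_key,
    where a BEFORE UPDATE trigger on fk does
        update fk set b = b - 1 where b > old.b
    Rows already modified in this statement are skipped by the cascade,
    mirroring the executor skipping self-updated tuples."""
    modified = set()
    for i, (a, b) in enumerate(list(fk_rows)):   # snapshot of rows to cascade
        if a != old_key or i in modified:
            continue
        # before trigger fires first and shifts the higher b values down
        for j, (a2, b2) in enumerate(fk_rows):
            if b2 > b:
                fk_rows[j] = (a2, b2 - 1)
                modified.add(j)
        # then the cascade applies the key change to this row
        fk_rows[i] = (new_key, b)
        modified.add(i)
    return fk_rows

print(cascade_update([(1, 1), (1, 2), (2, 3)], 1, 3))
# -> [(3, 1), (1, 1), (2, 2)]: the (1, 1) row keeps the dangling key
```

The (1,2) row gets rewritten to (1,1) by the trigger before the cascade reaches it, so the cascade skips it and the stale reference survives, matching the rows reported in the message.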
Re: [HACKERS] [PATCHES] Work-in-progress referential action trigger
On Fri, 9 Sep 2005, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Is there a case other than a before trigger updating a row we will want to act upon later in the statement where we'll get a row with xmax of our transaction and cmax greater than the current command? The greater-cmax case could occur via any kind of function, not only a trigger, ie update tab set x = foo(x) where ... where foo() is a volatile function that internally updates the tab table. I *thought* I was missing a case; I just couldn't figure out what. I suppose you could say that this is horrible programming practice and anyone who tries it deserves whatever weird behavior ensues ... but it's not the case that every such situation involves a trigger. Well, the change I was thinking of would have made it an error if foo(x) updated a row that was then later selected by the update, rather than the current behavior, which I think would have ignored the already-updated row, so that's probably not going to work.
Re: [HACKERS] [GENERAL] Cascades Failing
On Fri, 19 Aug 2005, Stephan Szabo wrote: On Fri, 19 Aug 2005, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Is the correct answer to continue marking and running the triggers until there are no immediate triggers left to run for this case? Hmm ... my recollection is that we put in the concept of marking because we needed it for correct behavior in some cases. I don't recall exactly why though. Hmm, there's an issue with before triggers as well. We add the checks for the updates to the end of the current statement's queue and shouldn't run them until all the cascaded updates are done. However, if a before on update trigger of the fk side also updates an fk row that is in the middle of a series of these updates with a direct update statement, that statement's check will happen inside the before trigger, which will fail. It's not necessarily a triggered data change violation if the change happens not to change the key values, or sets them to what they already have or will become. We could get around this by somehow inheriting the state of our direct parent trigger (whether or not it was in a new query), but that seems like it'd break other cases because the triggers would line up in the pre-8.0 sequence in that case. Or, is it correct to fail in this case because the statement is trying to update in a new query to a set of invalid keys?
Re: [HACKERS] [GENERAL] Cascades Failing
On Fri, 19 Aug 2005, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Is the correct answer to continue marking and running the triggers until there are no immediate triggers left to run for this case? Hmm ... my recollection is that we put in the concept of marking because we needed it for correct behavior in some cases. I don't recall exactly why though. The comment there talks about someone doing a set constraints immediate inside a trigger function.

/*
 * Process all immediate-mode triggers queued by the query, and move
 * the deferred ones to the main list of deferred events.
 *
 * Notice that we decide which ones will be fired, and put the deferred
 * ones on the main list, before anything is actually fired. This
 * ensures reasonably sane behavior if a trigger function does
 * SET CONSTRAINTS ... IMMEDIATE: all events we have decided to defer
 * will be available for it to fire.
 *
 * If we find no firable events, we don't have to increment firing_counter.
 */

I think we were worried about it either skipping events or potentially running events twice in that case, but I don't remember exactly either. I'm not sure that looping would affect that though; it'd be basically:

mark (0)
increment firing id (0 -> 1)
run triggers using the old id (0)
  - if the set constraints immediate was run here, it'd mark using the
    incremented id (hopefully incrementing again - will need to check)
    and run using that id (1 -> 2) and (1)
mark (2)
increment firing id (2 -> 3)
run triggers using (2)

There might be some other reason that's not enshrined in the comment though.
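The mark-then-fire sequence sketched above can be modeled as a toy Python class. This is a simplified illustration of the idea, not the actual trigger-queue code; the class and method names are invented:

```python
class TriggerQueue:
    """Toy model of the mark-then-fire loop: events are stamped with the
    current firing id *before* anything runs, so a nested round (e.g. a
    trigger doing SET CONSTRAINTS ... IMMEDIATE) marks and fires only
    newly queued events, neither skipping nor double-firing the rest."""

    def __init__(self):
        self.events = []
        self.firing_counter = 0
        self.fired = []

    def queue(self, name, action=None):
        self.events.append({'mark': None, 'name': name, 'action': action})

    def fire_immediate(self):
        # Mark every unmarked event with the current firing id...
        mark = self.firing_counter
        pending = [e for e in self.events if e['mark'] is None]
        if not pending:
            return
        for e in pending:
            e['mark'] = mark
        # ...then increment the id and fire only events carrying the old one.
        self.firing_counter += 1
        for e in pending:
            self.fired.append(e['name'])
            if e['action'] is not None:
                e['action'](self)  # may queue more events and recurse

q = TriggerQueue()
# event 'a' simulates a trigger that queues a new event and then runs
# a nested immediate firing pass, like SET CONSTRAINTS IMMEDIATE
q.queue('a', action=lambda q: (q.queue('nested'), q.fire_immediate()))
q.queue('b')
q.fire_immediate()
print(q.fired)  # each event fires exactly once: ['a', 'nested', 'b']
```

Because 'b' was marked with id 0 before 'a' ran, the nested pass (which marks with id 1) leaves it alone, and the outer pass still fires it exactly once.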
Re: [HACKERS] [GENERAL] Cascades Failing
On Wed, 17 Aug 2005, Stephan Szabo wrote: On Tue, 16 Aug 2005, Stephan Szabo wrote: On Tue, 16 Aug 2005, Tom Lane wrote: I think this would take some generalization of afterTriggerInvokeEvents, which now might or might not find the target rel in the EState it's passed, but otherwise it doesn't seem too invasive. Thoughts? That doesn't seem too bad really; looking at afterTriggerInvokeEvents it doesn't look like it'd be that much work to change it to handle that case. I can put a patch together to see what it looks like. I did some work on this, and I'm getting a couple of other failures from other parts of the foreign key regression test (specifically an error that is no longer erroring in a multi-column on update set default). I'm going to need to look more closely to see if I can figure out why. I think I see at least part of what's going on. It looks to me that events are being added, but not fired, because they weren't marked. The event sequence seems to be:

after trigger begin query
add events for the actual statement
after trigger end query
  fire trigger
    add events for the triggered statement
  finish trigger
skip event added for triggered statement because it's not marked

Is the correct answer to continue marking and running the triggers until there are no immediate triggers left to run for this case?
Re: [HACKERS] [GENERAL] Cascades Failing
On Tue, 16 Aug 2005, Stephan Szabo wrote: On Tue, 16 Aug 2005, Tom Lane wrote: I think this would take some generalization of afterTriggerInvokeEvents, which now might or might not find the target rel in the EState it's passed, but otherwise it doesn't seem too invasive. Thoughts? That doesn't seem too bad really; looking at afterTriggerInvokeEvents it doesn't look like it'd be that much work to change it to handle that case. I can put a patch together to see what it looks like. I did some work on this, and I'm getting a couple of other failures from other parts of the foreign key regression test (specifically an error that is no longer erroring in a multi-column on update set default). I'm going to need to look more closely to see if I can figure out why.
Re: [HACKERS] [GENERAL] Cascades Failing
On Tue, 16 Aug 2005, Tom Lane wrote: [ redirected to -hackers ] I wrote: This suggests that we need a way to prevent immediate execution of freshly queued triggers at the end of a command fired by an FK trigger. If we could move them to the end of the trigger queue that the FK operation itself is in, things would work reasonably well I think. After a quick look through the code, it seems like the way to do this is to add an extra bool parameter nest_triggers to _SPI_pquery, which when false would simply suppress its calls to AfterTriggerBeginQuery and AfterTriggerEndQuery --- thus causing any queued triggers to be queued in the same trigger list the FK is in. We'd then expose this parameter (only) via SPI_execute_snapshot, which is intended only for RI trigger use anyway. This seems right to me. I'd thought that SQL wanted the user triggers to be run directly after the updating, but reading it again, SQL03 at least seems to just talk about adding state changes for after triggers to the current trigger context AFAICS, which means that the above seems to be what is requested by the spec in general. I think this would take some generalization of afterTriggerInvokeEvents, which now might or might not find the target rel in the EState it's passed, but otherwise it doesn't seem too invasive. Thoughts? That doesn't seem too bad really; looking at afterTriggerInvokeEvents it doesn't look like it'd be that much work to change it to handle that case. I can put a patch together to see what it looks like.
Re: [HACKERS] Idea - optimising (left) joins?
On Wed, 27 Jul 2005, Dawid Kuroczko wrote: Hello. I was just wondering, assume we have such tables:

CREATE TABLE data (
    foo text,
    somename_id integer not null references somenames
);
CREATE TABLE somenames (
    somename_id serial PRIMARY KEY,
    somename text NOT NULL
);

And a view:

CREATE VIEW someview AS SELECT foo, somename FROM data NATURAL JOIN somenames;

...and a user does:

SELECT foo FROM someview ORDER BY foo LIMIT 1 OFFSET 1000;

...we could assume some things:

1. As somename_id references somenames, and it is joined with somenames, there will always be at least one row in somenames for each row of data.
2. As somename_id is the primary key, there will be exactly one row.
3. 1 and 2 together -- no matter if we join somenames or not, we'll get the same number of rows.
4. So if columns from somenames are not used for anything, we can skip them. No need to join at all.

Note that I believe the above only holds if the reference is also immediate (technically, also, the key must be immediate, but we don't support non-immediate primary keys currently). I'm uncertain whether the condition holds inside triggers eventually coming from updates to data or somenames as well, since from my last reading of SQL2003 I believe it's possible to get into places where you're running before the constraint check occurs. Why bother? There are cases where data is normalised and there is a view which joins the main table with a few tables similar to somenames mentioned here. If PostgreSQL could skip even looking at the tables that will not be used, it could give an advantage; the bigger the data, the more advantage. What do you think about it? Is it worth the effort? Possibly; one big question is how much cost it adds to queries that the optimization doesn't apply to, in order to try to see if it does.
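Conditions 1-4 above, plus the immediacy caveat in the reply, amount to a predicate like the following. This is a simplified sketch of when the inner join could be elided, not the planner's real test, and the function and parameter names are invented:

```python
def can_eliminate_join(fk_not_null, fk_constraint_immediate,
                       pk_is_unique, pk_cols_used_elsewhere):
    """Decide whether an inner join to the referenced (PK) table can be
    dropped: every FK row must match at least one PK row (NOT NULL
    immediate FK), at most one PK row (unique/primary key), and no
    column of the PK table may be referenced elsewhere in the query."""
    at_least_one = fk_not_null and fk_constraint_immediate
    at_most_one = pk_is_unique
    return at_least_one and at_most_one and not pk_cols_used_elsewhere

# The someview example: NOT NULL immediate FK onto a primary key,
# and the query touches only columns of "data":
print(can_eliminate_join(True, True, True, False))  # True - join removable
# If the query also selects somename, the join must stay:
print(can_eliminate_join(True, True, True, True))   # False
# A nullable FK could produce fewer rows after the join, so it blocks this:
print(can_eliminate_join(False, True, True, False)) # False
```

The nullable-FK case is why condition 1 matters: a NULL somename_id row in data would vanish from the join result, so dropping the join would change the row count.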
Re: [HACKERS] [Bizgres-general] A Guide to Constraint Exclusion
On Fri, 15 Jul 2005, Hannu Krosing wrote: As for Tom's May 20 argument that people would be surprised when they select from a table with a check(i > 0) constraint and get out i <= 0, I think this is a question of education. I potentially disagree. What are we trying to model here? Systems which allow children to have values that are not in the domain of the parent are useful for some things, but they also potentially lose some other useful properties like substitutability, since a child isn't a parent (or more specifically, f(parent) cannot assume parent's invariants are valid on its argument, which may be a child, unless they are applied to the child at call time).
Re: [HACKERS] Must be owner to truncate?
On Thu, 7 Jul 2005, Stephen Frost wrote: * Andrew - Supernews ([EMAIL PROTECTED]) wrote: On 2005-07-07, Stephen Frost [EMAIL PROTECTED] wrote: * truncate is not MVCC-safe. Erm, that's why it gets a stronger lock, so I don't really see what this has to do with it. It's not MVCC-safe even with the AccessExclusive lock; it damages snapshots that were taken before the truncate operation but which don't have a lock on the table yet. The only reason it doesn't break pg_dump is that the first thing that pg_dump does is to take AccessShare locks on every table that it's going to dump. This seems like something which should probably be fixed, but which is probably too late to fix for 8.1. Of course, if we could fix this then it seems like it would be possible for us to just change 'delete from x' to behave as truncate does now given appropriate conditions. I'm not as Doesn't the lock difference between delete and truncate mean that suddenly deletes on x may or may not block concurrent selects to x (depending on whether it's a full table delete and whether x has delete triggers)? Or are you thinking that after making it MVCC safe the lock could be lessened? With the current truncate lock, it seems bad to me for users who want to do: begin; delete from x; -- do inserts and other stuff to the now empty x while still allowing access to x. Especially if whether or not you have access depends on whether there are delete triggers on x.
Re: [HACKERS] quote_boolean() and friends missing
On Mon, 20 Jun 2005, Tom Lane wrote: Markus Bertheau [EMAIL PROTECTED] writes: Maybe quote_* is not the right name, but the functionality is needed. I don't think so --- at least not in plpgsql, which can do this already. Just assign the value to a text variable and then use quote_literal. Would that work for a bytea with embedded nul characters or does that run the risk of terminating the value early?
Re: [HACKERS] foreign keys and RI triggers
On Thu, 26 May 2005, Neil Conway wrote: I spent a little while looking into a performance issue with a large UPDATE on a table with foreign keys. A few questions: (1) When a PK table is updated, we skip firing the per-row UPDATE RI triggers if none of the referenced columns in the PK table have been modified. However, AFAICS we do not apply a similar optimization for updates of foreign key tables: if a user does not modify the foreign key column, we needn't check for the presence of the FK column value in the primary key table. Is there a reason we don't implement this? Are you sure? RI_FKey_Check seems to have a section on TRIGGER_FIRED_BY_UPDATE which seems to check if the keys are equal if the old row wasn't part of this transaction. I'm not sure why it's doing the transaction id check, but it looks like it will do an equals check at least some of the time. (2) For per-row RI triggers of all kinds, we save the trigger under CurTransactionContext and invoke it at the end of the current query. There is not even overflow to disk (the report that prompted me to look into this was someone's database crashing because they kept running OOM when doing an UPDATE of a large table with FKs on a pretty low-end machine). While avoiding consuming a lot of memory for queued trigger execution is worth doing anyway, ISTM we needn't queue RI triggers in the first place. Is there a reason we can't just invoke after-row RI triggers immediately? If I'm understanding the question, there's two things. First is deferred constraints, and the second is that constraints happen after the entire statement. In a case like:

insert into pk values(2);
insert into pk values(1);
insert into fk values(2);
update pk set key = key + 1;

with NO ACTION, that's not an error AFAIK because the constraint is satisfied at end of statement. If the order of updates happened such that the key=2 row were updated first, we couldn't know whether or not the constraint would be satisfied by later updates to the same table.
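The end-of-statement point in the example above can be shown with a small Python simulation; this is a toy model with invented names, not backend code:

```python
def update_keys(pk, fk_refs, delta, check_per_row):
    """Simulate UPDATE pk SET key = key + delta under a NO ACTION FK.

    check_per_row=True models firing the RI check right after each row;
    False models the end-of-statement check the standard calls for."""
    new_pk = set(pk)
    for key in sorted(pk, reverse=True):   # suppose the key=2 row goes first
        new_pk.discard(key)
        new_pk.add(key + delta)
        if check_per_row and not fk_refs <= new_pk:
            raise ValueError('FK violated after updating key %d' % key)
    if not fk_refs <= new_pk:
        raise ValueError('FK violated at end of statement')
    return new_pk

# pk = {1, 2}, one fk row referencing 2, update pk set key = key + 1:
print(update_keys({1, 2}, {2}, 1, check_per_row=False))  # {2, 3}: no error
try:
    update_keys({1, 2}, {2}, 1, check_per_row=True)
except ValueError as e:
    print('per-row check:', e)  # fails transiently when 2 becomes 3 first
```

Moving key 2 to 3 first leaves the fk row dangling for an instant, so a per-row check would raise; once key 1 has become 2 the constraint holds again, which is why the check must wait for statement end.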
Re: [HACKERS] foreign keys and RI triggers
On Thu, 26 May 2005, Stephan Szabo wrote: On Thu, 26 May 2005, Neil Conway wrote: (2) For per-row RI triggers of all kinds, we save the trigger under CurTransactionContext and invoke it at the end of the current query. There is not even overflow to disk (the report that prompted me to look into this was someone's database crashing because they kept running OOM when doing an UPDATE of a large table with FKs on a pretty low-end machine). While avoiding consuming a lot of memory for queued trigger execution is worth doing anyway, ISTM we needn't queue RI triggers in the first place. Is there a reason we can't just invoke after-row RI triggers immediately? If I'm understanding the question, there's two things. First is deferred constraints, and the second is that constraints happen after the entire statement. In a case like: insert into pk values(2); insert into pk values(1); insert into fk values(2); update pk set key=key+1; with NO ACTION, that's not an error AFAIK because the constraint is satisfied at end of statement. If the order of updates happened such that the key=2 row were updated first, we couldn't know whether or not the constraint would be satisfied by later updates to the same table. Although, we could almost run referential actions that way. The almost comes from some behavior for set default cases to make sure the default still exists, which I think would still have to happen at end of statement to be spec compliant (I don't have easy access to my copy of SQL99 from here). I think we're still a little short of entirely compliant on timing in any case because, unless I'm misremembering, constraint checks happen after user defined triggers and referential actions happen before, which would be difficult if not impossible to do right now with the way we fire triggers.
Re: [HACKERS] foreign keys and RI triggers
On Fri, 27 May 2005, Neil Conway wrote: Stephan Szabo wrote: Are you sure? RI_FKey_Check seems to have a section on TRIGGER_FIRED_BY_UPDATE which seems to check if the keys are equal if the old row wasn't part of this transaction. Well, regardless of how RI_FKey_Check() itself works, ISTM there is no need to enqueue the RI trigger in the first place. That's when the update-on-PK-table optimization is applied -- see trigger.c circa 3005. The specific case I was looking at resulted in the postgres backend allocating a few hundred MB just to store all the pending RI triggers, even though the UPDATE in question didn't change the foreign key field, so it didn't matter a great deal how quickly RI_FKey_Check() was able to bail out. Okay, I can't think of cases even with triggers and the like where removing the check on equal valued rows would give appreciably different results, but I haven't thought too hard about it. If I'm understanding the question, there's two things. First is deferred constraints Right -- obviously we can't fire RI triggers for deferred constraints immediately. Immediate constraints are the common case, though. constraints happen after the entire statement. In a case like: insert into pk values(2); insert into pk values(1); insert into fk values(2); update pk set key=key+1; Hmm, good point. But ISTM there are still some circumstances in which we can safely check the RI trigger immediately, rather than at end of statement. For example, updating the FK table, inserting into the FK table, or deleting from the PK table. Unfortunately, I don't think so, if my assumption that user triggers are supposed to happen before constraint checks is true. In that case, we must wait until not only the action but all triggers fired by that action happen in order to run the constraint check, because a trigger could make an otherwise invalid row valid.
Re: [HACKERS] foreign keys and RI triggers
On Thu, 26 May 2005, Stephan Szabo wrote: On Fri, 27 May 2005, Neil Conway wrote: Stephan Szabo wrote: Are you sure? RI_FKey_Check seems to have a section on TRIGGER_FIRED_BY_UPDATE which seems to check if the keys are equal if the old row wasn't part of this transaction. Well, regardless of how RI_FKey_Check() itself works, ISTM there is no need to enqueue the RI trigger in the first place. That's when the update-on-PK-table optimization is applied -- see trigger.c circa 3005. The specific case I was looking at resulted in the postgres backend allocating a few hundred MB just to store all the pending RI triggers, even though the UPDATE in question didn't change the foreign key field, so it didn't matter a great deal how quickly RI_FKey_Check() was able to bail out. Okay, I can't think of cases even with triggers and the like where removing the check on equal valued rows would give appreciably different results, but I haven't thought too hard about it. Err, except the case that Tom mentions in his message.
Re: [HACKERS] foreign keys and RI triggers
On Thu, 26 May 2005, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Okay, I can't think of cases even with triggers and the like where removing the check on equal valued rows would give appreciably different results, but I haven't thought too hard about it. Err, except the case that Tom mentions in his message. But the check could incorporate the same transaction ID test already in use. I think Neil is right that it'd be a win to apply the test before enqueueing the trigger instead of after. Good point. That would help in many cases anyway.
Re: [HACKERS] foreign keys and RI triggers
On Thu, 26 May 2005, Tom Lane wrote: The thoughts I've had about special-casing RI events to save memory have to do with the idea of lossy storage. As you accumulate more per-row events, at some point it becomes more efficient to forget the individual rows and just reapply the original full-table check query when it's time to check the constraint. So if we could recognize One problem with that is that it works for the constraint check but not for referential actions, although if we instead fired the referential actions truly immediately rather than queued to statement end that'd prevent those from being an issue. The only thing there is that we'd have to also have a constraint check for at least set default. RI events as being associated with the same constraint, and keep track of how many are pending for each constraint, we could make a decision to discard the queue and instead register one event to apply a full-table check. It's not clear how to do that efficiently though. Yeah, I was thinking we could keep a separate structure for (foreign key trigger oid, action) where we could keep track of a current count and whether or not we've consolidated already and scan the queue when we do the consolidation removing items for that oid. That's still not very good though.
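The lossy-storage idea being discussed could look roughly like this toy model; the names and threshold are invented for illustration, and this is a sketch of the concept rather than a patch against the real trigger queue:

```python
class RIEventQueue:
    """Toy model of lossy RI-event storage: per-row check events are
    queued per constraint, and once a constraint accumulates more than
    `threshold` pending rows, the individual rows are forgotten in
    favour of a single full-table re-check event."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.per_row = {}        # constraint id -> list of pending row ids
        self.full_check = set()  # constraints downgraded to a full re-check

    def enqueue(self, constraint, row):
        if constraint in self.full_check:
            return               # already consolidated; per-row events are moot
        rows = self.per_row.setdefault(constraint, [])
        rows.append(row)
        if len(rows) > self.threshold:
            # discard the queue, register one full-table check instead
            del self.per_row[constraint]
            self.full_check.add(constraint)

q = RIEventQueue(threshold=3)
for row in range(10):
    q.enqueue('fk_con_oid', row)
print(q.per_row, q.full_check)  # {} {'fk_con_oid'}: ten rows collapsed to one
```

This captures the memory win (bounded per-constraint state) and also the caveat raised above: a full-table re-check can replace per-row *checks*, but cascading actions still need to know which rows were touched.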
Re: [HACKERS] foreign keys and RI triggers
On Thu, 26 May 2005, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: On Thu, 26 May 2005, Tom Lane wrote: The thoughts I've had about special-casing RI events to save memory have to do with the idea of lossy storage. One problem with that is that it works for the constraint check but not for referential actions, True. But even fixing it only for constraint checks would be a win. Yeah, I'm just wondering if going the extra step and forcing really really immediate referential actions (even if that sometimes means adding a no action trigger on the event as well) would be worth doing as that could remove the queued pk actions for cascade and set null as well and at least turn the queued pk action for set default into one that could be consolidated.
Re: [HACKERS] Foreign keys on array elements
On Wed, 20 Apr 2005, Christopher Kings-Lynne wrote: Hi, Can you put a foreign key constraint on an array column that says that each element of the array must match a primary key? Not currently, because foreign keys are between directly comparable things. If not, is this a TODO perhaps? Maybe. It's been discussed before IIRC. Doing the referential actions might get tricky, and you'd often want to index so that finding the individual array elements isn't expensive.
Re: [HACKERS] Patch for collation using ICU
On Sat, 26 Mar 2005, Palle Girgensohn wrote: I've noticed a couple of things about using the ICU patch vs. pristine pg-8.0.1: - ORDER BY is case insensitive when using ICU. This might break the SQL standard (?), but sure is nice :) Err, I think if your system implements strcoll correctly 8.0.1 can do this if the chosen collation is set up that way (or at least naive tests I've done seem to imply that). Or are you speaking about C locale?
Re: [HACKERS] Patch for collation using ICU
On Sun, 27 Mar 2005, Palle Girgensohn wrote: --On Saturday, March 26, 2005 08.16.01 -0800 Stephan Szabo [EMAIL PROTECTED] wrote: On Sat, 26 Mar 2005, Palle Girgensohn wrote: I've noticed a couple of things about using the ICU patch vs. pristine pg-8.0.1: - ORDER BY is case insensitive when using ICU. This might break the SQL standard (?), but sure is nice :) Err, I think if your system implements strcoll correctly 8.0.1 can do this if the chosen collation is set up that way (or at least naive tests I've done seem to imply that). Or are you speaking about C locale? No, I doubt this. Example: set up a cluster:

$ initdb -E LATIN1 --locale=sv_SE.ISO8859-1
$ createdb foo
CREATE DATABASE
$ psql foo
foo=# create table bar (val text);
CREATE TABLE
foo=# insert into bar values ('aaa');
INSERT 18354409 1
foo=# insert into bar values ('BBB');
INSERT 18354412 1
foo=# select val from bar order by val;
 val
-----
 BBB
 aaa
(2 rows)

Order by is not case insensitive. It shouldn't be for any system, AFAIK. As it is on my machine, for the same test:

foo=# select val from bar order by val;
 val
-----
 aaa
 BBB
(2 rows)

I think this just implies even greater breakage of either the collation or strcoll on the system you're trying on. ;) Which, of course, is a fairly reasonable reason to offer an alternative. Especially if it's generically useful.
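The two orderings shown above are easy to reproduce outside the database. This Python illustration contrasts bytewise (C-locale-style) ordering with a case-insensitive (dictionary-style) comparison, which stands in for what a locale-aware strcoll can produce; it is a sketch of the two collation behaviors, not of ICU itself:

```python
vals = ['aaa', 'BBB']

# C-locale / bytewise ordering: uppercase letters have smaller code
# points, so 'BBB' sorts before 'aaa' -- the first psql transcript.
print(sorted(vals))                    # ['BBB', 'aaa']

# Dictionary-style ordering: compare letters case-insensitively first,
# as many locales' strcoll does -- the second psql transcript.
print(sorted(vals, key=str.casefold))  # ['aaa', 'BBB']
```

Both results are internally consistent; which one you get from ORDER BY depends entirely on the collation the comparison function implements, which is the point of the reply.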
Re: [HACKERS] pg_dump issue : Cannot drop a non-existent(?) trigger
On Fri, 25 Mar 2005, Devrim GUNDUZ wrote: - -- PostgreSQL 8.0.1 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-42) === We wanted to take a full backup, so ran pg_dump, but got an error: ===

$ pg_dump prod -U postgres > all.pgdump
pg_dump: invalid argument string (firma_moduller_firma_fkey1) for trigger RI_ConstraintTrigger_39053 on table t_firma_moduller

=== Looking at the details, we saw the trigger there: ===

prod=# \d t_firma_moduller
              Table "public.t_firma_moduller"
      Column      |            Type             | Modifiers
------------------+-----------------------------+-----------
 firma_no         | character varying(10)       | not null
 modul_adi        | character varying(20)       | not null
 last_update_date | timestamp without time zone |
 last_update_user | character varying(45)       |
 kod              | integer                     | not null default fn_get_seq_value('t_firma_moduller'::character varying)
Indexes:
    t_firma_moduller_pkey PRIMARY KEY, btree (firma_no, modul_adi)
    t_firma_moduller_kodu_ukey UNIQUE, btree (kod)
Triggers:
    RI_ConstraintTrigger_39053 AFTER INSERT OR UPDATE ON t_firma_moduller FROM t_firmalar NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE RI_FKey_check_ins('firma_moduller_firma_fkey1', 't_firma_moduller', 't_firmalar', 'UNSPECIFIED', 'firma_no', 'no')

Odd, why is this not showing up as a foreign key constraint? Has this database been upgraded (no matter through how many different upgrades) from an old version (7.1 or earlier, I think)?

prod=# DROP TRIGGER RI_ConstraintTrigger_39053 ON t_firma_moduller;
ERROR: trigger "ri_constrainttrigger_39053" for table "t_firma_moduller" does not exist

You'd need to quote the trigger name because it's mixed case. Is this a bug or something? There should not be a trigger there. Well, it really should be showing up as something like: CONSTRAINT firma_moduller_firma_fkey1 FOREIGN KEY (firma_no) REFERENCES t_firmalar(no)