[HACKERS] Toasted table not deleted when no out of line columns left
Hi, we came across a database where a table had a TOAST table keeping huge amounts of disk space allocated. However, the table's current definition didn't explain why the TOAST table was there. Then, after some experiments, it struck me: there _was_ a TOAST-able field, but as the schema was modified the field was dropped, leaving only inline-stored fields. VACUUM [FULL] [ANALYZE] didn't clean up the space that was used by the TOAST table. My tests were done on 8.3.3.

As every statement that references a table takes a lock on its pg_class record, ALTER TABLE cannot progress until all locks are gone, i.e. until the transactions referencing the table have finished. The same holds vice versa: ALTER TABLE blocks every transaction that may reference the table. Judging from that, the TOAST table cleanup could be made part of ALTER TABLE DROP COLUMN.

Best regards, Zoltán Böszörményi -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
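A catalog query along these lines can spot the situation described above: tables that still have a TOAST relation attached although no variable-length column is left in their definition. This is a sketch against the 8.3 catalogs, not tested here.

```sql
-- Sketch: list tables that keep a TOAST relation although no
-- variable-length (attlen = -1) column remains in their definition.
SELECT c.relname                 AS table_name,
       t.relname                 AS toast_name,
       pg_relation_size(t.oid)   AS toast_bytes
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relkind = 'r'
  AND NOT EXISTS (
        SELECT 1 FROM pg_attribute a
        WHERE a.attrelid = c.oid
          AND a.attnum > 0
          AND NOT a.attisdropped
          AND a.attlen = -1);
```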
Re: [HACKERS] proposal for smaller indexes on index-ordered tables
Jeffrey Baker írta: The way I read it, the current btree index stores the index value and the TID of every tuple having that value. When you have a table with three columns, you index one of them and you get an index which is practically as large as the table itself. Supposing the table is generally or strictly ordered by the column to be indexed, it would be more compact if the index stored ranges of tuples. Instead of storing the TID of every tuple with that value, the index would store a first and last TID, between which all tuples have the value. Example: table with one million rows indexed on a column having one thousand distinct values. Table is in-order by the indexed column. The traditional index would contain a million TIDs, whereas a range index would contain only two thousand. The range index would be 500 times smaller, more likely to be cached, etc. Thoughts? -jwb

An example against your theory: one (not yet committed) transaction changes a tuple that was in the middle of a range before, and the tuple's indexed column changed. What would you do? You need to keep track of multiple index versions: 1. the range has to be split for the not-yet-committed modifying transaction, which might need to re-read the same table; 2. the old range has to be kept for reader transactions that still see the old data. Imagine you have thousands of UPDATEs in flight on different rows. Or you introduce a scheme where readers have to wait for writers' locks and updaters have to wait for other updaters on the same range, which is exactly what the MVCC implementation nicely avoids. Look at MaxDB once and you'll appreciate PostgreSQL. MaxDB stores table tuples in the order of their primary key, using a balanced btree for that. This means slower INSERTs and UPDATEs and decreased concurrency compared to PostgreSQL.
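The compaction Jeffrey describes can be sketched in plain SQL (the table and column names here are made up for illustration): one range entry per distinct value instead of one TID per row.

```sql
-- Toy illustration of the proposed compaction, assuming a table that
-- is physically ordered by val: 1,000,000 rows, 1,000 distinct values.
CREATE TABLE ordered_t AS
  SELECT g / 1000 AS val, g AS tid
  FROM generate_series(0, 999999) g;

-- A conventional index would carry one TID per row (1,000,000 entries);
-- the proposed range index stores only the first and last TID per value:
SELECT val, min(tid) AS first_tid, max(tid) AS last_tid
FROM ordered_t
GROUP BY val
ORDER BY val;        -- 1,000 rows instead of 1,000,000 entries
```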
Best regards, Zoltán Böszörményi
Re: [HACKERS] proposal for smaller indexes on index-ordered tables
Jeffrey Baker írta: On Tue, Jun 24, 2008 at 1:59 PM, Zoltan Boszormenyi [EMAIL PROTECTED] wrote:

Jeffrey Baker írta: The way I read it, the current btree index stores the index value and the TID of every tuple having that value. When you have a table with three columns, you index one of them and you get an index which is practically as large as the table itself. Supposing the table is generally or strictly ordered by the column to be indexed, it would be more compact if the index stored ranges of tuples. Instead of storing the TID of every tuple with that value, the index would store a first and last TID, between which all tuples have the value. Example: table with one million rows indexed on a column having one thousand distinct values. Table is in-order by the indexed column. The traditional index would contain a million TIDs, whereas a range index would contain only two thousand. The range index would be 500 times smaller, more likely to be cached, etc. Thoughts? -jwb

Example with your theory: One (not yet committed) transaction changes one tuple that was in the middle of a range before but the tuple's indexed column changed. What would you do?

Insert the new tuple at the end of the table and add another range to the index. Leave the old tuple in place and don't touch the original index range.

This is what I described below, although I only mentioned the index part: You need to keep track of multiple index versions: 1. the range has to be split for the not-yet-committed modifier transaction, it might need to re-read the same table. 2. the old range has to be kept for reader transactions that still see the old data

This is only true if you update the tuple in-place.

Why? It's the other way around: if you update in-place then the above is not needed, but then you have to serialize transactions, and there goes concurrency. Imagine you have thousands of UPDATEs in flight on different rows.
I'm quite aware of the problems of maintaining such a table and index, but the fact is that data warehouse type tables may never be updated after being created. The particular application I'm struggling with does a SELECT ... INTO ... ORDER BY to make an ordered table for querying every night. The problem is it takes longer, much longer, to create the index than to create the table, and in the end the index is as big as half the table anyway. So this type of index would only be useful for an essentially read-only table.

I agree.

Quite another proposal would be to somehow instruct the database that the table is strictly in-order by a column and allow a binary search access method. Then you don't need any index at all.

CLUSTER tablename USING indexname; It's useful for rarely-changing very large tables.

-jwb
Re: [HACKERS] rawhide report: 20080612 changes
Hm. Someone had his second finger chainsawed off? Where is 8.3.2? The FTP browser also shows 8.3.1 and 8.3.3, and 8.3.1 is the latest on the main page.

Rawhide írta: postgresql-8.3.3-1.fc10 --- * Wed Jun 11 18:00:00 2008 Tom Lane [EMAIL PROTECTED] 8.3.3-1 - Update to PostgreSQL 8.3.3. - Remove postgresql-prefer-ncurses.patch, no longer needed in recent Fedora releases because libtermcap is gone.
Re: [HACKERS] rawhide report: 20080612 changes
Thanks for the info.

Magnus Hagander írta: 8.3.2 was pulled back because of an urgent bugfix, and re-released as 8.3.3 since it had already hit the mirrors. 8.3.3 has not been officially released yet, but it will be out soon. //Magnus

Zoltan Boszormenyi wrote: Hm. Someone had his second finger chainsawed? Where is 8.3.2? The ftp browser also shows 8.3.1 and 8.3.3, and 8.3.1 is the latest on the main page.

Rawhide írta: postgresql-8.3.3-1.fc10 --- * Wed Jun 11 18:00:00 2008 Tom Lane [EMAIL PROTECTED] 8.3.3-1 - Update to PostgreSQL 8.3.3. - Remove postgresql-prefer-ncurses.patch, no longer needed in recent Fedora releases because libtermcap is gone.
Re: [HACKERS] WITH RECURSIVE patch V0.1
Gregory Stark írta: This is indeed really cool. I'm sorry I haven't gotten to doing what I promised in this area but I'm glad it's happening anyways.

Zoltan Boszormenyi [EMAIL PROTECTED] writes: Can we get the rows in tree order, please? ... After all, I didn't specify any ORDER BY clauses in the base, recursive or the final queries.

The standard has a clause to specify depth-first order. However doing a depth-first traversal would necessitate quite a different looking plan and it's far less obvious (to me anyways) how to do it.

It would be even cooler to have that implemented as well.

Also, it seems there is no infinite recursion detection: # with recursive x(level, parent, child) as ( select 1::integer, * from test_connect_by where parent is null union all select x.level + 1, base.* from test_connect_by as base, x where base.child = x.child ) select * from x; ... it waits and waits and waits ...

Well, psql might wait and wait but it's actually receiving rows. A cleverer client should be able to deal with infinite streams of records.

I think it's the other way around: the server should not emit an infinite number of records.

I think DB2 does produce a warning if there is no clause it can determine will bound the results. But that's not actually reliable. It's quite possible to have clauses which will limit the output but not in a way the database can determine. Consider for example a tree-traversal for a binary tree stored in a recursive table reference. The DBA might know that the data contains no loops but the database doesn't.

Well, a maintenance resjunk column could be used, like the branch column in tablefunc::connectby().
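For what it's worth, the runaway query above can be bounded by hand. Its join condition (base.child = x.child) re-matches the same rows at every level, which is why it never terminates; joining the child's parent to the previous level and adding an explicit depth guard (both shown in this sketch, assuming the same test_connect_by table) makes it finish:

```sql
-- Sketch against the same test_connect_by table: descend parent->child
-- and cap the depth explicitly so the recursion must terminate.
WITH RECURSIVE x(level, parent, child) AS (
    SELECT 1::integer, * FROM test_connect_by WHERE parent IS NULL
  UNION ALL
    SELECT x.level + 1, base.*
    FROM test_connect_by AS base, x
    WHERE base.parent = x.child    -- follow edges instead of re-matching rows
      AND x.level < 10             -- hard depth limit as a safety net
)
SELECT * FROM x;
```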
Re: [HACKERS] WITH RECURSIVE patch V0.1
Martijn van Oosterhout írta: On Mon, May 19, 2008 at 08:19:17AM +0200, Zoltan Boszormenyi wrote: The standard has a clause to specify depth-first order. However doing a depth-first traversal would necessitate quite a different looking plan and it's far less obvious (to me anyways) how to do it. That would be even cooler to have it implemented as well.

From an implementation point of view, the only difference between breadth-first and depth-first is that your tuplestore needs to be LIFO instead of FIFO. However, just looking at the plan I don't know whether it could support that kind of usage. At the very least I don't think the standard tuplestore code can handle it.

Well, psql might wait and wait but it's actually receiving rows. A cleverer client should be able to deal with infinite streams of records. I think it's the other way around. The server should not emit infinite number of records.

The server won't, the universe will end first.

The universe is alive and well, thank you. :-) But you are right that the server won't emit an infinite number of records. Since the implementation uses a tuplestore instead of producing the tuple slots on the fly, the server goes OOM before the psql client does; I watched them both in 'top'. It just takes a bit of time.

This is a nice example of the halting problem: http://en.wikipedia.org/wiki/Halting_problem Which was proved unsolvable a long time ago.

Hmpf, yes, I have forgotten too much about Turing machines since university. :-(

Have a nice day,
Re: [HACKERS] [PATCHES] WITH RECURSIVE patch V0.1
Yoshiyuki Asaba írta: Hi, From: Zoltan Boszormenyi [EMAIL PROTECTED] Subject: Re: [PATCHES] WITH RECURSIVE patch V0.1 Date: Mon, 19 May 2008 08:19:17 +0200

Also, it seems there are no infinite recursion detection: # with recursive x(level, parent, child) as ( select 1::integer, * from test_connect_by where parent is null union all select x.level + 1, base.* from test_connect_by as base, x where base.child = x.child ) select * from x; ... it waits and waits and waits ...

Well, psql might wait and wait but it's actually receiving rows. A cleverer client should be able to deal with infinite streams of records. I think it's the other way around. The server should not emit infinite number of records.

How about adding new GUC parameter max_recursive_call?

Yes, why not? MSSQL has a similar MAXRECURSION hint for WITH RECURSIVE queries according to their docs. http://msdn.microsoft.com/en-us/library/ms186243.aspx

Regards, -- Yoshiyuki Asaba [EMAIL PROTECTED]
Re: [HACKERS] WITH RECURSIVE patch V0.1
Martijn van Oosterhout írta: On Mon, May 19, 2008 at 08:19:17AM +0200, Zoltan Boszormenyi wrote: The standard has a clause to specify depth-first order. However doing a depth-first traversal would necessitate quite a different looking plan and it's far less obvious (to me anyways) how to do it. That would be even cooler to have it implemented as well.

From an implementation point of view, the only difference between breadth-first and depth-first is that your tuplestore needs to be LIFO instead of FIFO.

Are you sure? I think a LIFO tuplestore would simply return a reversed breadth-first order. Depth-first means: for every new record, descend into another recursion first, then continue with the next record to the right.

However, just looking at the plan I don't know whether it could support that kind of usage. At the very least I don't think the standard tuplestore code can handle it.
Re: [HACKERS] WITH RECURSIVE patch V0.1
Martijn van Oosterhout írta: On Mon, May 19, 2008 at 11:56:17AM +0200, Zoltan Boszormenyi wrote: From an implementation point of view, the only difference between breadth-first and depth-first is that your tuplestore needs to be LIFO instead of FIFO. Are you sure? I think a LIFO tuplestore would simply return reversed breadth-first order. Depth-first means for every new record descend into another recursion first then continue with the next record on the right.

Say your tree looks like: Root-A, D A-B,C D-E,F

LIFO pushes A and D. It then pops A and pushes B and C. B and C have no children and are returned. Then D is popped and E and F pushed. So the returned order is: A,B,C,D,E,F. You could also do B,C,A,E,F,D if you wanted.

FIFO pushes A and D. It then pops A and puts B and C at *the end*. It then pops D and pushes E and F at the end. So you get the order A,D,B,C,E,F

Hope this helps,

Thanks, I didn't consider popping elements off while processing. However, if the toplevel query returns tuples in A, D order, you need a positioned insert into the tuplestore, because the LIFO would pop D first. Say, a treestore would work this way:

1. setup: treestore is empty, storage_position := 0
2. treestore_puttupleslot() adds a slot at the current position, storage_position++
3. treestore_gettupleslot() removes a slot from the beginning, storage_position := 0

This works easily with in-memory lists, but it's not obvious to me how it would work with disk-backed temporary storage inside PG.
Re: [HACKERS] WITH RECURSIVE patch V0.1
Gregory Stark írta: Martijn van Oosterhout [EMAIL PROTECTED] writes: From an implementation point of view, the only difference between breadth-first and depth-first is that your tuplestore needs to be LIFO instead of FIFO.

I think it's not so simple. How do you reconcile that concept with join plans like merge join or hash join which expect to be able to process the records in a specific order? It sounds like you might have to keep around a stack of started executor nodes or something, but hopefully we can avoid anything like that because, well, ick.

If I understand the code right, the recursion from level N to level N+1 goes like this: collect all records from level N and JOIN them with the recursive query. This way we get all level-1 records from the base query, then all records at the second level, etc. This is how it gets breadth-first ordering.

Depth-first ordering could go like this: get only one tuple from the current level, then go into recursion. Repeat until there are no more records in the current level. The only difference would be more recursion steps: instead of one per level, there would be N per level if there are N tuples in the current level. Definitely slower than the current implementation, but comparable with the tablefunc.c connectby() code.
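Until the executor learns a depth-first mode, the standard's depth-first order can also be emulated at the SQL level by carrying the path from the root as an array and sorting on it. A sketch, assuming the same test_connect_by table with parent/child columns:

```sql
-- Sketch: accumulate the path from the root in an array column and
-- sort on it; the result comes out in depth-first order even though
-- the recursion itself proceeds breadth-first.
WITH RECURSIVE x(child, path) AS (
    SELECT child, ARRAY[child]
    FROM test_connect_by WHERE parent IS NULL
  UNION ALL
    SELECT base.child, x.path || base.child
    FROM test_connect_by AS base, x
    WHERE base.parent = x.child
)
SELECT child FROM x ORDER BY path;
```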
Re: [HACKERS] Auto-updated fields
Martijn van Oosterhout írta: On Wed, May 07, 2008 at 03:04:49PM -0700, David Fetter wrote: 1. Create a generic (possibly overloaded) trigger function, bundled with PostgreSQL, which sets a field to some value. For example, a timestamptz version might set the field to now().

Doesn't the SQL standard GENERATED BY functionality work for this? Or won't that handle updates?

You mean GENERATED ALWAYS AS (expression)? Yes, they should be updated on every UPDATE as the expression may include other fields in the same row. A GENERATED column implemented as a stored column would work for this, but a virtual column would not: a virtual column would return different values for now() in every SELECT. However, we could argue for use cases of a virtual column and implement it similarly to VIEWs, i.e. an ON SELECT rule can expand the original expression of the column definition. I suggest using these syntaxes if we decide to implement them:

GENERATED ALWAYS AS (expression) -- SQL standard stored GENERATED column
GENERATED VIRTUAL AS (expression) -- virtual column, obviously

Have a nice day, Best regards, Zoltán Böszörményi
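The two proposed variants side by side, as a syntax sketch only (neither is implemented in PostgreSQL 8.3; table and column names are illustrative):

```sql
-- Proposed syntax sketch, not implemented:
CREATE TABLE t (
    price   numeric,
    qty     integer,
    total   numeric     GENERATED ALWAYS  AS (price * qty), -- stored, recomputed on every UPDATE
    seen_at timestamptz GENERATED VIRTUAL AS (now())        -- expanded at SELECT time, like a view
);
```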
Re: [HACKERS] Auto-updated fields
Zoltan Boszormenyi írta: Martijn van Oosterhout írta: On Wed, May 07, 2008 at 03:04:49PM -0700, David Fetter wrote: 1. Create a generic (possibly overloaded) trigger function, bundled with PostgreSQL, which sets a field to some value. For example, a timestamptz version might set the field to now().

Doesn't the SQL standard GENERATED BY functionality work for this? Or won't that handle updates?

You mean GENERATED ALWAYS AS (expression)? Yes, they should be updated on every UPDATE as the expression may include other fields in the same row. A GENERATED column implemented as a stored column would work for this but a virtual column would not. A virtual column would return different values for now() in every SELECT. However we can argue for use cases of a virtual column and implement it similarly as VIEWs, i.e. an ON SELECT rule can expand the original expression of the column definition. I suggest using these syntaxes if we decide to implement them:

GENERATED ALWAYS AS (expression) -- SQL standard stored GENERATED column
GENERATED VIRTUAL AS (expression) -- virtual column, obviously

Or, as found in Oracle 11g: GENERATED ALWAYS AS (expr) VIRTUAL

Have a nice day, Best regards, Zoltán Böszörményi
[HACKERS] [RFC] Localized literals
Hi, we have a customer who shot themselves in the foot by using table names with German accented characters in them. The client application on the popular OS is using a single-byte encoding (LATIN9), and their dump of the original database uses the same, but there is no SET client_encoding = ... line anywhere in it. We didn't know these previous conditions and had set up the default database cluster that was created during installation on Debian 4.0. Obviously we have to re-initialize the cluster with the original locale so the table names come out right after reloading the dump.

But the question popped up whether PostgreSQL could be extended to allow localized literals and apply encoding conversion to them the same way as to string data. NameData could be replaced with regular TEXT and get the same conversion everywhere. This way the relation and field name length limits would also be eliminated. The conversion could be controlled by a compile-time option and/or a GUC variable. Tell me if I am crazy.

Best regards, Zoltán Böszörményi
Re: [HACKERS] [RFC] Localized literals
Martijn van Oosterhout írta: On Wed, Apr 23, 2008 at 10:02:37AM +0200, Zoltan Boszormenyi wrote: But the question popped up whether PostgreSQL can be extended to allow localized literals and apply encoding conversion the same way as on string data. NAMEDATA can be replaced with regular TEXT and have the same conversion everywhere. This way the relation and field name limits are also eliminated. The conversion could be controlled by a compile-time option and/or a GUC variable. Tell me if I am crazy.

It does convert the table names also, since the encoding translation is applied to the whole query string, not just normal strings.

Thanks for the enlightenment, I didn't know or check that.

A simple SET CLIENT_ENCODING='latin9' at the beginning of your dump should have worked.

I thought so, but the dump was off a 7.4 server with its pg_dump and there was no SET client_encoding = ... in the script. They insisted on continuing with LATIN9, but with the default UTF-8 server encoding createdb -E LATIN9 failed.

As for the other point, the reason NAMEDATA is fixed is because these records are mapped onto in-memory structures in the backend. By changing it to a variable length type all structure accesses would become much more expensive.

Yes, I guessed it would be slower. However, NAMEDATALEN is a byte limit, so it doesn't translate to varchar(NAMEDATALEN) characters when I use accented characters in literals with UTF-8 encoding. :-(

But none of this has anything to do with encodings.

Have a nice day,

Thanks, I have it. :-)

Best regards, Zoltán Böszörményi
Re: [HACKERS] TRUNCATE TABLE with IDENTITY
Hi,

Zoltan Boszormenyi írta: Updated patch implements TRUNCATE ... RESTART IDENTITY which restarts all owned sequences for the truncated table(s). Regression tests updated, documentation added. pg_dump was also extended to output the original[1] START value for creating SEQUENCEs. [1] For 8.3 and below I could only guesstimate it as MINVALUE for ascending and MAXVALUE for descending sequences. Best regards, Zoltán Böszörményi

I just saw this on the CommitFest:May page: alvherre says: "I'm not sure if this is the same patch in the previous entry, or a different feature." I wanted to clarify: the second patch contains two features. 1. stored start value for sequences, ALTER SEQUENCE ... RESTART; 2. (builds on 1.) TRUNCATE ... RESTART IDENTITY;

Best regards, Zoltán Böszörményi
Re: [HACKERS] TRUNCATE TABLE with IDENTITY
Zoltan Boszormenyi írta: Zoltan Boszormenyi írta: Decibel! írta: On Apr 3, 2008, at 12:52 AM, Zoltan Boszormenyi wrote: Where is the info in the sequence to provide restarting with the _original_ start value? There isn't any. If you want the sequence to start at some magic value, adjust the minimum value. There's the START WITH option for IDENTITY columns and this below is paragraph 8 under General rules of 14.10 truncate table statement in 6WD2_02_Foundation_2007-12.pdf (page 902): 8) If RESTART IDENTITY is specified and the table descriptor of T includes a column descriptor IDCD of an identity column, then: a) Let CN be the column name included in IDCD and let SV be the start value included in IDCD. b) The following alter table statement is effectively executed without further Access Rule checking: ALTER TABLE TN ALTER COLUMN CN RESTART WITH SV This says that the original start value is used, not the minimum value. IDENTITY has the same options as CREATE SEQUENCE. In fact the identity column specification links to 11.63 sequence generator definition when it comes to IDENTITY sequence options. And surprise, surprise, 11.64 alter sequence generator statement now defines ALTER SEQUENCE sn RESTART [WITH newvalue] where omitting the WITH newval part also uses the original start value. Best regards, Zoltán Böszörményi Attached patch implements the extension found in the current SQL200n draft, implementing stored start value and supporting ALTER SEQUENCE seq RESTART; Some error check are also added to prohibit CREATE SEQUENCE ... RESTART ... and ALTER SEQUENCE ... START ... Best regards, Zoltán Böszörményi Updated patch implements TRUNCATE ... RESTART IDENTITY which restarts all owned sequences for the truncated table(s). Regression tests updated, documentation added. pg_dump was also extended to output original[1] START value for creating SEQUENCEs. [1] For 8.3 and below I could only guesstimate it as MINVALUE for ascending and MAXVALUE for descending sequences. 
Best regards, Zoltán Böszörményi

[Attachment: sql2008-compliant-seq-v2.patch.gz]
Re: [HACKERS] TRUNCATE TABLE with IDENTITY
Zoltan Boszormenyi írta: Decibel! írta: On Apr 3, 2008, at 12:52 AM, Zoltan Boszormenyi wrote: Where is the info in the sequence to provide restarting with the _original_ start value?

There isn't any. If you want the sequence to start at some magic value, adjust the minimum value.

There's the START WITH option for IDENTITY columns and this below is paragraph 8 under General rules of 14.10 truncate table statement in 6WD2_02_Foundation_2007-12.pdf (page 902): 8) If RESTART IDENTITY is specified and the table descriptor of T includes a column descriptor IDCD of an identity column, then: a) Let CN be the column name included in IDCD and let SV be the start value included in IDCD. b) The following alter table statement is effectively executed without further Access Rule checking: ALTER TABLE TN ALTER COLUMN CN RESTART WITH SV

This says that the original start value is used, not the minimum value. IDENTITY has the same options as CREATE SEQUENCE. In fact the identity column specification links to 11.63 sequence generator definition when it comes to IDENTITY sequence options. And surprise, surprise, 11.64 alter sequence generator statement now defines ALTER SEQUENCE sn RESTART [WITH newvalue] where omitting the WITH newval part also uses the original start value.

Best regards, Zoltán Böszörményi

Attached patch implements the extension found in the current SQL200n draft, implementing a stored start value and supporting ALTER SEQUENCE seq RESTART; Some error checks are also added to prohibit CREATE SEQUENCE ... RESTART ... and ALTER SEQUENCE ... START ...
Best regards, Zoltán Böszörményi

[Attachment: patch against src/backend/commands/sequence.c adding a stored start_value column (SEQ_COL_STARTVAL) to sequence relations and an old-state Form_pg_sequence argument to init_params(), so the original START value survives for ALTER SEQUENCE ... RESTART; the full diff is not reproduced here.]
Re: [HACKERS] TRUNCATE TABLE with IDENTITY
Decibel! írta: On Apr 3, 2008, at 12:52 AM, Zoltan Boszormenyi wrote: Where is the info in the sequence to provide restarting with the _original_ start value?

There isn't any. If you want the sequence to start at some magic value, adjust the minimum value.

There's the START WITH option for IDENTITY columns and this below is paragraph 8 under General rules of 14.10 truncate table statement in 6WD2_02_Foundation_2007-12.pdf (page 902): 8) If RESTART IDENTITY is specified and the table descriptor of T includes a column descriptor IDCD of an identity column, then: a) Let CN be the column name included in IDCD and let SV be the start value included in IDCD. b) The following alter table statement is effectively executed without further Access Rule checking: ALTER TABLE TN ALTER COLUMN CN RESTART WITH SV

This says that the original start value is used, not the minimum value. IDENTITY has the same options as CREATE SEQUENCE. In fact the identity column specification links to 11.63 sequence generator definition when it comes to IDENTITY sequence options. And surprise, surprise, 11.64 alter sequence generator statement now defines ALTER SEQUENCE sn RESTART [WITH newvalue] where omitting the WITH newval part also uses the original start value.

Best regards, Zoltán Böszörményi
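With a stored start value, the behaviour the draft standard describes would look like this. This is a sketch of the intended semantics (not of stock 8.3 behaviour, where the start value is not retained):

```sql
CREATE SEQUENCE seq1 START WITH 327;
SELECT nextval('seq1');       -- 327
SELECT nextval('seq1');       -- 328
ALTER SEQUENCE seq1 RESTART;  -- no WITH: return to the stored START value
SELECT nextval('seq1');       -- 327 again, not min_value (1)
```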
Re: [HACKERS] TRUNCATE TABLE with IDENTITY
Decibel! wrote:
> On Mar 25, 2008, at 11:40 AM, Zoltan Boszormenyi wrote:
>> All of them? PostgreSQL allows multiple SERIALs to be present; the standard allows only one IDENTITY column in a table.
>>
>> And what about this case below?
>>
>> CREATE TABLE t1 (id1 serial, ...);
>> ALTER SEQUENCE seq_t1_id1 RESTART WITH 5432 CYCLE;
>>
>> or the equivalent
>>
>> CREATE SEQUENCE seq_t1_id1 START WITH 5432 CYCLE;
>> CREATE TABLE t1 (id1 serial, ...);
>> ALTER SEQUENCE seq_t1_id1 OWNED BY t1.id1;
>>
>> PostgreSQL doesn't keep the START WITH information. But it would need to in order to perform a restart on the sequence; using the minval in this case wouldn't be correct.
> I think you misunderstand what ALTER SEQUENCE RESTART does; it only changes the current value of the sequence.

I didn't misunderstand it, I know that. I quoted both because (currently) CREATE SEQUENCE ... START WITH does the same.

zozo=# create sequence seq1 start with 327;
CREATE SEQUENCE
zozo=# select * from seq1;
 sequence_name | last_value | increment_by |      max_value      | min_value | cache_value | log_cnt | is_cycled | is_called
---------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
 seq1          |        327 |            1 | 9223372036854775807 |         1 |           1 |       1 | f         | f
(1 row)

Note the difference between min_value and last_value. Using the standard syntax of

CREATE TABLE (
    id integer IDENTITY GENERATED ALWAYS AS (START WITH 327),
    ...
);

and assuming you use the existing sequence infrastructure, there is a problem with TRUNCATE ... RESTART IDENTITY: where is the info in the sequence to provide restarting with the _original_ start value?

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] TRUNCATE TABLE with IDENTITY
Simon Riggs wrote:
> On Tue, 2008-03-25 at 09:08 -0700, Steve Crawford wrote:
>> Simon Riggs wrote:
>>> RESTART IDENTITY will reset the SERIAL sequences back to the original start value.
>> Assuming this feature were to be added: in cases where the same sequence has been used across multiple tables, what will be the appropriate response when a user attempts to TRUNCATE one of those tables with RESTART IDENTITY?
> Well, I'm suggesting it as a TODO item, based on the standard. It would be for whoever took this up to unravel that. Since that's a weak answer, I'd say it should only reset sequences that have been placed there automatically through the use of SERIAL or BIGSERIAL datatypes.

All of them? PostgreSQL allows multiple SERIALs to be present; the standard allows only one IDENTITY column in a table.

And what about this case below?

CREATE TABLE t1 (id1 serial, ...);
ALTER SEQUENCE seq_t1_id1 RESTART WITH 5432 CYCLE;

or the equivalent

CREATE SEQUENCE seq_t1_id1 START WITH 5432 CYCLE;
CREATE TABLE t1 (id1 serial, ...);
ALTER SEQUENCE seq_t1_id1 OWNED BY t1.id1;

PostgreSQL doesn't keep the START WITH information. But it would need to in order to perform a restart on the sequence; using the minval in this case wouldn't be correct.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] TRUNCATE TABLE with IDENTITY
Zoltan Boszormenyi wrote:
> Simon Riggs wrote:
>> On Tue, 2008-03-25 at 09:08 -0700, Steve Crawford wrote:
>>> Simon Riggs wrote:
>>>> RESTART IDENTITY will reset the SERIAL sequences back to the original start value.
>>> Assuming this feature were to be added: in cases where the same sequence has been used across multiple tables, what will be the appropriate response when a user attempts to TRUNCATE one of those tables with RESTART IDENTITY?
>> Well, I'm suggesting it as a TODO item, based on the standard. It would be for whoever took this up to unravel that. Since that's a weak answer, I'd say it should only reset sequences that have been placed there automatically through the use of SERIAL or BIGSERIAL datatypes.
>
> All of them? PostgreSQL allows multiple SERIALs to be present; the standard allows only one IDENTITY column in a table.
>
> And what about this case below?
>
> CREATE TABLE t1 (id1 serial, ...);
> ALTER SEQUENCE seq_t1_id1 RESTART WITH 5432 CYCLE;
>
> or the equivalent
>
> CREATE SEQUENCE seq_t1_id1 START WITH 5432 CYCLE;
> CREATE TABLE t1 (id1 serial, ...);

of course, that second example should read

CREATE TABLE t1 (id1 integer, ...);

> ALTER SEQUENCE seq_t1_id1 OWNED BY t1.id1;
>
> PostgreSQL doesn't keep the START WITH information. But it would need to in order to perform a restart on the sequence; using the minval in this case wouldn't be correct.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
[HACKERS] [Fwd: Re: [PATCHES] 64-bit CommandIds]
Hi,

what's your opinion on this? I saw a response only from Alvaro on the -patches list.

Thanks in advance,
Zoltán Böszörményi

-------- Original Message --------
Subject: Re: [PATCHES] 64-bit CommandIds
Date: Tue, 04 Mar 2008 21:52:25 +0100
From: Zoltan Boszormenyi [EMAIL PROTECTED]
To: pgsql-patches [EMAIL PROTECTED]
CC: Alvaro Herrera [EMAIL PROTECTED], Hans-Juergen Schoenig [EMAIL PROTECTED]
References: [EMAIL PROTECTED] [EMAIL PROTECTED]

Alvaro Herrera wrote:
> Zoltan Boszormenyi wrote:
>> attached is our patch against HEAD which enables extending CommandIds to 64 bits. This is for enabling long transactions that really do that much non-read-only work in one transaction.
> I think you should add a pg_control field and corresponding check, to avoid a 64bit-Cid postmaster starting on a 32bit-Cid data area and vice versa.

I added the check, but I needed to add it BEFORE the check for toast_max_chunk_size, otherwise it complained about that more cryptic problem instead. I think it's cleaner to report this failure directly than to leave the user wondering why toast_max_chunk_size != TOAST_MAX_CHUNK_SIZE.
Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

diff -dcrpN pgsql.orig/configure pgsql-cid64/configure
*** pgsql.orig/configure	2008-03-02 13:44:42.0 +0100
--- pgsql-cid64/configure	2008-03-04 16:53:46.0 +0100
*** if test -n "$ac_init_help"; then
*** 1349,1354 ****
--- 1349,1355 ----
  Optional Features:
    --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no)
    --enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
+   --enable-huge-commandid enable 64-bit CommandId support
    --enable-integer-datetimes
                            enable 64-bit integer date/time support
    --enable-nls[=LANGUAGES] enable Native Language Support
    --disable-shared        do not build shared libraries
*** fi
*** 2175,2180 ****
--- 2176,2219 ----
  #
+ # 64-bit CommandId
+ #
+ echo "$as_me:$LINENO: checking whether to build with 64-bit CommandId support" >&5
+ echo $ECHO_N "checking whether to build with 64-bit CommandId support... $ECHO_C" >&6
+
+ pgac_args="$pgac_args enable_huge_commandid"
+
+ # Check whether --enable-huge-commandid or --disable-huge-commandid was given.
+ if test "${enable_huge_commandid+set}" = set; then
+   enableval="$enable_huge_commandid"
+
+   case $enableval in
+     yes)
+
+ cat >>confdefs.h <<\_ACEOF
+ #define USE_64BIT_COMMANDID 1
+ _ACEOF
+
+       ;;
+     no)
+       :
+       ;;
+     *)
+       { { echo "$as_me:$LINENO: error: no argument expected for --enable-huge-commandid option" >&5
+ echo "$as_me: error: no argument expected for --enable-huge-commandid option" >&2;}
+    { (exit 1); exit 1; }; }
+       ;;
+   esac
+
+ else
+   enable_huge_commandid=no
+
+ fi;
+
+ echo "$as_me:$LINENO: result: $enable_huge_commandid" >&5
+ echo "${ECHO_T}$enable_huge_commandid" >&6
+
+ #
  # 64-bit integer date/time storage (--enable-integer-datetimes)
  #
  { echo "$as_me:$LINENO: checking whether to build with 64-bit integer date/time support" >&5
diff -dcrpN pgsql.orig/configure.in pgsql-cid64/configure.in
*** pgsql.orig/configure.in	2008-03-02 13:44:43.0 +0100
--- pgsql-cid64/configure.in	2008-03-04 16:53:46.0 +0100
*** PGAC_ARG_REQ(with, libs, [  --with-
*** 128,133 ****
--- 128,142 ----
  #
+ # 64-bit CommandId
+ #
+ AC_MSG_CHECKING([whether to build with 64-bit CommandId support])
+ PGAC_ARG_BOOL(enable, huge-commandid, no, [  --enable-huge-commandid enable 64-bit CommandId support],
+              [AC_DEFINE([USE_64BIT_COMMANDID], 1,
+                         [Define to 1 if you want 64-bit CommandId support. (--enable-huge-commandid)])])
+ AC_MSG_RESULT([$enable_huge_commandid])
+
+ #
  # 64-bit integer date/time storage (--enable-integer-datetimes)
  #
  AC_MSG_CHECKING([whether to build with 64-bit integer date/time support])
diff -dcrpN pgsql.orig/doc/src/sgml/installation.sgml pgsql-cid64/doc/src/sgml/installation.sgml
*** pgsql.orig/doc/src/sgml/installation.sgml	2008-02-18 13:49:58.0 +0100
--- pgsql-cid64/doc/src/sgml/installation.sgml	2008-03-04 17:16:14.0 +0100
*** su - postgres
*** 1011,1016 ****
--- 1011,1027 ----
       </varlistentry>

       <varlistentry>
+       <term><option>--enable-huge-commandid</option></term>
+       <listitem>
+        <para>
+         Use 64-bit CommandIds if you are planning to run transactions
+         consisting of more than 4 billion commands. This is off by default
+         to save disk space.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term><option>--enable-integer-datetimes</option></term>
        <listitem>
         <para>
diff -dcrpN pgsql.orig/src/backend/access/transam/xact.c pgsql-cid64/src/backend/access/transam/xact.c
*** pgsql.orig/src
[HACKERS] IDENTITY/GENERATED patch
Hi,

as the wishlist for PostgreSQL 8.4 still has my IDENTITY/GENERATED patches, I thought I would refresh them. Before actually doing so, though, I wanted to ask for opinions on implementation and ideas. Here are the general ideas implemented by my patch:

1. IDENTITY columns

IDENTITY columns are similar to SERIALs as featured by current PostgreSQL, with the differences below. They can be of almost any basic type, i.e. types other than just INTEGER or BIGINT. There are two variants:

GENERATED BY DEFAULT [ AS ( sequence_options ) ]
GENERATED ALWAYS [ AS ( sequence_options ) ]

The first behaves more or less like SERIAL in PostgreSQL currently: upon INSERT and UPDATE such a field can be assigned a value explicitly, it can be updated, etc. So this can be viewed as nothing more than a SERIAL pseudo-type column, followed by the statements below:

ALTER TABLE ... ALTER COLUMN ... TYPE ...;
ALTER SEQUENCE seq_name sequence_options;

So, the above form is mostly syntactic sugar over already existing features. In fact, this was the reason its acceptance was shot down. The standard describes some behavioural differences that may make it worth distinguishing from plain SERIALs.

IDENTITY GENERATED ALWAYS, however, carries more constraints. It is more like the autoincrementing fields in other DBMSs, where upon INSERT the value 0 usually triggers generation of the next value. With the standard behaviour, any supplied value is ignored and the sequence's next value is generated instead. UPDATE can only assign the DEFAULT keyword, not an explicit value.

It seems the last draft for SQL:2008 largely clarified details of IDENTITY columns that were either conflicting or unclear in SQL:2003.

2. GENERATED columns

A GENERATED column is an automatic way of computing expressions over columns in the same row. This is a feature of DB2. Currently this can be implemented as a BEFORE trigger in PostgreSQL, but that requires some familiarity with writing PL/pgSQL functions. The syntax of GENERATED columns allows plain expressions, so it's an easy-to-use feature.

GENERATED columns are real, stored columns, not on-the-fly computed virtual columns. This means that the value computation cost is paid in the INSERT and UPDATE statements instead of the SELECTs. This provides a nice speedup, which I tested but cannot currently find the mail in the archive. Despite this demonstrable speedup, exactly this non-virtual nature was the reason Tom Lane didn't like it.

There is a problem with GENERATED columns, however, that was pointed out regarding BEFORE triggers. For speed, the GENERATED columns' values could be computed after all BEFORE triggers have been executed. However, this means that functions running as BEFORE triggers would see the old values of the generated columns. This detail was mentioned in some of the previous drafts but not in the latest SQL:2008 draft. That would mean such columns MUST be recomputed before running every BEFORE trigger and after the last one, too. The performance hit in this case is large. A GUC variable could be introduced to make both the fast and the correct behaviour possible.

Based on my last patch: http://archives.postgresql.org/pgsql-patches/2007-05/msg00076.php — what problems can you see in it that need fixing? I know it extends the standard (e.g. DROP IDENTITY, etc.). But would a strictly standard-compliant IDENTITY/GENERATED get accepted into PostgreSQL? Which extensions in the patch are worth keeping?

There are two possible problems that I know of. I need to fix the OVERRIDING USER VALUE case, as it isn't a no-op. This convinced me that PostgreSQL's SERIAL is not equivalent to the IDENTITY GENERATED BY DEFAULT column, and that the requirement that only one IDENTITY column may exist in a table is strongly assumed by the OVERRIDING clauses. And I just discovered an extension of TRUNCATE TABLE in SQL:2008 that makes it possible to restart the sequence behind the identity column upon TRUNCATE.

What else?
-- -- Zoltán Böszörményi Cybertec Schönig Schönig GmbH http://www.postgresql.at/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Error building 32 bit on 64 bit linux system
Doug Knight wrote:
> All, I am trying to build 8.2.5, forcing a 32-bit build on a 64-bit system. I have set CFLAGS=-m32, and I run configure and make/make install as follows:
>
> setarch i386 ./configure
> setarch i386 make
> setarch i386 make install
>
> However, I get the following error (using timezone for example):
>
> $ make
> gcc -m32 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -I../../src/include -D_GNU_SOURCE -c -o localtime.o localtime.c
> gcc -m32 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -I../../src/include -D_GNU_SOURCE -c -o strftime.o strftime.c
> gcc -m32 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -I../../src/include -D_GNU_SOURCE -c -o pgtz.o pgtz.c
> /usr/bin/ld -r -o SUBSYS.o localtime.o strftime.o pgtz.o
> /usr/bin/ld: Relocatable linking with relocations from format elf32-i386 (localtime.o) to format elf64-x86-64 (SUBSYS.o) is not supported
> make: *** [SUBSYS.o] Error 1
>
> Funny thing is, there is no SUBSYS.o in my current directory. If I build from the top, I see this same error in each directory/makefile where a SUBSYS.o is linked. If I search my build tree after a top-down build, I do not see any SUBSYS.o files at all. Where is this SUBSYS.o getting created, and why isn't it being created as a 32-bit file instead of 64-bit?

This problem is solved by using

export CC="gcc -m32"
export LD="ld -melf_i386"

before doing setarch i386 (./configure ... ; make)

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] IDENTITY/GENERATED patch
Tom Lane wrote:
> Zoltan Boszormenyi [EMAIL PROTECTED] writes:
>> as the wishlist for PostgreSQL 8.4 still has my IDENTITY/GENERATED patches, I thought I refresh it. Before actually doing it though, I wanted to ask for opinions on implementation and ideas.
> IIRC the end conclusion on that patch was that it was awfully messy, didn't really buy much in new functionality, and was trying to track a portion of the spec that the SQL committee had apparently not gotten right yet.
>> It seems the last draft for SQL:2008 largely clarified details for IDENTITY columns that were either conflicting or unclear in SQL:2003.
> Implementing stuff that's still changing in a *draft* standard seems to me like a good way to get burnt.

From http://www.wiscorp.com/SQLStandards.html :

"SQL:2008 Draft International Standard Documents http://www.wiscorp.com/sql200n.zip (Updated 12/14/2007) - *Updated!!!* This points to the documents which will likely be the documents that represent the SQL 2008 Standard. These documents are out for International Standard ballot at this time. The vote is an Up/Down vote. No changes allowed."

I think it's time to start looking at it.

> I think this needs to stay on the back burner, at least until there's a published final spec that doesn't seem as broken as 2003 was. I was sufficiently disillusioned with the whole thing at the end of the previous review cycle http://archives.postgresql.org/pgsql-hackers/2007-05/msg00441.php that I'm not really eager to pour more time down the same hole.
>
> regards, tom lane

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] convert int to bytea
Hi,

please don't top-post to someone who didn't use this convention in answering you. It's impolite. I edited the mail a bit to restore sanity.

> On Nov 29, 2007 9:00 PM, Douglas McNaught [EMAIL PROTECTED] wrote:
>> On 11/29/07, [EMAIL PROTECTED] wrote:
>>> On Thu, 29 Nov 2007, Gregory Stark wrote:
>>>> What do you want the resulting bytea to look like?
>>> example: id = 9, bytea = '\000\000\011' IIRC
>> What do you expect to happen when server and client are differently-endian?
>> -Doug

Usama Dar wrote:
> Does it matter if you have written an explicit cast for int to bytea?

You don't know what endianness is, do you? Say you have a number: 0x12345678. This is stored differently depending on the endianness.

Big-endian (like Sparc, Motorola, etc.):   0x12 0x34 0x56 0x78
Little-endian (Intel-compatibles, etc.):   0x78 0x56 0x34 0x12

So, how do you want your number to come out as a byte array? Since a bytea is a sequence of bytes as stored in memory, an int-to-bytea conversion can mean different things. It's your homework to look up what network order is. :-) But it would give you a consistent answer no matter what CPU your server uses.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
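The byte orders described above are easy to verify. A minimal sketch using Python's struct module packs the example value 0x12345678 in both conventions and in "network order" (which is simply big-endian):

```python
import struct

# The same 32-bit integer, packed three ways.
n = 0x12345678

big = struct.pack(">I", n)     # big-endian (Sparc, Motorola, ...)
little = struct.pack("<I", n)  # little-endian (Intel-compatibles, ...)

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# "Network order" is defined as big-endian; "!" is struct's alias for it.
assert struct.pack("!I", n) == big
```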
[HACKERS] Problem with pg_dump -n schemaname
Hi,

we came across a problem when you want to dump only one schema. The ASCII output, when loaded with psql into an empty database, doesn't produce a schema identical to the original. The problem comes from this statement ordering:

SET ...  -- some initial DB parameters
...
SET search_path = schemaname, pg_catalog;
-- the above fails because no schema with this name exists;
-- as a consequence, the original search_path (e.g. "$user", public)
-- is not modified
DROP INDEX schemaname.index1;
...
DROP TABLE schemaname.table1;
DROP SCHEMA schemaname;
CREATE SCHEMA schemaname;
ALTER SCHEMA schemaname OWNER TO schemaowner;
CREATE TABLE table1;
-- note that it was DROPped with the full name schemaname.table1
...

So, because search_path is '"$user", public' for e.g. postgres, the tables are created in the public schema. Hence, I propose the attached patch, which issues the SET search_path = ... statements before the first CREATE TABLE statement in the respective schema instead of before the first DROP command. The problem manifests only when you dump a single schema. The same problem exists in at least 8.0.3, 8.2.5 and the latest 8.3cvs.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

--- postgresql-8.2.5.orig/src/bin/pg_dump/pg_backup_archiver.c	2007-08-06 03:38:24.0 +0200
+++ postgresql-8.2.5/src/bin/pg_dump/pg_backup_archiver.c	2007-11-16 11:00:46.0 +0100
@@ -241,9 +241,6 @@
 			{
 				/* We want the schema */
 				ahlog(AH, 1, "dropping %s %s\n", te->desc, te->tag);
-				/* Select owner and schema as necessary */
-				_becomeOwner(AH, te);
-				_selectOutputSchema(AH, te->namespace);
 				/* Drop it */
 				ahprintf(AH, "%s", te->dropStmt);
 			}
@@ -275,6 +272,10 @@
 	{
 		ahlog(AH, 1, "creating %s %s\n", te->desc, te->tag);
 
+		/* Select owner and schema as necessary */
+		_becomeOwner(AH, te);
+		_selectOutputSchema(AH, te->namespace);
+
 		_printTocEntry(AH, te, ropt, false, false);
 		defnDumped = true;
Re: [HACKERS] support for hyper-long transactions ...
Heikki Linnakangas wrote:
> Hans-Juergen Schoenig wrote:
>> at the moment i am working on an application which is supposed to run extremely large transactions (a lot of server-side stored procedure stuff which can hardly be split into small transactions for visibility reasons). so, from time to time it happens that i exceed my CommandCounter (more than 2.000.000.000 statements inside the same transaction).
> Interesting. What kind of a stored procedure is that? One optimization that might be possible is to not increment it for statements that do no on-disk changes.
>> my idea is: how about adding a configure option to compile postgres with a 64-bit command counter. this would allow larger transactions for special-purpose applications while it would not have an impact on normal applications.
> One objection to expanding TransactionId to 64 bits has been that we depend on assignment of TransactionId being atomic. That might not be an issue with CommandIds; I don't think we store CommandIds in shared memory structures. It still doesn't feel like a good idea to me, though.

Isn't 64-bit assignment atomic on 64-bit machines? With a little autoconf magic, the conditional can be disabled for 32-bit platforms. So that's not a real obstacle to going to 64-bit TransactionIds.

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] XID wraparound and busy databases
Tom Lane wrote:
> It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only.
>
> regards, tom lane

In a recent stress test with our PostgreSQL-based cluster between two machines, 3 million transactions were performed with

pgbench -c 150 -t 2 -s 200

in about _ten hours_. The primary machine (a desktop-level development machine) used a real disk; the secondary used tmpfs as PGDATA. Say whatever you want about my disk lying about flush, its 75MB/sec transfer rate is real. So 5 million real transactions in 24 hours is not unrealistic.

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] XID wraparound and busy databases
Heikki Linnakangas wrote:
> Zoltan Boszormenyi wrote:
>> Tom Lane wrote:
>>> It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only.
>> In a recent stress test with our PostgreSQL-based cluster between two machines, 3 million transactions were performed with pgbench -c 150 -t 2 -s 200 in about _ten hours_. The primary machine (a desktop-level development machine) used a real disk; the secondary used tmpfs as PGDATA. Say whatever you want about my disk lying about flush, its 75MB/sec transfer rate is real. So 5 million real transactions in 24 hours is not unrealistic.
> 6k xacts / s is five *hundred* million transactions, not five million...

Blush. :-) You're right. However, a single machine with a ramdisk is able to do that.

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
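The correction is easy to check with back-of-the-envelope arithmetic: 6,000 transactions per second sustained over a day is about five hundred million, while the pgbench run described earlier averaged well under 100 tx/s.

```python
# Tom's figure: 6K transactions per second, sustained for 24 hours.
rate = 6_000
seconds_per_day = 24 * 60 * 60
per_day = rate * seconds_per_day
print(per_day)  # 518400000, i.e. about five hundred million per day

# The stress test described above: 3 million transactions in ten hours.
pgbench_rate = 3_000_000 / (10 * 3600)
print(round(pgbench_rate))  # 83 -> roughly 83 tx/s on average
```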
Re: [HACKERS] New cast between inet/cidr and bytea
Tom Lane wrote:
> Zoltan Boszormenyi [EMAIL PROTECTED] writes:
>> Bruce Momjian wrote:
>>> What is the use case for such a cast?
>> The application doesn't want to parse the textual IP address when all the parsing and checking intelligence is already there in the inet/cidr type checks.
> This presumes exactly the assumption we are questioning, namely that there's a universal binary representation for these things.

But there is: network order.

> There might be such for bare IP addresses (ignoring endianness) but the argument doesn't scale to CIDR.

Would you enlighten me why not?

> You've also failed to make the case that this application designer has made a sane judgment about whether avoiding parsing is a good tradeoff here.

So, is reinventing the wheel always the way to go? Even when the app is actually storing those IP addresses with the type and features PostgreSQL provides?

> Also: to the extent that the application is willing to deal with a Postgres-specific inet/cidr representation (which, in the end, is what this would be) it can do that *today* using binary output format. So I'm still not seeing an argument for exposing a cast to bytea.
>
> regards, tom lane

But the binary output of inet/cidr needs another round of parsing, which requires using internal server headers. Which would you like better: a 4/8/16/32-byte output containing the IP only, or the IP plus a fully represented netmask?

Best regards,

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
[HACKERS] New cast between inet/cidr and bytea
Hi,

we at Cybertec have developed cast functions in C between inet/cidr and bytea for a client, and we would like to submit them. This is how they work:

- An IPv4 inet/cidr value returns 4 bytes if the netmask covers all 32 bits; it returns 5 bytes if the netmask is shorter than 32 bits.
- Similarly for IPv6, 12[+1] bytes are returned.
- 4-, 5-, 12- or 13-byte bytea values are converted to an inet/cidr value; the 5th or 13th byte is range-checked to be a valid netmask value.

What are the requirements for this to be accepted as a core cast function or as a contrib module?

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] New cast between inet/cidr and bytea
Bruce Momjian wrote:
> Zoltan Boszormenyi wrote:
>> Hi, we at Cybertec have developed cast functions in C between inet/cidr and bytea for a client, and we would like to submit them. This is how they work:
>> - An IPv4 inet/cidr value returns 4 bytes if the netmask covers all 32 bits; it returns 5 bytes if the netmask is shorter than 32 bits.
>> - Similarly for IPv6, 12[+1] bytes are returned.
>> - 4-, 5-, 12- or 13-byte bytea values are converted to an inet/cidr value; the 5th or 13th byte is range-checked to be a valid netmask value.
>> What are the requirements for this to be accepted as a core cast function or as a contrib module?
> You discuss it on the hackers list.

That's what I am trying to do.

> Have you read the developer's FAQ?

Not yet in its entirety.

> What is the use case for such a cast?

For example:

# select '192.168.0.1'::inet::bytea;
      bytea
------------------
 \300\250\000\001
(1 row)

# select '192.168.0.0/24'::inet::bytea;
        bytea
----------------------
 \300\250\000\000\030
(1 row)

# select decode('\\300\\250\\000\\001', 'escape')::inet;
   decode
-------------
 192.168.0.1
(1 row)

# select decode('\\300\\250\\000\\000\\030', 'escape')::inet;
     decode
----------------
 192.168.0.0/24
(1 row)

Similarly for IPv6 addresses. The application doesn't want to parse the textual IP address when all the parsing and checking intelligence is already there in the inet/cidr type checks. The reverse, when you pass in bytes, is only a logical extension.

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
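The escaped octets in the examples above are just the network-order bytes of the address, optionally followed by one byte holding the prefix length (\300\250 = 0xC0 0xA8 = 192.168; \030 = 0x18 = 24). A sketch of the same encoding in Python's ipaddress module — an illustration of the proposed cast's byte layout, not the actual C implementation:

```python
import ipaddress

# A bare host address packs to its 4 network-order bytes.
addr = ipaddress.ip_address("192.168.0.1")
print(addr.packed.hex())  # c0a80001 -> the bytes \300\250\000\001

# A network shorter than 32 bits gets a 5th byte: the prefix length.
net = ipaddress.ip_network("192.168.0.0/24")
payload = net.network_address.packed + bytes([net.prefixlen])
print(payload.hex())  # c0a8000018 -> \300\250\000\000\030 (0x18 == 24)

# Decoding: when 5 bytes long, the last byte is the prefix length.
def bytes_to_network(b: bytes) -> ipaddress.IPv4Network:
    assert len(b) == 5
    return ipaddress.ip_network((b[:4], b[4]))

print(bytes_to_network(bytes.fromhex("c0a8000018")))  # 192.168.0.0/24
```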
Re: [HACKERS] New cast between inet/cidr and bytea
Tom Lane wrote:
> Zoltan Boszormenyi [EMAIL PROTECTED] writes:
>> we at Cybertec have developed cast functions in C between inet/cidr and bytea for a client and we would like to submit them.
> Why is this a good idea? Exposing the internal representation of a datatype is usually bad.

I didn't expose the inet/cidr internals. An IP address is an IP address whether you represent it textually or as a series of bytes.

Since the rise of CIDR-style netmask usage, there is not much support on the 'net for netmasks with holes in the middle of the bits, like 255.255.192.128. And there is no support for this kind of netmask in PostgreSQL either. So, to avoid losing data (the netmask _is_ important data), we decided to go this way. A single IP address without a netmask means the netmask covers the whole 32 or 128 bits in real life, too.

> What will you do when we add support for scoped IPv6 addressing, to take one obvious example?

It can still be represented as a series of bytes, can't it? Just as in an actual IP packet header. When the support arrives, I can fix the cast, too, if needed.

> regards, tom lane

Best regards,

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Tom Lane wrote:
> After some more study of the SQL spec, the distinction between GENERATED ALWAYS AS IDENTITY and GENERATED BY DEFAULT AS IDENTITY is not what I thought it was.
>
> * As far as I can find from the spec, there is *no* difference between the two cases for INSERT commands. The rule is that you ignore any user-supplied data and use the default (ie, nextval()) unless OVERRIDING SYSTEM VALUE is specified. It is not an error to try to insert data into an identity column, it's just ignored unless OVERRIDING SYSTEM VALUE.
>
> * The difference for UPDATE commands is that you can update a BY DEFAULT identity column to anything you want, whereas for an ALWAYS identity it's an error to update to anything but DEFAULT (which causes a fresh nextval() to be assigned).
>
> Both behaviors are different from a generated column, which is updated whether you mention it or not.

The quoted SIGMOD paper mentioned that specifying a value for a generated column should raise an error in INSERT, but this behaviour is not mentioned by the standard.

BTW, do you know what a "self-referencing column" is? I haven't found a definition of it, and there are places where the standard uses this term for behaviour that would be natural for generated columns. E.g. page 860 in the latest drafts, section 10.14, or SQL:2003, section 14.8, about the INSERT statement: the value the user specified should be stored if "some underlying column of Ci is a self-referencing column and OVERRIDING SYSTEM VALUE is specified".

> This means that GENERATED BY DEFAULT AS IDENTITY is not at all equivalent to our historical behavior for SERIAL columns and hence we cannot merge the two cases.

They are equivalent if you read 5IWD2-02-Foundation-2006-04.pdf or 5CD2-02-Foundation-2006-01.pdf, i.e. the latest two drafts. (The latter seems to be misnamed, considering that www.wiscorp.com refreshed sql200n.zip on 2007-03-11.) Page 860, section 14.10, INSERT: the value the user provides should be accepted for storage if:

- the column is an identity column and you provide OVERRIDING SYSTEM VALUE, or
- the column is GENERATED BY DEFAULT AS IDENTITY and you provide neither OVERRIDING USER VALUE nor the DEFAULT specification for the column.

I think the babble about OVERRIDING USER VALUE is somewhat controversial. Why would you want to do e.g.

INSERT INTO tabname (id, ...) OVERRIDING USER VALUE VALUES (N, ...);

where N is an explicit constant? And I haven't even implemented handling it. Anyway, without specifying OVERRIDING USER VALUE, GENERATED BY DEFAULT AS IDENTITY is equivalent to the traditional SERIAL in PostgreSQL. Implementing the OVERRIDING USER VALUE behaviour means that GENERATED BY DEFAULT AS IDENTITY (or SERIAL) would have to be marked as an identity as well, not as a column simply having a DEFAULT clause. Otherwise OVERRIDING USER VALUE would override every user-specified value for regular columns having a DEFAULT expression.

> The lack of any behavioral difference for INSERT seems surprising and counterintuitive; have I just missed something in the spec?

No, I was just ahead of the times and read newer drafts than SQL:2003.

> BTW, I found what they did about the problem that generated columns are out of sync with their underlying columns during BEFORE-trigger execution: in 11.39:
> 12) If BEFORE is specified, then:
> ...
> c) The triggered action shall not contain a <field reference> that references a field in the new transition variable corresponding to a generated column of T.

I vaguely remember reading it, although only the idea seems to have remained in my mind. :-)

> IOW they just pretend you can't look. So I think we need not worry about leaving the values out-of-date until after the triggers fire.
>
> regards, tom lane

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
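The decision rules argued above can be summarised in a toy model. The sketch below only illustrates which value an identity column would store on INSERT under that reading; the function and parameter names are invented for illustration, not PostgreSQL code:

```python
from itertools import count

DEFAULT = object()  # stand-in for the SQL DEFAULT keyword


def insert_value(always, nextval, user_value=DEFAULT,
                 overriding_system_value=False,
                 overriding_user_value=False):
    """Toy model of which value an identity column stores on INSERT."""
    if user_value is DEFAULT or overriding_user_value:
        return nextval()  # no usable user value: generate the next one
    if always:
        # GENERATED ALWAYS: user data survives only with OVERRIDING SYSTEM VALUE
        return user_value if overriding_system_value else nextval()
    return user_value  # GENERATED BY DEFAULT keeps the user's value


seq = count(327)  # models a sequence created with START WITH 327
nv = lambda: next(seq)

print(insert_value(True, nv, 99))                                # 327: ignored
print(insert_value(True, nv, 99, overriding_system_value=True))  # 99: accepted
print(insert_value(False, nv, 99))                               # 99: BY DEFAULT
```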
Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Zoltan Boszormenyi wrote: The quoted SIGMOD paper mentioned that specifying a value for a generated column should raise an error in INSERT, but this behaviour is not mentioned by the standard. I found it now, I hadn't read hard enough before. SQL:2003, section 14.8, syntax rules: 10) If contextually typed table value constructor CTTVC is specified, then every contextually typed row value constructor element simply contained in CTTVC whose positionally corresponding column name in the insert column list references a column of which some underlying column is a generated column shall be a default specification. So, I can only omit the generated column or specify DEFAULT. Anything else should raise an error. Should it be done in analyze.c or in rewriteHandler.c?
Re: [HACKERS] parser dilemma
Martijn van Oosterhout wrote: On Thu, Apr 19, 2007 at 11:19:40AM +0200, Zoltan Boszormenyi wrote: The problem comes from cases like colname coltype DEFAULT 5! GENERATED ... Since b_expr allows postfix operators, it takes one more token of lookahead than we have to tell whether the default expression is "5!" or "5 ! GENERATED ...". ISTM that as long as: colname coltype DEFAULT (5!) GENERATED ... works, I don't see why it would be a problem to require the parentheses in this case. Postfix operators are not going to be that common here, I think. Have a nice day, You mean like this one?

*** gram.y.old	2007-04-20 09:23:16.0 +0200
--- gram.y	2007-04-20 09:25:34.0 +0200
***************
*** 7550,7557 ****
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, $3, @2); }
  	| qual_Op b_expr		%prec Op
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $1, NULL, $2, @1); }
! 	| b_expr qual_Op		%prec POSTFIXOP
! 	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }
  	| b_expr IS DISTINCT FROM b_expr	%prec IS
  	{ $$ = (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $5, @2); }
--- 7550,7557 ----
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, $3, @2); }
  	| qual_Op b_expr		%prec Op
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $1, NULL, $2, @1); }
! 	| '(' b_expr qual_Op ')'	%prec POSTFIXOP
! 	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $3, $2, NULL, @3); }
  	| b_expr IS DISTINCT FROM b_expr	%prec IS
  	{ $$ = (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $5, @2); }

This change alone brings 13 reduce/reduce conflicts. Best regards
Re: [HACKERS] parser dilemma
Andrew Dunstan wrote: Zoltan Boszormenyi wrote: Martijn van Oosterhout wrote: On Thu, Apr 19, 2007 at 11:19:40AM +0200, Zoltan Boszormenyi wrote: The problem comes from cases like colname coltype DEFAULT 5! GENERATED ... Since b_expr allows postfix operators, it takes one more token of lookahead than we have to tell whether the default expression is "5!" or "5 ! GENERATED ...". ISTM that as long as: colname coltype DEFAULT (5!) GENERATED ... works, I don't see why it would be a problem to require the parentheses in this case. Postfix operators are not going to be that common here, I think. Have a nice day, You mean like this one?

*** gram.y.old	2007-04-20 09:23:16.0 +0200
--- gram.y	2007-04-20 09:25:34.0 +0200
***************
*** 7550,7557 ****
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, $3, @2); }
  	| qual_Op b_expr		%prec Op
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $1, NULL, $2, @1); }
! 	| b_expr qual_Op		%prec POSTFIXOP
! 	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }
  	| b_expr IS DISTINCT FROM b_expr	%prec IS
  	{ $$ = (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $5, @2); }
--- 7550,7557 ----
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, $3, @2); }
  	| qual_Op b_expr		%prec Op
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $1, NULL, $2, @1); }
! 	| '(' b_expr qual_Op ')'	%prec POSTFIXOP
! 	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $3, $2, NULL, @3); }
  	| b_expr IS DISTINCT FROM b_expr	%prec IS
  	{ $$ = (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $5, @2); }

This change alone brings 13 reduce/reduce conflicts. No - that's not what you do. All you need to do is remove those two lines from the b_expr rules. The postfix rule will still be in a_expr, and the parenthesized bit is taken care of in the ( a_expr ) rule for c_expr. cheers andrew

I just sent a new patch that marks GENERATED as %right, which also solved the problem. Best regards, Zoltán
[HACKERS] parser dilemma
Tom Lane wrote: ... If anyone seriously wants to propose removing postfix ops from b_expr, we'd better take it up on someplace more widely read than -patches. regards, tom lane

OK, I take the bullet and send it to -hackers. For everyone who doesn't read -patches, let me reiterate the problem. While developing my GENERATED/IDENTITY patches, a parser problem turned up. Currently, DEFAULT is handled as a CONSTRAINT by the parser, to make it possible to write the DEFAULT clause and CONSTRAINT clauses in any order. Handling the GENERATED { ALWAYS | BY DEFAULT } AS { IDENTITY | ( expression ) } syntax in the same way causes a conflict between DEFAULT and b_expr, as discovered by Tom Lane. He proposed two solutions, quote:

The problem comes from cases like colname coltype DEFAULT 5! GENERATED ... Since b_expr allows postfix operators, it takes one more token of lookahead than we have to tell whether the default expression is "5!" or "5 ! GENERATED ...". There are basically two ways to fix this:
1. Collapse GENERATED ALWAYS and GENERATED BY into single tokens using filtered_base_yylex.
2. Stop allowing postfix operators in b_expr.

I find #1 a bit icky --- not only does every case added to filtered_base_yylex slow down parsing a little more, but combined tokens create rough spots in the parser's behavior. As an example, both NULLS and FIRST are allegedly unreserved words, so this should work:

regression=# create table nulls (x int);
CREATE TABLE
regression=# select first.* from nulls first;
ERROR:  syntax error at or near "first"
LINE 1: select first.* from nulls first;
                                  ^
regression=#

#2 actually seems like a viable alternative: postfix operators aren't really in common use, and doing this would not only fix GENERATED but let us de-reserve a few keywords that are currently reserved.
In a non-exhaustive check I found that COLLATE, DEFERRABLE, and INITIALLY could become unreserved_keyword if we take out this production:

*** 7429,7436 ****
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, $3, @2); }
  	| qual_Op b_expr		%prec Op
  	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $1, NULL, $2, @1); }
- 	| b_expr qual_Op		%prec POSTFIXOP
- 	{ $$ = (Node *) makeA_Expr(AEXPR_OP, $2, $1, NULL, @2); }
  	| b_expr IS DISTINCT FROM b_expr	%prec IS
  	{ $$ = (Node *) makeSimpleA_Expr(AEXPR_DISTINCT, "=", $1, $5, @2); }
--- 7550,7555 ----

(Hmm, actually I'm wondering why COLLATE is a keyword at all right now... but the other two trace directly to the what-comes-after-DEFAULT issue.)

I proposed a third solution, one that is actually standard-conforming and still leaves the possibility of having postfix operators. The solution was to admit that DEFAULT is not a CONSTRAINT, hence not mixable with them. The standard has this syntax:

<column definition> ::=
	<column name> [ <data type or domain name> ]
	[ <default clause> | <identity column specification> | <generation clause> ]
	[ <column constraint definition>... ]
	[ <collate clause> ]

This says that DEFAULT | GENERATED ... AS IDENTITY | GENERATED ALWAYS AS ( expr ) must come after the data type and before any CONSTRAINTs, and that the three forms are mutually exclusive. This can be nicely handled by the parser, and the analyzer phase can save some cycles by not checking for conflicting DEFAULT clauses.

What do people think? Which would be the preferred solution?

Best regards, Zoltán Böszörményi
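The lookahead problem can be seen in DDL like the following (hypothetical: factorial `!` stands in for any postfix operator, and the GENERATED syntax is the one proposed by the patch, not something released PostgreSQL accepts):

```sql
-- After reading "DEFAULT 5" and seeing "!", an LALR(1) parser cannot
-- tell with one token of lookahead whether "!" ends the default
-- expression as a postfix operator or continues a longer expression:
CREATE TABLE t (c bigint DEFAULT 5! GENERATED ALWAYS AS IDENTITY);

-- Martijn's suggestion: require parentheses around postfix uses,
-- which makes the end of the default expression unambiguous:
CREATE TABLE t (c bigint DEFAULT (5!) GENERATED ALWAYS AS IDENTITY);
```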
[HACKERS] Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Hi, Zoltan Boszormenyi wrote: Zoltan Boszormenyi wrote: Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: So, I should allow DROP DEFAULT, implement SET DEFAULT GENERATED ALWAYS AS and modify the catalog so the GENERATED property is part of pg_attrdef. Sounds good. Finally, here it is. What about IDENTITY? Should it also be part of pg_attrdef? There are two ways to implement it: have or don't have a notion of it. The latter would treat GENERATED BY DEFAULT AS IDENTITY the same as SERIAL. Is there any good reason to distinguish the two? Actually, I needed to have a flag for IDENTITY, but not for the reason above. I need it to distinguish between GENERATED ALWAYS AS IDENTITY and GENERATED ALWAYS AS ( expr ).

Changes:
- Rewrote the GENERATED/IDENTITY flags to be part of the default in pg_attrdef. This made the patch MUCH smaller.
- SERIALs are now the same as INTEGER GENERATED BY DEFAULT AS IDENTITY.
- Allow DROP DEFAULT on GENERATED/IDENTITY columns.
- Implemented SET GENERATED ALWAYS AS.
- Modified the syntax of SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY so it reads as SET IDENTITY GENERATED { ALWAYS | BY DEFAULT }, so compiling gram.y/gram.c doesn't give me errors. This DDL statement isn't part of SQL:2003, so it might be accepted as a PostgreSQL extension.
- Modified the behaviour of SET IDENTITY to also restore the DEFAULT expression. Someone might have done a DROP DEFAULT before but kept the OWNED sequence.
- Fixed the behaviour of GENERATED columns regarding INSERT ... OVERRIDING SYSTEM VALUE; only those GENERATED columns get UPDATEd that are either explicitly modified with SET column = DEFAULT or have one of their referenced columns modified.
- The testcase and documentation are modified to reflect the above.
- Also allowed UPDATE on IDENTITY columns.

Please review. I just realized that by treating SERIAL the same as IDENTITY GENERATED BY DEFAULT, I incidentally broke the possibility of multiple SERIALs in the same table.
I rewrote the patch so instead of two BOOL flags, I now have only one CHAR flag:
- ' ' says it's a simple DEFAULT expression
- 'i' says it's GENERATED ALWAYS AS IDENTITY
- 'g' says it's GENERATED ALWAYS AS ( expr )

Apart from making the patch a bit smaller again, checking only for 'i' still allows multiple SERIALs in the same table but lets us disallow multiple GENERATED ALWAYS AS IDENTITY columns. Thinking a bit about it, is it desirable to disallow multiple GENERATED ALWAYS AS IDENTITY fields? It's just a twisted SERIAL anyway. And it was said many times that it's not an advantage to blindly follow the standard.

Also, DROP IDENTITY is equivalent to SET DEFAULT nextval('owned_sequence') iff the field has an OWNED sequence and it was GENERATED ALWAYS AS IDENTITY before. Considering that SERIAL is a macro, and SET/DROP DEFAULT is allowed on IDENTITY/GENERATED columns per Tom's request, should I keep this statement?

Also, the current grammar is made to give a syntax error if you say colname type GENERATED BY DEFAULT AS ( expr ). But it makes the grammar unbalanced, and gives me:

bison -y -d gram.y
conflicts: 2 shift/reduce

Is there a good solution to this? I will post the new patch after someone answers those questions for me.
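As a sanity check of the flag semantics described above, here is a minimal standalone sketch (hypothetical helper name, not the actual catalog code) of how INSERT could decide whether to honour a user-supplied value based on the one-CHAR flag:

```c
#include <stdbool.h>

/*
 * Sketch of the single-CHAR flag scheme (hypothetical helper):
 *   ' '  plain DEFAULT expression (also GENERATED BY DEFAULT / SERIAL)
 *   'i'  GENERATED ALWAYS AS IDENTITY
 *   'g'  GENERATED ALWAYS AS ( expr )
 *
 * Returns true if a user-supplied value may be stored on INSERT,
 * given whether OVERRIDING SYSTEM VALUE was specified.
 */
static bool
user_value_accepted(char genflag, bool overriding_system_value)
{
	switch (genflag)
	{
		case ' ':	/* plain DEFAULT: the user value always wins */
			return true;
		case 'i':	/* ALWAYS AS IDENTITY: only with OVERRIDING SYSTEM VALUE */
			return overriding_system_value;
		case 'g':	/* generated expression: only DEFAULT is allowed */
			return false;
		default:	/* unknown flag: be conservative */
			return false;
	}
}
```

Checking only `genflag == 'i'` in the multiple-columns test is what keeps multiple SERIALs legal while catching a second ALWAYS AS IDENTITY column.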
[HACKERS] Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Apart from making the patch a bit smaller again, checking only for 'i' still allows multiple SERIALs in the same table but lets us disallow multiple GENERATED ALWAYS AS IDENTITY columns. Thinking a bit about it, is it desirable to disallow multiple GENERATED ALWAYS AS IDENTITY fields? It's just a twisted SERIAL anyway. I don't see the value of disallowing it. I thought so. Also, DROP IDENTITY is equivalent to SET DEFAULT nextval('owned_sequence') iff the field has an OWNED sequence and it was GENERATED ALWAYS AS IDENTITY before. Considering that SERIAL is a macro, and SET/DROP DEFAULT is allowed on IDENTITY/GENERATED columns per Tom's request, should I keep this statement? If it's not in the spec I don't see any strong reason to have it... It's not. Removed. Also, the current grammar is made to give a syntax error if you say colname type GENERATED BY DEFAULT AS ( expr ). But it makes the grammar unbalanced, and gives me:

bison -y -d gram.y
conflicts: 2 shift/reduce

You'll have to fix that. Usually you can get around it by making the grammar a bit more verbose --- if you were trying to avoid duplication by means of optional productions, don't do that. What do you mean by making it more verbose? GENERATED BY DEFAULT AS ( expr ) is just another way of saying DEFAULT expr, but its similarity to GENERATED ALWAYS AS ( expr ) would make users think that it permits smarter expressions than a simple DEFAULT would allow. My thought was to disallow this in the grammar. BTW, thanks for the quick answer.
Re: [HACKERS] Re: IDENTITY/GENERATED v36 Re: [PATCHES] Final version of IDENTITY/GENERATED patch
Andrew Dunstan wrote: Florian G. Pflug wrote: bison -y -d gram.y conflicts: 2 shift/reduce It's been quite a while since I last used bison, but as far as I remember, you can tell it to write a rather detailed log about its analysis of the grammar. That log should include more detailed information about those conflicts - maybe that helps to figure out their exact cause, and to find a workaround. You can almost always get rid of shift/reduce conflicts by unwinding some of the productions - resist the temptation to factor the grammar. The effect of this is to eliminate places where the parser has to decide between shifting and reducing. (This is why, for example, almost all the drop foo if exists variants require separate productions rather than using opt_if_exists.) cheers andrew

Thanks. This idea solved one of the two shift/reduce conflicts. But the other one can only be solved if I put GENERATED into the reserved_keyword set. But the standard says it's unreserved. Now what should I do with it?

Best regards, Zoltán
Re: [HACKERS] Crash bug in 8.2.3 on Solaris 10/Sparc
Zoltan Boszormenyi wrote: Zdenek Kotala wrote: Zoltan Boszormenyi wrote: Hi, we have found that psql in PostgreSQL 8.2.3 has problems connecting to the server running on Solaris 10/Sun SPARC.

$ uname -a
SunOS dev-machine 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Fire-V440

It seems that somehow the system-provided GCC 3.4.3 miscompiles timestamptz_send() and it segfaults. The default function looks like this: Can you send me how you compiled Postgres (configure switches, LDFLAGS, ...), and is it possible to get the core file? This was the configure line:

./configure --prefix=/export/local/postgresql/postgresql-8.2.3 --with-includes=/usr/local/include --with-libraries=/usr/local/lib/

I added --enable-debug --enable-depend --enable-cassert to get a sensible gdb report after that. The problem was that the server had problems after psql connected with these commands:

$ psql -l -h dev-machine -p 5477 -U user
psql: server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.
$ psql -h dev-machine -p 5477 -U user template1
psql: server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.

If the user doesn't have permissions in e.g. pg_hba.conf then I get the correct permission denied error. If the user can connect, then some statement inside psql causes a segfault in the server. Compiled with debug info, I got this from gdb on the core file:

$ gdb /.../pgsql/bin/postgres /.../data/core
...
Program terminated with signal 11, Segmentation fault.
#0  0x0021c8a0 in timestamptz_send (fcinfo=0x1) at timestamp.c:461
461		PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
(gdb)

I described my experiments; compiling with --enable-integer-datetimes fixed the issue. We compiled GCC-4.1.2 on this machine, recompiled PostgreSQL with the new GCC without --enable-integer-datetimes, and it fixed the problem we experienced.
It seems that my suspicion was right: GCC-3.4.3 on Solaris 10/Sparc is buggy.
Re: [HACKERS] Crash bug in 8.2.3 on Solaris 10/Sparc
Zoltan Boszormenyi wrote: Zoltan Boszormenyi wrote: Zdenek Kotala wrote: Zoltan Boszormenyi wrote: Hi, we have found that psql in PostgreSQL 8.2.3 has problems connecting to the server running on Solaris 10/Sun SPARC.

$ uname -a
SunOS dev-machine 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Fire-V440

It seems that somehow the system-provided GCC 3.4.3 miscompiles timestamptz_send() and it segfaults. The default function looks like this: Can you send me how you compiled Postgres (configure switches, LDFLAGS, ...), and is it possible to get the core file? This was the configure line:

./configure --prefix=/export/local/postgresql/postgresql-8.2.3 --with-includes=/usr/local/include --with-libraries=/usr/local/lib/

I added --enable-debug --enable-depend --enable-cassert to get a sensible gdb report after that. The problem was that the server had problems after psql connected with these commands:

$ psql -l -h dev-machine -p 5477 -U user
psql: server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.
$ psql -h dev-machine -p 5477 -U user template1
psql: server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.

If the user doesn't have permissions in e.g. pg_hba.conf then I get the correct permission denied error. If the user can connect, then some statement inside psql causes a segfault in the server. Compiled with debug info, I got this from gdb on the core file:

$ gdb /.../pgsql/bin/postgres /.../data/core
...
Program terminated with signal 11, Segmentation fault.
#0  0x0021c8a0 in timestamptz_send (fcinfo=0x1) at timestamp.c:461
461		PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
(gdb)

I described my experiments; compiling with --enable-integer-datetimes fixed the issue. We compiled GCC-4.1.2 on this machine, recompiled PostgreSQL with the new GCC without --enable-integer-datetimes, and it fixed the problem we experienced.
It seems that my suspicion was right: GCC-3.4.3 on Solaris 10/Sparc is buggy. Oh, and the proof that I'm using the newly compiled version:

$ psql -h reddb-dev-pgr -p 5477 test
Welcome to psql 8.2.3, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

test=# select version();
                                  version
----------------------------------------------------------------------------
 PostgreSQL 8.2.3 on sparc-sun-solaris2.10, compiled by GCC gcc (GCC) 4.1.2
(1 row)

test=# show integer_datetimes;
 integer_datetimes
-------------------
 off
(1 row)
[HACKERS] Crash bug in 8.2.3 on Solaris 10/Sparc
Hi, we have found that psql in PostgreSQL 8.2.3 has problems connecting to the server running on Solaris 10/Sun SPARC.

$ uname -a
SunOS dev-machine 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Fire-V440

It seems that somehow the system-provided GCC 3.4.3 miscompiles timestamptz_send() and it segfaults. The default function looks like this:

Datum
timestamptz_send(PG_FUNCTION_ARGS)
{
	TimestampTz	timestamp = PG_GETARG_TIMESTAMPTZ(0);
	StringInfoData buf;

	pq_begintypsend(&buf);
#ifdef HAVE_INT64_TIMESTAMP
	pq_sendint64(&buf, timestamp);
#else
	pq_sendfloat8(&buf, timestamp);
#endif
	PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}

GDB indicates a crash at the last line. No matter how I unrolled the function calls, the indicated crasher line was always the one before:

	pq_sendfloat8(&buf, timestamp);

It must be a stack corruption somehow. I also unrolled pq_sendfloat8() so the function looks like this:

Datum
timestamptz_send(PG_FUNCTION_ARGS)
{
	TimestampTz	timestamp = PG_GETARG_TIMESTAMPTZ(0);
	StringInfoData buf;
	bytea	   *byteap;
	union
	{
		float8	f;
		int64	i;
	} swap;
	uint32	n32;

	pq_begintypsend(&buf);
#ifdef HAVE_INT64_TIMESTAMP
	pq_sendint64(&buf, timestamp);
	elog(NOTICE, "timestamptz_send() HAVE_INT64_TIMESTAMP after pq_sendint64");
#else
	swap.f = (float8) timestamp;
	elog(NOTICE, "timestamptz_send() int64: %lld", swap.i);
	/* High order half first, since we're doing MSB-first */
#ifdef INT64_IS_BUSTED
	/* don't try a right shift of 32 on a 32-bit word */
	n32 = (swap.i < 0) ? -1 : 0;
	elog(NOTICE, "timestamptz_send() INT64_IS_BUSTED high 32: %d", n32);
#else
	n32 = (uint32) (swap.i >> 32);
	elog(NOTICE, "timestamptz_send() high 32: %d", n32);
#endif
	n32 = htonl(n32);
	elog(NOTICE, "timestamptz_send() htonl high 32: %d", n32);
	appendBinaryStringInfo(&buf, (char *) &n32, 4);

	/* Now the low order half */
	n32 = (uint32) swap.i;
	elog(NOTICE, "timestamptz_send() low 32: %d", n32);
	n32 = htonl(n32);
	elog(NOTICE, "timestamptz_send() htonl low 32: %d", n32);
	appendBinaryStringInfo(&buf, (char *) &n32, 4);
	elog(NOTICE, "timestamptz_send() pq_sendfloat8");
#endif
	byteap = (bytea *) buf.data;
	elog(NOTICE, "timestamptz_send() buf->data = %p", byteap);
	Assert(buf.len >= VARHDRSZ);
	VARATT_SIZEP(byteap) = buf.len;
	PG_RETURN_BYTEA_P(byteap);
}

The crashing line according to GDB is now the elog() call after:

	swap.f = (float8) timestamp;

This is a simple explicit type cast which shouldn't cause problems; however, it is the one that somehow corrupts something on the stack and causes the segfault upon entering the function at the next statement. As a workaround, we recompiled PostgreSQL 8.2.3 with --enable-integer-datetimes, and the client can connect to the server now, after initdb. I tried to exercise calling timestamptz_send(), but creating a table with a float8 field, INSERTing and SELECTing work, too. Both textual and binary COPY FROM and COPY TO work, too. Either these exercises didn't call pq_sendfloat8() or it doesn't cause problems elsewhere, only in timestamptz_send().
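For what it's worth, the MSB-first splitting done by the unrolled code can be reproduced standalone. The following sketch (simplified buffer handling, assuming a platform with a working 64-bit right shift, i.e. no INT64_IS_BUSTED) performs the same union cast and htonl() conversions:

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/*
 * Standalone sketch of the float8 wire encoding shown above:
 * reinterpret the double as int64 through a union, then emit the
 * two 32-bit halves MSB-first, each converted with htonl().
 * The real code appends to a StringInfo; here we fill a plain buffer.
 */
static void
send_float8(double val, unsigned char out[8])
{
	union { double f; int64_t i; } swap;
	uint32_t n32;

	swap.f = val;

	/* High order half first, since we're doing MSB-first */
	n32 = htonl((uint32_t) (swap.i >> 32));
	memcpy(out, &n32, 4);

	/* Now the low order half */
	n32 = htonl((uint32_t) swap.i);
	memcpy(out + 4, &n32, 4);
}
```

For 1.0, whose IEEE 754 bit pattern is 0x3FF0000000000000, the buffer ends up as 3F F0 00 00 00 00 00 00 regardless of host byte order, which is what the receiving side expects.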
Re: [HACKERS] Crash bug in 8.2.3 on Solaris 10/Sparc
Zdenek Kotala wrote: Zoltan Boszormenyi wrote: Hi, we have found that psql in PostgreSQL 8.2.3 has problems connecting to the server running on Solaris 10/Sun SPARC.

$ uname -a
SunOS dev-machine 5.10 Generic_118833-36 sun4u sparc SUNW,Sun-Fire-V440

It seems that somehow the system-provided GCC 3.4.3 miscompiles timestamptz_send() and it segfaults. The default function looks like this: Can you send me how you compiled Postgres (configure switches, LDFLAGS, ...), and is it possible to get the core file? This was the configure line:

./configure --prefix=/export/local/postgresql/postgresql-8.2.3 --with-includes=/usr/local/include --with-libraries=/usr/local/lib/

I added --enable-debug --enable-depend --enable-cassert to get a sensible gdb report after that. The problem was that the server had problems after psql connected with these commands:

$ psql -l -h dev-machine -p 5477 -U user
psql: server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.
$ psql -h dev-machine -p 5477 -U user template1
psql: server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.

If the user doesn't have permissions in e.g. pg_hba.conf then I get the correct permission denied error. If the user can connect, then some statement inside psql causes a segfault in the server. Compiled with debug info, I got this from gdb on the core file:

$ gdb /.../pgsql/bin/postgres /.../data/core
...
Program terminated with signal 11, Segmentation fault.
#0  0x0021c8a0 in timestamptz_send (fcinfo=0x1) at timestamp.c:461
461		PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
(gdb)

I described my experiments; compiling with --enable-integer-datetimes fixed the issue. Did you try compiling with different optimization flags, or did you try the Sun Studio compiler? No, and no. Sun Studio isn't installed, only gcc.
Zdenek
Re: [HACKERS] Crash bug in 8.2.3 on Solaris 10/Sparc
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: we have found that psql in PostgreSQL 8.2.3 has problems connecting to the server running on Solaris 10/Sun SPARC. ... It seems that somehow the system-provided GCC 3.4.3 miscompiles timestamptz_send() and it segfaults. I find it fairly hard to believe that timestamptz_send would be invoked at all while using psql, much less during initial connection. psql doesn't do any binary-output requests. regards, tom lane

Then please explain this miracle. Anyway, your comment makes my suspicion about the correctness of GCC-3.4.3 on Solaris 10/sparc better founded now. :-)
[HACKERS] Test report on GENERATED/IDENTITY
Hi,

I made some tests to prove that GENERATED can help boost performance. I created a table like this:

create table t1 (
	id serial,
	i1 integer,
	i2 integer,
	g1 integer generated always as (
		case when i1 is null then i2
		     when i2 is null then i1
		     else i1 + i2
		end));

I inserted 1 million records into the table:

for (i = 1; i <= 1000; i++)
	for (j = 1; j <= 1000; j++)
		INSERT INTO t1 (i1, i2) VALUES (i, j);

After VACUUM FULL ANALYZE, I timed SELECT id, i1, i2, g1 FROM t1 and SELECT id, i1, i2, <generation expression> FROM t1, with the result redirected to /dev/null. Results of ten consecutive runs are:

SELECT id, i1, i2, g1 FROM t1
--
2.59user 0.18system 0:04.75elapsed 58%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33536minor)pagefaults 0swaps
2.57user 0.19system 0:04.59elapsed 60%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+34068minor)pagefaults 0swaps
2.60user 0.16system 0:04.60elapsed 60%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33826minor)pagefaults 0swaps
2.59user 0.17system 0:04.82elapsed 57%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+34067minor)pagefaults 0swaps
2.59user 0.16system 0:04.61elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+34068minor)pagefaults 0swaps
2.61user 0.17system 0:04.60elapsed 60%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33574minor)pagefaults 0swaps
2.59user 0.18system 0:04.60elapsed 60%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+34067minor)pagefaults 0swaps
2.59user 0.18system 0:04.67elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+32147minor)pagefaults 0swaps
2.58user 0.19system 0:04.63elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33584minor)pagefaults 0swaps
2.73user 0.16system 0:04.94elapsed 58%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+34066minor)pagefaults 0swaps

Average is 4.68 seconds.
SELECT id, i1, i2, <generation expression> FROM t1
--
2.76user 0.18system 0:05.49elapsed 53%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33945minor)pagefaults 0swaps
2.69user 0.17system 0:05.23elapsed 54%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33917minor)pagefaults 0swaps
2.60user 0.18system 0:05.04elapsed 55%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+32027minor)pagefaults 0swaps
2.64user 0.16system 0:05.03elapsed 56%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+32626minor)pagefaults 0swaps
2.60user 0.15system 0:05.03elapsed 54%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+32550minor)pagefaults 0swaps
2.77user 0.18system 0:05.21elapsed 56%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+32609minor)pagefaults 0swaps
2.71user 0.17system 0:05.07elapsed 57%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33043minor)pagefaults 0swaps
2.66user 0.17system 0:05.12elapsed 55%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33042minor)pagefaults 0swaps
2.80user 0.16system 0:05.19elapsed 57%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+33043minor)pagefaults 0swaps
2.71user 0.16system 0:05.14elapsed 56%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+32052minor)pagefaults 0swaps

Average is 5.16 seconds. Using a single GENERATED column gains a good 9.4% in runtime.

With IDENTITY, the picture is not so bright. I expected some cost, but not that much. Why is [simple_]heap_update() so expensive? I created a table and timed inserting 1 million rows into it:

create table t2 (id serial, i1 integer);

or

create table t2 (id serial generated always as identity, i1 integer);

Using a serial column gave me about 12 seconds on the average of 5 runs. With an IDENTITY column, I got 61 seconds once and 66 seconds twice. So, the strictness of the identity column gave me a 500-550% performance penalty.
With a single unique index on i1, I got 24.4 seconds with the serial column and 67 seconds with the identity column. I ran these only once, so this last one isn't representative.

I tried to use heap_inplace_update() to update the newly updated or inserted tuple in place, but it gave me

ERROR: heap_inplace_update: wrong tuple length

even when I had already filled the IDENTITY column with a constant Datum with an Int64 value 0 converted to the type of the column. If I read it correctly, the HOT patch would give me a speedup for this case?

Best regards, Zoltán Böszörményi
Re: [HACKERS] Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?
Florian G. Pflug wrote: Zoltan Boszormenyi wrote: The GENERATED column is an ease-of-use feature, possibly saving some work, whereas the IDENTITY column is mandatory for some applications (e.g. accounting and billing is stricter in some countries) where you simply cannot skip a value in the sequence; strict monotonicity is not enough. But just postponing nextval() until after the uniqueness checks only decreases the *probability* of non-monotonic values, and *does not* prevent them. Consider two transactions:

A: begin;
B: begin;
A: insert ...  -- IDENTITY generates value 1
B: insert ...  -- IDENTITY generates value 2
A: rollback;
B: commit;

I can understand that. But your example is faulty, you can't have a transaction inside a transaction. Checkpoints are another story. 8-) You can use some application tricks to have a continuous sequence today with regular serials, but only if you don't have a unique index that doesn't use the serial column. Insert a record into that table outside the transaction, making note of the serial value. If subsequent processing fails (because of a unique or check constraint, etc.), you have to go back to the main table and modify the record, indicating that the record isn't representing valid data. But you must keep it with the serial value it was assigned. I have seen systems requiring this. My point is that with the identity column, you will be able to define a unique index on the table that excludes the identity column. Now there is a record with IDENTITY 2, but not with 1. The *only* way to fix this is to *not* use a sequence, but rather do

lock table t in exclusive mode;
select max(identity)+1 from t;

to generate the identity - but of course this prevents any concurrent inserts, which will make this unusable for any larger database. Note that this is not a deficiency of postgres sequences - there is no way to guarantee strictly monotonic values while allowing concurrent selects at the same time.
(Other than lazily assigning the values, but this needs to be done by the application) Agreed. I agree that it'd be nice to generate the identity columns as late as possible to prevent needless gaps, but not if the price is a far more intrusive patch, or much higher complexity. Intrusive, hm? The catalog has to indicate that the column is IDENTITY, otherwise you cannot know it. The cost I am thinking of now is an extra heap_update() after heap_insert() without generating the identity value, and inserting index tuples into indexes that don't contain the identity column. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
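The lock-based scheme Florian mentions can be sketched in plain SQL (table and column names here are illustrative, not from the patch); it serializes all inserters on the table, which is exactly why it is unusable for larger databases:

```sql
BEGIN;
-- EXCLUSIVE mode blocks all concurrent writers but still allows reads
LOCK TABLE invoices IN EXCLUSIVE MODE;
-- Gapless and strictly monotonic: the next value is derived from committed
-- data, so a rollback of this transaction leaves no hole behind
INSERT INTO invoices (id, amount)
SELECT COALESCE(MAX(id), 0) + 1, 42.00 FROM invoices;
COMMIT;
```

Every inserter waits for the previous transaction to commit or abort before it can compute its MAX(id), which is the trade-off against sequences: no gaps, but no concurrency either.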
Re: [HACKERS] Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?
Zoltan Boszormenyi írta: Florian G. Pflug írta: Zoltan Boszormenyi wrote: The GENERATED column is an ease-of-use feature with possibly less work, whereas the IDENTITY column is mandatory for some applications (e.g. accounting and billing is stricter in some countries) where you simply cannot skip a value in the sequence; strict monotonicity is not enough. But just postponing nextval() until after the uniqueness checks only decreases the *probability* of non-monotonic values, and *does not* prevent them. Consider two transactions A: begin ; B: begin ; A: insert ... -- IDENTITY generates value 1 B: insert .. -- IDENTITY generates value 2 A: rollback ; B: commit ; I can understand that. But your example is faulty, you can't have a transaction inside a transaction. Checkpoints are another story. 8-) You can have some application tricks to have a continuous sequence today with regular serials, but only if you don't have a unique index that doesn't use the serial column. Insert a record into that table outside the transaction, making note of the serial value. If subsequent processing fails (because of unique or check constraints, etc.) you have to go back to the main table and modify the record, indicating that the record isn't representing valid data. But you must keep it with the serial value it was assigned. I have seen systems requiring this. My point is that with the identity column, you will be able to define a unique index on the table that excludes the identity column. Now there is a record with IDENTITY 2, but not with 1. The *only* way to fix this is to *not* use a sequence, but rather do lock table t in exclusive mode ; select max(identity)+1 from t ; to generate the identity - but of course this prevents any concurrent inserts, which makes this unusable for any larger database. Note that this is not a deficiency of postgres sequences - there is no way to guarantee strictly monotonic values while allowing concurrent selects at the same time. 
(Other than lazily assigning the values, but this needs to be done by the application) Agreed. I agree that it'd be nice to generate the identity columns as late as possible to prevent needless gaps, but not if the price is a far more intrusive patch, or much higher complexity. Intrusive, hm? The catalog has to indicate that the column is IDENTITY, otherwise you cannot know it. The cost I am thinking of now is an extra heap_update() after heap_insert() without generating the identity value, and inserting index tuples into indexes that don't contain the identity column. And as far as I tested the current state, there is no cost if you don't use GENERATED or IDENTITY. The extra heap_update() would be performed only if you have an IDENTITY column. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?
Florian G. Pflug írta: Yes, of course you can prevent gaps by just filling them with garbage/invalid records or whatever. But I don't see why this is useful - either you want, say, your invoice numbers to be continuous because it's required by law - or you don't. But if the law requires your invoice numbers to be continuous, surely just filling the gaps with fake invoices is just as illegal as having gaps in the first place. Not a fake invoice, but one cancelled (a storno) for whatever reason. But you have to keep the record to show you didn't delete anything. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?
Zoltan Boszormenyi írta: The cost I am thinking of now is an extra heap_update() after heap_insert() without generating the identity value, and inserting index tuples into indexes that don't contain the identity column. And as far as I tested the current state, there is no cost if you don't use GENERATED or IDENTITY. The extra heap_update() would be performed only if you have an IDENTITY column. The modification I imagined is actually working: - skip indexes using the identity columns - do a simple_heap_update() after all other columns are assigned and index tuples are inserted - do ExecInsertIndexTuples() on indexes referencing the IDENTITY column However, I get warning messages like: WARNING: detected write past chunk end in ExecutorState 0xaaff68 How can I prevent them? Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?
Tom Lane írta: Zoltan Boszormenyi [EMAIL PROTECTED] writes: However, I get warning messages like: WARNING: detected write past chunk end in ExecutorState 0xaaff68 How can I prevent them? Find the memory-clobbering bug in your patch. regards, tom lane Thanks, I found it. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] psql problem querying relations
Hi, this is with current CVS code: # \dt ERROR: did not find '}' at end of input node Server log: ERROR: did not find '}' at end of input node STATEMENT: SELECT n.nspname as Schema, c.relname as Name, CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' END as Type, r.rolname as Owner FROM pg_catalog.pg_class c JOIN pg_catalog.pg_roles r ON r.oid = c.relowner LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind IN ('r','') AND n.nspname NOT IN ('pg_catalog', 'pg_toast') AND pg_catalog.pg_table_is_visible(c.oid) ORDER BY 1,2; Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] psql problem querying relations
Andrew Dunstan írta: Zoltan Boszormenyi wrote: Hi, this is with current CVS code: # \dt ERROR: did not find '}' at end of input node Server log: ERROR: did not find '}' at end of input node It's working for me. Have you tried with a fresh checkout, or after running make clean before you build? cheers andrew I had to reinitdb my test database, then tried again. It's working now, regardless of whether my IDENTITY patch is applied or not. Thanks. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?
Hi, I would like to be able to harden the conditions of generating IDENTITY columns so the events below run in this order: - assign values for regular columns (with or without DEFAULT) - NOT NULL checks on regular columns - CHECK constraints whose expression contains only regular columns - assign values for GENERATED columns - NOT NULL checks on GENERATED columns - CHECK constraints whose expression may contain regular or GENERATED columns - UNIQUE index checks that have only regular columns - UNIQUE index checks that may have regular or GENERATED columns - assign the value for the IDENTITY column - NOT NULL on IDENTITY - CHECK constraints on IDENTITY - UNIQUE index checks that reference the IDENTITY column At this point the heap tuple and the index tuples can be inserted without further checks. Currently tuple->t_self is required by ExecInsertIndexTuples() and I don't see any way to make IDENTITY work the way it's intended but to mix heap_insert()/heap_update() and ExecInsertIndexTuples() together and use the result in ExecInsert() and ExecUpdate(). Would it be acceptable? Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
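For reference, the ordering above matters for the syntax the patch implements (spelled out in a later mail in this thread); a minimal sketch with illustrative table and column names:

```sql
CREATE TABLE invoices (
    inv_no  integer GENERATED ALWAYS AS IDENTITY,   -- value assigned last
    total   numeric NOT NULL CHECK (total >= 0),    -- checked first
    note    text
);

-- The desired ordering: NOT NULL and CHECK on the regular columns run
-- before the identity value is assigned, so a failing INSERT like this
-- one would not consume (inflate) a sequence value:
INSERT INTO invoices (total, note) VALUES (-1.00, 'rejected by CHECK');
```

The point of the event ordering is exactly this: nextval() for inv_no fires only after every check that can reject the row without involving the identity column has already passed.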
Re: [HACKERS] Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?
Hi, Tom Lane írta: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Would it be acceptable? No, because you can't create index entries when you haven't yet got the TID for the heap tuple. What do you propose doing, insert a dummy index entry and then go back to fill it in later? Aside from approximately No, I was thinking about breaking up e.g. heap_insert() to be able to mix it with ExecInsertIndexTuples(), so I get a pinned buffer and have the heap tuple's t_self set first, then do the uniqueness checks step by step. BTW, I can use modify_tuple() after doing RelationGetBufferForTuple() and RelationPutHeapTuple(), right? doubling the work involved, this is fundamentally broken because no Well, the work wouldn't be doubled, as all the unique indexes have to be checked anyway with the current way, too, to have the tuple accepted into the database. other backend could know what to do upon encountering the dummy index entry --- there's no way for it to check if the entry references a live tuple or not. Not to mention that a crash here will leave a permanently dummy index entry that there's no way to vacuum. The other rearrangements you suggest are not any more acceptable; we are not going to restructure the entire handling of defaults and check constraints around a single badly-designed SQL2003 feature. My IDENTITY/GENERATED patch broke up the checks currently this way (CHECK constraints are prohibited for special case columns): - normal columns are assigned values (maybe using DEFAULT) - check NOT NULLs and CHECKs for normal columns ( Up to this point this works the same way as before if you use neither IDENTITY nor GENERATED. ) - assign GENERATED columns their values - check NOT NULLs for GENERATED - assign IDENTITY its value - check NOT NULL for IDENTITY and - check UNIQUE for everything IDENTITY would be special so it doesn't inflate the sequence if avoidable. Currently the sequence is inflated if UNIQUE fails for any index, which still very much makes it unusable. 
What I would like to achieve is for IDENTITY to skip a sequence value and fail to be INSERTed only if the IDENTITY column's unique check fails. That pretty much means that there is already a record with that IDENTITY value, regardless of whether the UNIQUE index is defined on only the IDENTITY column or the IDENTITY column is part of a multi-column UNIQUE index. If I could break up the order of events the way I described in my first mail, I could re-enable having CHECK constraints for both IDENTITY and GENERATED columns. The point with GENERATED is that you have to have all other columns assigned values BEFORE being able to compute a GENERATED column that references other columns in its expression, so you _have to_ break up the current order of computing DEFAULTs. I know a computed column could be done either in the application or with SELECTs, but compare the amount of work: if you do it in the SELECT, you have to compute the expressions every time the SELECT is run, making it slower. Doing it on UPDATE or INSERT makes it LESS work in a light INSERT/UPDATE + heavy SELECT workload. Of course, for a heavy UPDATE workload it makes more work, but only if you actually use GENERATED columns. It means exactly the same amount of work if you use IDENTITY as with SERIAL, it's just done in a different order. The GENERATED column is an ease-of-use feature with possibly less work, whereas the IDENTITY column is mandatory for some applications (e.g. accounting and billing is stricter in some countries) where you simply cannot skip a value in the sequence; strict monotonicity is not enough. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] New version of IDENTITY/GENERATED
Hi, I started working on my previous patch, encouraged by the fact that it became a wishlist item for 8.3. :-) The changes in this version are: - Refreshed to almost current (5 days old) CVS version of 8.3 devel - The original SERIAL pseudo type is left alone, you _have to_ spell out GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY to get an identity column. - The action-at-a-distance behaviour is actually working for the IDENTITY/GENERATED columns on INSERT, so the DEFAULT value is generated for them after all the regular columns were validated via ExecConstraints(). This way, if the validation fails, the sequence isn't inflated. - Test case is updated to reflect the above. - Documentation is updated, identity columns have a new subsection now. - Dropped my pg_dump changes, as the altered sequence is also dumped in 8.2, thanks to Tom Lane. I am considering the following: - Since IDENTITY is a new feature (plain old SERIAL behaves the same as always) I will restore the SQL:2003 conformant check that there can be only one identity column in a table at any time. - I read somewhere (but couldn't find it now in SQL:2003) that CHECK constraints cannot be defined for GENERATED (and IDENTITY?) columns. Maybe it was in the latest draft, I have to look at it... Anyway, I have to implement checks to disallow CHECKs for such columns. - Introduce an ALTER TABLE SET|DROP IDENTITY so a serial can be upgraded to an identity. This way, an identity column can be built by hand, and pg_dump will need it, too. SET IDENTITY will either have to issue an error if CHECKs are defined for such columns or automatically drop every such constraint. And I have a question, too. Is there a way to use ExecEvalExpr*() so values from a given tuple are used for the current row? E.g. at present, UPDATE table SET f1 = f1 + 1, f2 = f1 + 1; sets both fields' new value to (f1 value before UPDATE) + 1. 
For a GENERATED column, value _after_ UPDATE is needed, so CREATE TABLE table ( f1 INTEGER, f2 INTEGER GENERATED ALWAYS AS (f1 + 1)); and no matter which one of the following is used: UPDATE table SET f1 = f1 + 1; or UPDATE table SET f1 = f1 + 1, f2 = default; the f2 current value = f1 current value + 1 is always maintained. Best regards, Zoltán Böszörményi psql-serial-30.diff.gz Description: Unix tar archive ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
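Until GENERATED columns exist, the f2 = f1 + 1 invariant described above can be approximated with a BEFORE trigger (a sketch only, using the example table's names and assuming PL/pgSQL is installed; this is not the patch's mechanism):

```sql
CREATE FUNCTION keep_f2() RETURNS trigger AS $$
BEGIN
    -- recompute from the NEW row, so an UPDATE sees the post-change f1,
    -- mirroring the "value _after_ UPDATE" semantics GENERATED needs
    NEW.f2 := NEW.f1 + 1;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_keep_f2
    BEFORE INSERT OR UPDATE ON tab
    FOR EACH ROW EXECUTE PROCEDURE keep_f2();
```

Unlike a real GENERATED ALWAYS column, the trigger silently overwrites whatever the user supplied for f2 rather than rejecting it, which is one reason a first-class implementation is preferable.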
[HACKERS] How can I merge a TargetEntry with a tuple?
Hi, I started working again on my IDENTITY/GENERATED patch. My question is $SUBJECT. This code is in rewriteTargetlist(): new_attr = build_column_default() new_tle = makeTargetEntry((Expr *) new_expr, ...) Now, in ExecInsert() I have to compute the default for IDENTITY/GENERATED between ExecConstraints() and heap_insert(). How can I create a Datum out of either an Expr or a TargetEntry (that contains the computed constant from the default expression) so I can use it after I did a heap_deform_tuple()? Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
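One way this is commonly done elsewhere in the 8.x executor (a hedged sketch, not a tested answer; variable names and the surrounding context setup are illustrative) is to prepare the expression and evaluate it to a Datum:

```c
/* Sketch: turning a default Expr into a Datum with the 8.x executor API.
 * 'expr' would be what build_column_default() returned; 'estate' and
 * 'econtext' come from the executor state of the INSERT in ExecInsert(). */
ExprState  *exprstate;
Datum       value;
bool        isNull;

exprstate = ExecPrepareExpr((Expr *) expr, estate);
value = ExecEvalExprSwitchContext(exprstate, econtext, &isNull, NULL);

/* 'value' and 'isNull' can then be stored into the values[]/nulls[]
 * arrays obtained from heap_deform_tuple() before re-forming the tuple. */
```

For a TargetEntry, the same approach would apply to its expr field; if it already holds a Const, the constvalue can be used directly.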
Re: [HACKERS] 8.2.0 Tarball vs. REL8_2_0 vs. REL8_2_STABLE
Matt Miller írta: The [pgcluster-1.7.0rc1-patch] patch applies to the 8.2.0 tarball ... However, the patch will not apply to cvs branch REL8_2_0. I've been told that the pgcluster patch patches some generated files (parse.h and others, apparently). Yes, I could not at first apply it to REL8_2_0 because the patch file wanted to patch src/backend/parser/gram.c. At that point I started over with a fresh REL8_2_0, ran ./configure; make, and tried the patch again. That's when I got a bunch of failures and fuzz. The problem files are: src/backend/parser/gram.c src/backend/parser/parse.h src/interfaces/libpq/libpq.rc So, I suppose libpq.rc is a derived file, also? Now I have two questions. First, why does pgcluster patch derived files? Is this just sloppy/lazy technique, or could there be some Exactly. E.g. PGCluster patches configure, not configure.in, among others. The sugar on top is that PGCluster ruins the nice portability of PostgreSQL. E.g. plain usage of fork() without considering EXEC_BACKEND is not portable across archs. deeper reason? I realize this is properly to be posed to the pgcluster folks, but they don't seem to be too responsive, at least not on their pgfoundry forums. Second, does it make sense that the derived files that rejected the patch would be so different between the 8.2.0 tarball and my REL8_2_0 build? If the autotools and bison versions are different, they may certainly produce different files. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] [HACKERS] Performance testing of COPY (SELECT) TO
Thanks!!! Tom Lane írta: =?iso-8859-2?Q?B=F6sz=F6rm=E9nyi_Zolt=E1n?= [EMAIL PROTECTED] writes: as per your suggestion, the COPY view TO support was cut and a hint was added. Please, review. Committed after some refactoring to avoid code duplication. Unfortunately, in a moment of pure brain fade, I looked at the wrong item in my inbox and wrote Bernd Helmle's name instead of yours in the commit message :-(. My sincere apologies. Bruce, would you make a note to be sure the right person gets credit in the release notes? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Performance testing of COPY (SELECT) TO
Bruce Momjian írta: Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. Thanks. Would you please add this one instead? The psql built-in \copy (select ...) now also works. Best regards, Zoltán Böszörményi pgsql-copyselect-8.patch.gz Description: Unix tar archive ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Performance testing of COPY (SELECT) TO
Hi, Böszörményi Zoltán írta: Böszörményi Zoltán [EMAIL PROTECTED] writes: With PostgreSQL 8.1.4, I used this: begin; select ... into temp myquery1; copy myquery1 to stdout csv delimiter '|'; rollback; The performance of this would doubtless vary a lot with the temp_buffers setting. Did you try different values? Yes, I did, but now checked back with 8.2CVS. The previously quoted result was achieved with temp_buffers = 1000 on both 8.1.4 and 8.2CVS. On 8.2CVS with temp_buffers = 4096, the 10 client case kills the machine with swapping, but the 3 client runtime with COPY(SELECT) went down to 2:41. The SELECT INTO TEMP case went down to 3:36. It'd also be interesting to time the same way (with a temp table) in devel. I don't remember whether we did any performance work on the COPY CSV data path in this cycle, or whether that was all present in 8.1. In any case it'd be worth proving that the COPY SELECT patch isn't degrading performance of the copy-a-relation case. I will report back with that, say on Monday. It seems my previous mail hasn't reached the hackers list, I answer here. In the export, there is a largish table, that has both many columns and rows. With COPY(SELECT) patch applied: time psql -c "copy (select * from table) to 'file'" dbx COPY 886046 real 0m13.253s user 0m0.000s sys 0m0.000s time psql -c "copy table to 'file'" dbx COPY 886046 real 0m13.234s user 0m0.000s sys 0m0.000s time psql -c "copy table to stdout" dbx > file real 0m15.155s user 0m0.540s sys 0m0.450s time psql -c "copy (select * from table) to stdout" dbx > file real 0m15.079s user 0m0.540s sys 0m0.590s Surprisingly, without the COPY(SELECT) patch it's slower, this is the lowest from five runs, e.g. 
with warm caches: time psql -c "copy table to 'file'" dbx real 0m20.464s user 0m0.000s sys 0m0.010s time psql -c "copy table to stdout" dbx > file real 0m25.753s user 0m0.570s sys 0m0.460s With the original settings, temp_buffers = 1000 on 8.2CVS, the export runtime with one client looks like this: first run 1:44, second run 1:12, third run 1:04. It seems it's a bit faster both on startup and on subsequent runs. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] COPY view
Tom Lane írta: Zoltan Boszormenyi [EMAIL PROTECTED] writes: How about the callback solution for the SELECT case that was copied from the original? Should I consider open-coding in copy.c what ExecutorRun() does, to avoid the callback? Adding a DestReceiver type is a good solution ... although that static variable is not. Instead define a DestReceiver extension struct that can carry the CopyState pointer for you. Done. You could also consider putting the copy-from-view-specific state fields into the DestReceiver instead of CopyState, though this is a bit asymmetric with the relation case, so maybe it's not really cleaner. Left it alone for now. BTW, lose the tuple_to_values function --- it's an extremely bad reimplementation of heap_deform_tuple. Done. copy_dest_printtup also seems coded without regard for the TupleTableSlot access API (read printtup() to see what to do instead). I am still interpreting it. Can you give me some hints besides using slot_getallattrs(slot)? And what's the point of factoring out the heap_getnext loop as CopyRelationTo? It's not like that lets you share any more code. The inside of the loop, ie what you've called CopyValuesTo, is the sharable part. Done. The option parsing and error checking is now common. I also changed it to use transformStmt() in analyze.c. However, both the UNION and subselect cases give me something like this: ERROR: could not open relation 1663/16384/16723: No such file or directory What else can I do with it? Best regards, Zoltán Böszörményi pgsql-copyselect-4.patch.gz Description: Unix tar archive ---(end of broadcast)--- TIP 6: explain analyze is your friend
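The DestReceiver extension struct Tom suggests could look roughly like this (a sketch; the struct and helper names are illustrative, not taken from the committed code):

```c
/* Extend DestReceiver so the receive callback can carry its own state
 * instead of reaching for a static variable.  The DestReceiver member
 * must come first so a DestReceiver* can be cast to the extended type. */
typedef struct
{
    DestReceiver pub;       /* publicly visible part, must be first */
    CopyState    cstate;    /* private state for the COPY output path */
} DR_copy;

/* The receiveSlot callback recovers its private state via a cast: */
static void
copy_dest_printtup(TupleTableSlot *slot, DestReceiver *self)
{
    DR_copy    *myState = (DR_copy *) self;

    /* per the TupleTableSlot access API, force extraction of all
     * attributes before reading slot->tts_values / tts_isnull */
    slot_getallattrs(slot);

    CopyOneRowTo(myState->cstate, slot);    /* hypothetical shared helper */
}
```

The allocator for the receiver then returns &dr->pub after filling in the callback pointers and the CopyState, which is how the executor hands the state back to you on every row.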
Re: [HACKERS] [PATCHES] COPY view
Zoltan Boszormenyi írta: The option parsing and error checking is now common. I also changed it to use transformStmt() in analyze.c. However, both the UNION and subselect cases give me something like this: ERROR: could not open relation 1663/16384/16723: No such file or directory What else can I do with it? But a single SELECT with two tables joined also works, so it must be something trivial. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] COPY view
Zoltan Boszormenyi írta: Zoltan Boszormenyi írta: The option parsing and error checking is now common. I also changed it to use transformStmt() in analyze.c. However, both the UNION and subselect cases give me something like this: ERROR: could not open relation 1663/16384/16723: No such file or directory What else can I do with it? But a single SELECT with two tables joined also works, so it must be something trivial. Now UNIONs and subselects also work. Your concern about copy_dest_printtup() hasn't been addressed yet. Best regards, Zoltán Böszörményi pgsql-copyselect-5.patch.gz Description: Unix tar archive ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] COPY view
Tom Lane írta: [ cc list trimmed to something reasonable ] Zoltan Boszormenyi [EMAIL PROTECTED] writes: OK, here's my current version. The reference leak is fixed. But as my testcase shows, it only works for single selects currently. The parser accepts it but COPY doesn't produce the expected output. Please, suggest a solution. This patch seems to be adding a tremendous amount of duplicated code in copy.c. Why? 1. to minimize the already huge code impact on the relation case. 2. the checks done for the SELECT case are not exactly the same as for the relation case. 3. the relation case is managed by passing around a Relation pointer, e.g. in CopyGetAttnums. This is simply not appropriate for the SELECT case. I will try to clean it up as much as I can, though. How about the callback solution for the SELECT case that was copied from the original? Should I consider open-coding in copy.c what ExecutorRun() does, to avoid the callback? Also, moving checks for the relation case out of copy.c and into analyze.c is inappropriate. The checks you have moved there are actually wrong because you have no lock on the relation at the time you are checking. You could perhaps take a lock at analyze time, but frankly I see no reason for this patch to be messing with the relation case at all. OK, I will put the checks back where they were. As for the UNION problem, try passing the query to transformStmt rather than prejudging where transformStmt will send it. Compare for instance the analyze.c code for ExplainStmt. Thanks. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] COPY view
Hi, Bruce Momjian írta: Alvaro Herrera wrote: Bruce Momjian wrote: Andrew Dunstan wrote: Bruce Momjian wrote: I think Alvaro is saying we need it in a few days, not longer. I thought he was saying today ;-) He actually said now, but I don't think we need it immediately, especially if he is still working on it. We are at least 1-2 weeks away from having all open patches applied. Yes, I'm saying today so that we can all look at it and point out obvious mistakes now, not 2 weeks from now. Release early, release often. If the patch contains a mistake and we find out in 2 weeks, are we going to fix it? No, we are going to reject it. OK, I understand. Böszörményi, post now so we can see where you are, but keep working and send it to us again when you are done. No sense in not posting your working version. OK, here's my current version. The reference leak is fixed. But as my testcase shows, it only works for single selects currently. The parser accepts it but COPY doesn't produce the expected output. Please, suggest a solution. BTW, my first name is Zoltán. Best regards, Zoltán Böszörményi pgsql-copyselect.patch.gz Description: Unix tar archive ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] COPY view
Zoltan Boszormenyi írta: Hi, Bruce Momjian írta: Alvaro Herrera wrote: Bruce Momjian wrote: Andrew Dunstan wrote: Bruce Momjian wrote: I think Alvaro is saying we need it in a few days, not longer. I thought he was saying today ;-) He actually said now, but I don't think we need it immediately, especially if he is still working on it. We are at least 1-2 weeks away from having all open patches applied. Yes, I'm saying today so that we can all look at it and point out obvious mistakes now, not 2 weeks from now. Release early, release often. If the patch contains a mistake and we find out in 2 weeks, are we going to fix it? No, we are going to reject it. OK, I understand. Böszörményi, post now so we can see where you are, but keep working and send it to us again when you are done. No sense in not posting your working version. OK, here's my current version. The reference leak is fixed. But as my testcase shows, it only works for single selects currently. The parser accepts it but COPY doesn't produce the expected output. Please, suggest a solution. I meant that UNION selects and subselects don't work yet. BTW, my first name is Zoltán. Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] COPY view
Alvaro Herrera írta: Zoltan Boszormenyi wrote: OK, here's my current version. The reference leak is fixed. But as my testcase shows, it only works for single selects currently. The parser accepts it but COPY doesn't produce the expected output. Please, suggest a solution. I'm not sure I agree with the approach of creating a fake SELECT * FROM foo in analyze.c in the relation case and passing it back to the parser to create a Query node. That's not there in the original code and you shouldn't need it. Just let the case where COPY gets a relation continue to handle it as it does today, and add a separate case for the SELECT. The exact same code was there before, to parse and rewrite SELECT * FROM view, just not in analyze.c. I will try without it, though. That doesn't help you with the UNION stuff though. :-( ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: GENERATED ... AS IDENTITY, Was: Re: [HACKERS] Feature Freeze
Rod Taylor írta: For db restoration (pg_dump), how do you restore to the same values as previously if it is always regenerated? By making ALWAYS a suggestion for some users instead of always enforced, and providing an override mechanism for it. I assume it only works for relation owners but I've not figured out how the spec does permissions. override clause ::= OVERRIDING USER VALUE | OVERRIDING SYSTEM VALUE In short, pg_dump should append OVERRIDING SYSTEM VALUE to any insert or copy for relations with a GENERATED ALWAYS identity column, and the backend will need to respect that. Aren't INSERT and COPY distinguished in code paths? Yes, they are separate but they also use the same permission set. Any user can copy into a structure at any time and virtually every restriction will be applied normally (CHECK, DEFAULT, etc.). Copy bypasses rules, significant parsing overhead since there is no need to look for subselects, and possibly some triggers; I'm fairly sure that foreign key triggers fire. In short, COPY doesn't bypass enforcement. GENERATED ALWAYS is part of that enforcement and should be included in it. If it is not included, we cannot recommend GENERATED ALWAYS for uses like recording CURRENT_USER in an audit log, since the data could be fudged. OK. So COPY needs an OVERRIDING clause, too, not just INSERT. In the meantime I implemented the discussed restrictions on UPDATE for GENERATED ALWAYS columns, allowing UPDATE tab SET col = DEFAULT only for GENERATED ALWAYS AS ( expr ) columns. I also implemented INSERT ... OVERRIDING { SYSTEM | USER } VALUE. If I got it correctly, OVERRIDING USER VALUE seems to be the same as omitting the OVERRIDING clause... ALWAYS is really only enforced for anyone who doesn't have permission to specify otherwise. Another one that got me is: what do you do if you do this: CREATE TABLE tab (col integer); INSERT INTO tab VALUES (10); ALTER TABLE tab ALTER col GENERATED ALWAYS AS IDENTITY; What is the value for tab.col? 
It would seem that the table should be rewritten with all values for col recalculated -- thus it would be '1'. But wait! Can we add the override clause here too, to keep the old values and change the enforcement for new tuples only? I don't think we should rewrite existing rows, because when a row was inserted, the stored value was valid according to the rules at that time. And what if you have more than one row in that table? SERIAL has, until recently, been described as a macro. A tool for setting things up quickly, but many parts of which can be changed by hand afterward. It's not exactly a good source of information on how this structure should work. For one, you can easily override the suggested default a serial gives at any time as any user. The intention of ALWAYS is to prevent exactly that behaviour. I don't have an opinion on ALTER TABLE changes for this one way or the other. It was my intention to advise that a group decision is required and some research into what other databases do in this case. I believe MSSQL and DB2 both implement this functionality. I see. I looked a bit into the TODO entry about ALTER TABLE tab RENAME col TO newcol also renaming the sequence. My question is: is it legal to call pg_get_serial_sequence() from src/backend/parser/analyze.c:transformAlterTableStmt()? This would be the easiest way to issue an ALTER TABLE oldseq RENAME newseq command automatically. And I think I found a bug in PostgreSQL. If I do this: create table tab1 (id serial, t text); -- creates tab1_id_seq create table tab2 (id serial, t text) inherits (tab1); -- creates tab2_id_seq drop table tab1 cascade; then tab1_id_seq also gets dropped but tab2_id_seq doesn't. Both 8.1.4 and current 8.2CVS do this. Oh, and one more item. These expressions have the same abilities as a CHECK constraint for referencing other columns. 
This example comes from an IBM guide: CREATE TABLE T1(c1 INT, c2 DOUBLE, c3 DOUBLE GENERATED ALWAYS AS (c1 + c2), c4 SMALLINT GENERATED ALWAYS AS (CASE WHEN c1 > c2 THEN 1 ELSE NULL END) ); For this to work, we need to lift the restriction on DEFAULT so other columns can appear in the expression. Dependencies must be tracked between columns so GENERATED ALWAYS columns on UPDATE and DEFAULT/ GENERATED ALWAYS columns on INSERT get their computed values. Circular dependencies must be avoided, etc. Hm. Here is what IBM has to say about ALTER TABLE and GENERATED ALWAYS: GENERATED Specifies that DB2 generates values for the column. ALWAYS Specifies that DB2 will always generate a value for the
Re: GENERATED ... AS IDENTITY, Was: Re: [HACKERS] Feature Freeze
Hi, next version follows. Changes: - Supports the OVERRIDING { USER | SYSTEM } VALUE syntax; not yet documented, and I have doubts about the USER variant - UPDATE is forbidden entirely on GENERATED ALWAYS AS IDENTITY columns; UPDATE tab SET col = DEFAULT is allowed on GENERATED ALWAYS AS ( expr ) columns - ALTER TABLE tab ALTER col RESTART [WITH] N and ALTER TABLE tab ALTER col SET identity_options are supported but not yet documented - extended the test case, but the expected .out wasn't updated, so 1 out of 101 tests fails. After exercising with the last one, ALTER tab RENAME TO newtab and ALTER tab RENAME col TO newcol should be easy. With the infrastructure introduced to correctly support the first two changes (new column attribute: attidentity) it should be easy to implement checks to disallow ALTER TABLE tab DROP DEFAULT on IDENTITY columns. Best regards, Zoltán Böszörményi Zoltan Boszormenyi wrote: Rod Taylor wrote: For db restoration (pg_dump), how do you restore to the same values as previously if it is always regenerated? By making ALWAYS a suggestion for some users instead of always enforced and providing an override mechanism for it. I assume it only works for relation owners but I've not figured out how the spec does permissions. override clause ::= OVERRIDING USER VALUE | OVERRIDING SYSTEM VALUE In short, pg_dump should append OVERRIDING SYSTEM VALUE to any insert or copy for relations with a GENERATED ALWAYS identity column, and the backend will need to respect that. Aren't INSERT and COPY distinguished in code paths? Yes, they are separate but they also use the same permission set. Any user can copy into a structure at any time and virtually every restriction will be applied normally (CHECK, DEFAULT, etc.). COPY bypasses rules and significant parsing overhead (since there is no need to look for subselects), and possibly some triggers are bypassed; I'm fairly sure that foreign key triggers fire. In short, COPY doesn't bypass enforcement. 
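The UPDATE rules listed above can be sketched as follows. This is an illustration of the behaviour described in the changelog, not the patch's actual regression test; table names are invented.

```sql
-- Assumed tables for illustration, per the rules in this patch version.
CREATE TABLE ta (id integer GENERATED ALWAYS AS IDENTITY, t text);
CREATE TABLE tb (a integer, b integer,
                 s integer GENERATED ALWAYS AS (a + b));

UPDATE ta SET id = 5;        -- rejected: UPDATE forbidden entirely on AS IDENTITY
UPDATE tb SET s = 5;         -- rejected: explicit value on a generated column
UPDATE tb SET s = DEFAULT;   -- allowed: GENERATED ALWAYS AS ( expr ) recomputes
ALTER TABLE ta ALTER id RESTART WITH 100;  -- supported but undocumented form
```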
GENERATED ALWAYS is part of that enforcement and should be included in it. If it is not included, we cannot recommend GENERATED ALWAYS for uses like recording CURRENT_USER in an audit log since the data could be fudged. OK. So COPY needs an OVERRIDING clause, too, not just INSERT. In the meantime I implemented the discussed restrictions on UPDATE for GENERATED ALWAYS columns, allowing UPDATE tab SET col = DEFAULT only for GENERATED ALWAYS AS ( expr ). I also implemented INSERT ... OVERRIDING { SYSTEM | USER } VALUE. If I got it correctly, OVERRIDING USER VALUE seems to be the same as omitting the OVERRIDING clause... ALWAYS is really only enforced for anyone who doesn't have permission to specify otherwise. Another one that got me is what do you do if you do this: CREATE TABLE tab (col integer); INSERT INTO tab VALUES (10); ALTER TABLE tab ALTER col GENERATED ALWAYS AS IDENTITY; What is the value for tab.col? It would seem that the table should be rewritten with all values for col recalculated -- thus it would be '1'. But wait! Can we add the override clause here too, to keep the old values and change the enforcement for new tuples only? I don't think we should rewrite existing rows, because when a row was inserted, the stored value was valid according to the rules at that time. What if you have more than one row in that table? SERIAL has, until recently, been described as a macro. A tool for setting things up quickly, but many parts of which can be changed by hand afterward. It's not exactly a good source of information on how this structure should work. For one, you can easily override the suggested default a serial gives at any time as any user. The intention of ALWAYS is to prevent exactly that behaviour. I don't have an opinion on ALTER TABLE changes for this one way or the other. It was my intention to advise that a group decision is required, along with some research into what other databases do in this case. I believe MSSQL and DB2 both implement this functionality. I see. 
I looked a bit into the TODO entry that's about ALTER TABLE tab ALTER col RENAME newcol should also rename the sequence. My question is: is it legal to call pg_get_serial_sequence() from src/backen/parser/analyze.c:transformAlterTableStmt()? This would be the easiest way to issue an ALTER TABLE oldseq RENAME newseq command automatically. And I think I found a bug in PostgreSQL. If I do this: create table tab1 (id serial, t text); -- creates tab1_id_seq create table tab2 (id serial, t text) inherits (tab1); -- creates tab2_id_seq drop table tab1 cascade; then tab1_id_seq gets also dropped but tab2_id_seq doesn't. Both 8.1.4 and current 8.2CVS do this. Oh, and one more item. These expressions have the same abilities as a CHECK constraint for referencing other columns. This example comes from an IBM Guide: CREATE TABLE T1(c1 INT, c2 DOUBLE, c3 DOUBLE
GENERATED ... AS IDENTITY, Was: Re: [HACKERS] Feature Freeze
Hi, I have progressed a bit with my pet project, a.k.a. $SUBJECT. Now GENERATED ALWAYS AS IDENTITY and GENERATED ALWAYS AS ( expr ) work as intended. Documentation was also extended. Some test cases are also included, which show that ALTER TABLE ALTER TYPE keeps both the sequence and the GENERATED ALWAYS property. Gzipped patch is attached. Next steps are: - pg_dump support - more ALTER TABLE support for adding and dropping IDENTITY and GENERATED ALWAYS features - more testing I still maintain that I don't see any standard-mandated dependency between GENERATED AS IDENTITY and NEXT VALUE FOR, but obviously both require SEQUENCE as a supported feature in parallel. I can be proven wrong, though, but please quote the section# and text where it can be found in the standard. As for why GENERATED ALWAYS AS IDENTITY is useful: consider someone who is coming from another DBMS (Informix, Access, etc.) where INSERT INTO table (id, ...) VALUES (0, ...); inserts the next value of the autoincrement field instead of 0. Leaving fields out of the INSERT is not allowed in the source for documentation reasons, and writing DEFAULT is not handy or not found in that legacy DBMS' feature set. Multiply that by N applications that were written that way over the years of the lifespan of a large project, count in the human resistance to learning something new (say a 2.5x multiplier, but that may be under-estimated :-) ), and a feature that makes porting easier will be a cheered feature. IIRC Bruce Momjian himself wrote on this list that ease-of-use features can boost the PostgreSQL userbase pretty quickly. So, please review my patch in its current state and decide whether it's an 8.2-worthy feature. BTW, is there anyone working on the COPY FROM ( select ) feature? 
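The porting scenario above can be illustrated with a small sketch. The table and values are invented; the point is that GENERATED BY DEFAULT AS IDENTITY lets a ported application keep listing the column explicitly while still getting generated values.

```sql
-- Illustrative only: the standard BY DEFAULT form keeps legacy
-- INSERT statements (which always list every column) working.
CREATE TABLE orders (
    id   integer GENERATED BY DEFAULT AS IDENTITY,
    item text
);

INSERT INTO orders (id, item) VALUES (DEFAULT, 'first');  -- generator supplies id
INSERT INTO orders (id, item) VALUES (1000, 'manual');    -- explicit value still allowed
```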
Thanks in advance and best regards, Zoltán Böszörményi psql-serial-12.diff.gz Description: Unix tar archive ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: GENERATED ... AS IDENTITY, Was: Re: [HACKERS] Feature Freeze
Rod Taylor írta: On Tue, 2006-08-01 at 18:10 +0200, Zoltan Boszormenyi wrote: Hi, I have progressed a bit with my pet project, a.k.a $SUBJECT. Now GENERATED ALWAYS AS IDENTITY and GENERATED ALWAYS AS ( expr ) work as intended. Documentation was also extended. I'm only commenting because I debated trying to implement this feature a couple of times. Thanks for commenting it. The ugliness required for pg_dump put me off of doing it. I haven't looked into it yet. I did not see a test for enforcement during COPY. That was sort of intended, COPY is expected to pull back the same record it wrote out. But see below. UPDATE restrictions appear to have been missed as well: 4) If set clause SC specifies an object column that references a column of which some underlying column is either a generated column or an identity column whose descriptor indicates that values are always generated, then the update source specified in SC shall consist of a default specification. object column is the update target, or the left hand side of the equation. In short, if a column marked GENERATED ALWAYS is updated then it must be to DEFAULT or not provided as an update target. CREATE TABLE tab (col integer GENERATED ALWAYS AS IDENTITY); UPDATE tab SET col = DEFAULT; -- ACCEPTED UPDATE tab SET col = 1; -- ERROR Yes, I have also read that detail but not yet implemented it. I was too happy that I found a straightforward way to make GENERATED ALWAYS work. For db restoration (pg_dump), how do you restore to the same values as previously if it is always regenerated? By making ALWAYS a suggestion for some users instead of always enforced and providing an override mechanism for it. I assume it only works for relation owners but I've not figured out how the spec does permissions. 
override clause ::= OVERRIDING USER VALUE | OVERRIDING SYSTEM VALUE In short, pg_dump should append OVERRIDING SYSTEM VALUE to any insert or copy for relations with an GENERATED ALWAYS identity column and the backend will need to respect that. Aren't INSERT and COPY distinguished in code paths? (I don't have too deep knowledge about PostgreSQL internals, yet.) If they are, OVERRIDING SYSTEM VALUE will be needed only when pg_dump produces INSERTs. ALWAYS is really only enforced for anyone who doesn't have permission to specify otherwise. Another one that got me is what do you do if you do this: CREATE TABLE tab (col integer); INSERT INTO tab VALUES (10); ALTER TABLE tab ALTER col GENERATED ALWAYS AS IDENTITY; What is the value for tab.col? It would seem that the table should be rewritten with all values for col recalculated -- thus it would be '1'. But wait! Can we add the override clause here too to keep the old values and change the enforcement for new tuples only? I don't think we should rewrite existing rows because when it was inserted, the stored value was valid according to the rules at that time. What if you have more than one rows in that table? Which record gets which value? You cannot know which record was inserted first because subsequent updates may ruin that order before the ALTER TABLE. And recalculating the max value of col isn't too reliable if another session is also inserting records. And what about non-unique columns? Plain SERIALs aren't declared unique automatically, either. Consider the following: CREATE TABLE tab (col integer); INSERT INTO tab VALUES (10); ALTER TABLE tab ALTER col GENERATED ALWAYS AS IDENTITY( MINVALUE 1 MAXVALUE 12 CYCLE ); Here I expect equal values and I don't want existing rows rewritten. E.g. if you want a new start value, you will also need to issue ALTER TABLE tab ALTER col RESTART WITH n; which I started to implement. 
Also, for a unique SERIAL column, you can still insert a record with an out-of-order number and one of the INSERTs that reach that number will fail with unique violation. e.g. it's not a real autoincrementer field. Or you can alter a sequence that supports such a column. PostgreSQL documents both behaviour and I wanted to keep it. Thanks for the comments, Zoltán Böszörményi Some test cases are also included, that shows that ALTER TABLE ALTER TYPE keeps both the sequence and the GENERATED ALWAYS property. Gzipped patch is attached. Next steps are: - pg_dump support - more ALTER TABLE support for adding and dropping IDENTITY and GENERATED ALWAYS features - more testing I still maintain that I don't see any standard requirement between the GENERATED AS IDENTITY and NEXT VALUE FOR but obviously both require SEQUENCE as supported feature in parallel. I can be proven wrong, though, but please, quote section# and text where it can be found in the standard. As for why GENERATED ALWAYS AS IDENTITY is useful? Consider someone who is coming from another DBMS (Informix, Access, etc.) where INSERT INTO table (id
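The documented SERIAL caveat described above — an out-of-order manual insert later colliding with the sequence — can be reproduced with a short sketch (names are illustrative):

```sql
CREATE TABLE t (id serial PRIMARY KEY, v text);

INSERT INTO t (v) VALUES ('a');           -- sequence supplies id = 1
INSERT INTO t (id, v) VALUES (2, 'b');    -- manual value; sequence NOT advanced
INSERT INTO t (v) VALUES ('c');           -- nextval yields 2 -> unique violation
```

This is exactly why a SERIAL column is not a true autoincrement field: the sequence and the stored values are only loosely coupled.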
Re: [HACKERS] Three weeks left until feature freeze
Hi, Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: I am working on adding a new column constraint, namely GENERATED [ ALWAYS | BY DEFAULT ] AS [ IDENTITY ( sequence_options ) | ( expression ) ] Doesn't this still have the issue that we're taking over spec-defined syntax to represent behavior that does not quite match the spec-defined semantics? It's not clear to me how closely tied this syntax is to NEXT VALUE FOR, but if it's got any of the latter's action-at-a-distance subtleties then I think we ought to leave well enough alone till we have a solution for that. regards, tom lane Sorry for not answering earlier, I was on holiday. I read again sections 6.13 and 9.21, about NEXT VALUE FOR and generation of the next value of a sequence generator, respectively. If I see it right, neither of them requires the other one. The IDENTITY COLUMN syntax in section 11.4 only mentions section 9.21. Section 14.8 about the INSERT statement does not even mention it, only refers to the default clause in section 11.5. And that also doesn't say anything about either NEXT VALUE FOR or next-value generation of a sequence. And I saw comments in the PostgreSQL documentation that go like this: the standard doesn't specify, so we are conforming. Hint, hint. ;-) I think IDENTITY COLUMN (and GENERATED ALWAYS AS) can stand on its own without NEXT VALUE FOR. Best regards, Zoltán Böszörményi
Re: [HACKERS] Three weeks left until feature freeze
Hi, Bruce Momjian wrote: There are roughly three weeks left until the feature freeze on August 1. If people are working on items, they should be announced before August 1, and the patches submitted by August 1. If the patch is large, it should be discussed now and an intermediate patch posted to the lists soon. FYI, we don't have many major features ready for 8.2. -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDB http://www.enterprisedb.com I am working on adding a new column constraint, namely GENERATED [ ALWAYS | BY DEFAULT ] AS [ IDENTITY ( sequence_options ) | ( expression ) ] I surely won't be ready by August 1 as I am going on holiday this weekend for two weeks. Anyway, I submit what I have in my hands now and describe what it does and what it lacks. Maybe someone can fill in the missing gaps; I didn't have much time to dig deeper into the PostgreSQL code. Lacks: - proper sgml documentation - pg_dump support - GENERATED ALWAYS doesn't work (yet[*]) - ALTER TABLE support What works: - GENERATED ... AS IDENTITY creates the sequence with the given parameters - any data type that has an implicit cast from int8 can be defined as GENERATED ... AS IDENTITY DEFAULT and GENERATED are mutually exclusive and I tweaked gram.y to make it so, and GENERATED ... is now parsed as a column constraint. Unfortunately I introduced one shift/reduce conflict and I don't see a way to solve it. SQL2003 says that only one IDENTITY column may exist in a table; I didn't add a check for that, and it may be useful to allow more than one. Also, the parsing is very simplistic: it allows GENERATED BY DEFAULT AS ( expr ) and I made it behave exactly like DEFAULT. It can be eliminated in the parser, though. Tom Lane was unhappy with my previous attempt as it introduced SERIAL, SERIAL4, SERIAL8 and BIGSERIAL as keywords. This time I kept it the way PostgreSQL always had it. 
[*] I introduced a new column attribute (attforceddef) and it could be used in rewriteHandler.c to override given data with the default for a GENERATED ALWAYS column. I must have missed some places where I have to assign ColumnDef->force_default to FormData_pg_attribute->attforceddef or vice-versa, but I am certain that not too many are left. Needless to say, it passes make check. Please review and point out my mistakes again. :-) Best regards, Zoltán Böszörményi psql-serial-8.diff.gz Description: Unix tar archive
Re: [HACKERS] Extended SERIAL parsing
Hi, Jim C. Nasby wrote: On Mon, Jun 12, 2006 at 02:27:31PM +0200, Böszörményi Zoltán wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: after some experimentation, I came up with the attached patch, which implements parsing the following SERIAL types: As has been pointed out before, it would be a seriously bad idea to implement the SQL syntax for identity columns without matching the SQL semantics for them. That would leave us behind the eight-ball when we wanted to implement the SQL semantics. Right now we have a useful but non-standard semantics, and a useful but non-standard syntax, and those two should stick together. Well, I read all sections of 5WD-02-Foundation-2003-09.pdf where identity appears; here is the list of changes that will be needed for an identity column: Have you read the archives on the recent discussions that have taken place about whether SERIAL should be a black box or not? IIRC most of this was all hashed out in that thread. I just read it thoroughly, and the issues I listed weren't mentioned in the black box thread at all. I am trying to implement the standard syntax ( and gradually the conformant behaviour ) along the lines of sections 4.14.7, 11.3, 11.4, 11.7, 11.11, 11.12, 11.17 and 14.8. Best regards, Zoltán Böszörményi
[HACKERS] Extended SERIAL parsing
Hi, after some experimentation, I came up with the attached patch, which implements parsing the following SERIAL types: SERIAL SERIAL GENERATED { ALWAYS | BY DEFAULT } SERIAL GENERATED [ ALWAYS | BY DEFAULT ] AS IDENTITY( sequence options ) The underlying type is still int4 or int8, so the problems you discussed aren't solved. But at least the current semantics is kept. It passes all regression tests, and it works, too:

# create table proba (i serial generated as identity(minvalue 5 maxvalue 10) primary key, t text);
NOTICE: CREATE TABLE will create implicit sequence proba_i_seq for serial column proba.i
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index proba_pkey for table proba
CREATE TABLE
# insert into proba (t) values ('a');
INSERT 0 1
# select * from proba;
 i | t
---+---
 5 | a
(1 row)

For now, GENERATED { ALWAYS | BY DEFAULT } are just filler. The condition (column->is_serial && column->force_default) can help enforce GENERATED ALWAYS at INSERT time and can also help fix the two TODO entries about SERIAL. 
Best regards, Zoltán Böszörményi

diff -ur postgresql-8.2/src/backend/parser/analyze.c postgresql-8.2-serial/src/backend/parser/analyze.c
--- postgresql-8.2/src/backend/parser/analyze.c	2006-04-30 20:30:39.0 +0200
+++ postgresql-8.2-serial/src/backend/parser/analyze.c	2006-06-11 23:36:22.0 +0200
@@ -825,40 +825,17 @@
 transformColumnDefinition(ParseState *pstate, CreateStmtContext *cxt,
 						  ColumnDef *column)
 {
-	bool		is_serial;
 	bool		saw_nullable;
 	Constraint *constraint;
 	ListCell   *clist;

 	cxt->columns = lappend(cxt->columns, column);

-	/* Check for SERIAL pseudo-types */
-	is_serial = false;
-	if (list_length(column->typename->names) == 1)
-	{
-		char	   *typname = strVal(linitial(column->typename->names));
-
-		if (strcmp(typname, "serial") == 0 ||
-			strcmp(typname, "serial4") == 0)
-		{
-			is_serial = true;
-			column->typename->names = NIL;
-			column->typename->typeid = INT4OID;
-		}
-		else if (strcmp(typname, "bigserial") == 0 ||
-				 strcmp(typname, "serial8") == 0)
-		{
-			is_serial = true;
-			column->typename->names = NIL;
-			column->typename->typeid = INT8OID;
-		}
-	}
-
 	/* Do necessary work on the column type declaration */
 	transformColumnType(pstate, column);

 	/* Special actions for SERIAL pseudo-types */
-	if (is_serial)
+	if (column->is_serial)
 	{
 		Oid			snamespaceid;
 		char	   *snamespace;
@@ -898,7 +875,7 @@
 		 */
 		seqstmt = makeNode(CreateSeqStmt);
 		seqstmt->sequence = makeRangeVar(snamespace, sname);
-		seqstmt->options = NIL;
+		seqstmt->options = column->seq_opts;

 		cxt->blist = lappend(cxt->blist, seqstmt);
diff -ur postgresql-8.2/src/backend/parser/gram.y postgresql-8.2-serial/src/backend/parser/gram.y
--- postgresql-8.2/src/backend/parser/gram.y	2006-05-27 19:38:45.0 +0200
+++ postgresql-8.2-serial/src/backend/parser/gram.y	2006-06-11 23:42:02.0 +0200
@@ -275,6 +275,7 @@
 %type <boolean>	opt_instead opt_analyze
 %type <boolean>	index_opt_unique opt_verbose opt_full
 %type <boolean>	opt_freeze opt_default opt_recheck
+%type <boolean>	ColIdGen ColOptIdGen
 %type <defelt>	opt_binary opt_oids copy_delimiter
 %type <boolean>	copy_from opt_hold
@@ -284,8 +285,8 @@
 %type <node>	fetch_direction select_limit_value select_offset_value
-%type <list>	OptSeqList
-%type <defelt>	OptSeqElem
+%type <list>	OptSeqList OptSerialSeqList
+%type <defelt>	OptSeqElem OptSerialSeqElem
 %type <istmt>	insert_rest
@@ -313,7 +314,7 @@
 %type <range>	relation_expr_opt_alias
 %type <target>	target_el insert_target_el update_target_el insert_column_item
-%type <typnam>	Typename SimpleTypename ConstTypename
+%type <typnam>	Typename SimpleTypename SerialTypename ConstTypename
 				GenericType Numeric opt_float Character ConstCharacter
 				CharacterWithLength CharacterWithoutLength
@@ -357,10 +358,10 @@
 /* ordinary key words in alphabetical order */
 %token <keyword> ABORT_P ABSOLUTE_P ACCESS ACTION ADD_P ADMIN AFTER
-	AGGREGATE ALL ALSO ALTER ANALYSE ANALYZE AND ANY ARRAY AS ASC
+	AGGREGATE ALL ALSO ALTER ALWAYS ANALYSE ANALYZE AND ANY ARRAY AS ASC
 	ASSERTION ASSIGNMENT ASYMMETRIC AT AUTHORIZATION
-	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
+	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BIGSERIAL BINARY BIT
 	BOOLEAN_P BOTH BY
 	CACHE CALLED CASCADE CASCADED CASE CAST CHAIN CHAR_P
@@ -380,11 +381,11 @@
 	FALSE_P FETCH FIRST_P FLOAT_P FOR FORCE FOREIGN FORWARD
 	FREEZE FROM FULL FUNCTION
-	GLOBAL GRANT GRANTED GREATEST GROUP_P
+	GENERATED GLOBAL GRANT GRANTED GREATEST GROUP_P
 	HANDLER HAVING HEADER_P HOLD HOUR_P
-	IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P INCLUDING INCREMENT
+	IDENTITY IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P INCLUDING INCREMENT
 	INDEX INHERIT INHERITS INITIALLY INNER_P INOUT INPUT_P
 	INSENSITIVE INSERT INSTEAD INT_P INTEGER INTERSECT
 	INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -417,7 +418,7 @@
 	ROLE ROLLBACK ROW ROWS RULE
 	SAVEPOINT SCHEMA SCROLL SECOND_P SECURITY SELECT SEQUENCE
-
Re: [HACKERS] Extended SERIAL parsing
Rod Taylor wrote: The condition (column->is_serial && column->force_default) can help enforce GENERATED ALWAYS at INSERT time and can also help fix the two TODO entries about SERIAL. You will need to include the insert components of the spec which allow for overriding GENERATED ALWAYS during an INSERT, extend that to COPY, and teach pg_dump how to use them. OK, that's over my head at the moment. :-) Maybe the wizards here will pick up my patch and complete it. (I hope.) Best regards, Zoltán Böszörményi
[HACKERS] SERIAL problems?
Hi, I just saw these in the TODO list: o %Disallow changing DEFAULT expression of a SERIAL column? This should be done only if the existing SERIAL problems cannot be fixed. o %Disallow ALTER SEQUENCE changes for SERIAL sequences because pg_dump does not dump the changes What are the existing problems? I am asking because I am experimenting with implementing the SQL2003-compliant form of the serial type, to be able to specify the underlying sequence parameters: SERIAL [ GENERATED [ ALWAYS | BY DEFAULT ] AS IDENTITY ( [ INCREMENT [ BY ] increment ] [ MINVALUE minvalue | NO MINVALUE ] [ MAXVALUE maxvalue | NO MAXVALUE ] [ START [ WITH ] start ] [ CACHE cache ] [ [ NO ] CYCLE ] ) ] Best regards, Zoltán Böszörményi
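For concreteness, a usage sketch of the grammar quoted above. The exact spelling follows the proposed SQL2003 form and is not final; the table and values are invented.

```sql
-- Illustrative only: a serial column with explicit sequence parameters,
-- e.g. invoice numbers restarting each year at YYYY0001.
CREATE TABLE invoice (
    id serial GENERATED BY DEFAULT AS IDENTITY
        ( INCREMENT BY 1 MINVALUE 20050001 MAXVALUE 20059999
          START WITH 20050001 NO CYCLE ),
    total numeric
);
```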
Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much
Hi! Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM largetable' > /dev/null results in psql consuming vast quantities of memory. Why is this? Is it different without the -A? I'm reading this as just another uninformed complaint about libpq's habit of buffering the whole query result. It's possible that there's a memory leak in the -A path specifically, but nothing said so far provided any evidence for that. regards, tom lane So, is libpq always buffering the result? Thanks, I thought psql buffers only because in its formatted output mode it has to know the widest value for all the columns. Then the SELECT INTO TEMP ; COPY TO STDOUT solution I found is _the_ solution. I guess the libpq-based ODBC driver suffers from the same problem, then? It certainly explains the performance problems I observed: the server finishes the query, the ODBC driver (or libpq underneath) fetches all the records, and the application receives the first record only after all that. Nice. Best regards, Zoltán Böszörményi
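The SELECT INTO TEMP ; COPY TO STDOUT workaround mentioned above can be sketched like this (table names are illustrative). COPY streams rows to the client as they are produced, so psql never has to hold the whole result set in memory:

```sql
BEGIN;
-- Materialize the query result in a temp table, then stream it out.
SELECT * INTO TEMP tmp_result FROM largetable;
COPY tmp_result TO STDOUT;
COMMIT;  -- the temp table disappears at session end
```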
Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much
Andrew Dunstan wrote: Mark Woodward wrote: Jim C. Nasby [EMAIL PROTECTED] writes: On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote: I'm reading this as just another uninformed complaint about libpq's habit of buffering the whole query result. It's possible that there's a memory leak in the -A path specifically, but nothing said so far provided any evidence for that. Certainly seems like it. It seems like it would be good to allow for libpq not to buffer, since there are cases where it's not needed... See past discussions. The problem is that libpq's API says that when it hands you back the completed query result, the command is complete and guaranteed not to fail later. A streaming interface could not make that guarantee, so it's not a transparent substitution. I wouldn't have any strong objection to providing a separate API that operates in a streaming fashion, but defining it is something no one's bothered to do yet. In practice, if you have to code to a variant API, it's not that much more trouble to use a cursor... Wouldn't the COPY (select ...) TO STDOUT format being discussed solve this for free? Yes, it would for me. It won't solve it in the general case for clients that expect a result set. ISTM that using a cursor is a perfectly reasonable answer, though. The general case cannot be applied to all particular cases. E.g. you cannot use cursors from shell scripts, and just for producing an export file it's not too reasonable either. Redirecting psql's output or COPY is enough. Best regards, Zoltán Böszörményi
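The cursor approach suggested above looks like this for an API client (batch size and names are illustrative). Each FETCH transfers only one batch, so neither libpq nor the application buffers the full result:

```sql
BEGIN;
DECLARE c NO SCROLL CURSOR FOR SELECT * FROM largetable;
FETCH 1000 FROM c;   -- repeat until fewer than 1000 rows come back
CLOSE c;
COMMIT;
```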
Re: [HACKERS] Interesting speed anomaly
Jim C. Nasby wrote: Those queries aren't the same, though. The view is equivalent to SELECT * FROM (select 'AAA' AS prefix,id from table1 union select 'AAA',id from table2 ) view WHERE prefix||id = '...' In this case the prefixes have already been unioned together, so there's no chance for the planner to use the function index. If you break the WHERE clause into separate clauses, such as WHERE prefix='AAA' AND id = '200501' then I think the planner will know which selects it can simply ignore. If that doesn't work, then add 'AAA'||id AS fullid to each of the selects in the view and that should allow the function indexes to be used. Thanks, both methods work very fast now and use the expression indexes. Thanks for the patience and the explanations. Best regards, Zoltán Böszörményi
[HACKERS] Interesting speed anomaly
Hi, I am trying to prove whether PostgreSQL is faster than Informix so I can feed the management with numbers. In our system, there is an invoice browser view, a UNION of 12 different tables. (Yes, there are 12 different kinds of invoices, like new or second-hand cars, warranty, service, etc., with in/out directions; all have to be numbered from 1 starting each year, e.g. 20050001.) The view contains a constant field, the so-called invoice prefix; e.g. CARO is CAR-OUT, the invoice of sold new cars, and so on. SELECT * or SELECT COUNT(*) from this view for listing all invoices is overall faster. Searching for one invoice, knowing the prefix and the invoice number, is the more interesting case, however. Informix results:

$ time echo select * from v_invoice_browse where code = 'CARO' and inv_no = 20020 | dbaccess db
Database selected. ... 1 row(s) retrieved. Database closed.
real 0m1.263s user 0m0.530s sys 0m0.000s

$ time echo select * from v_invoice_browse where code||inv_no = 'CARO20020' | dbaccess db
Database selected. ... 1 row(s) retrieved. Database closed.
real 0m7.942s (varying between 7.5 and 14 seconds) user 0m0.510s sys 0m0.000s

PostgreSQL results:

$ time echo select * from v_invoice_browse where code = 'CARO' and inv_no = 20020 |psql db
... (1 row)
real 0m0.061s user 0m0.000s sys 0m0.010s

$ time echo select * from v_invoice_browse where code||inv_no = 'CARO20020' |psql db
... (1 row)
real 0m18.158s (varying between about 18 and 24 seconds) user 0m0.000s sys 0m0.020s

The timing of the first query varied very little between five runs. The timing variations of the second query are indicated above; they naturally depend on other system activity. Is there a way to speed this operation up? Maybe the planner could know whether a field in a view is constant, or can only have limited values, as in this situation where we have a union of tables and every member of the union has a constant in that field. Or there may be other ways to speed up comparing concatenated values. 
Best regards, Zoltán Böszörményi
Re: [HACKERS] Interesting speed anomaly
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: $ time echo select * from v_invoice_browse where code||inv_no = 'CARO20020' | dbaccess db Is there a way to speed this operation up? Make an expression index on code||inv_no, if you think this case is important enough to be worth maintaining an extra index for. (This is not on-topic for -hackers, IMHO. Try pgsql-performance.) regards, tom lane Thanks for both the hint and the pointer to the mailing list. My problem is, I can't see how I could create any index on a view. PostgreSQL refuses it: create index iinvbrowse1 on v_invoice_browse ((code||inv_no)); ERROR: v_invoice_browse is not a table Creating indexes on the 12 invoice tables, like this: create index iinvoice1 on invoice1 (('PREFIX'||id)); can be done but it doesn't seem to help; at least the query run time doesn't decrease. Remember, the view is a union of the 12 tables, and the 'code' (invoice prefix) field is a fake constant field to distinguish between the different invoice types. And we have the 'inv_no' field in the view, but the serial fields in the separate invoice tables are called 'szam'. So there is no direct linkage between the view and table field names except the view definition. That still leaves me wondering. Both Informix and PostgreSQL seem to do the query using a sequential scan, but the above WHERE condition is computed about two times faster in Informix, while all the other usual queries are at least five times faster in PostgreSQL than in Informix. That's why I sent it to pgsql-hackers; maybe the hackers are interested in further improving PostgreSQL. ;-) I will ask on pgsql-performance, thanks. Best regards, Zoltán Böszörményi
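To show why the per-table indexes above go unused: the planner only considers an expression index when the query against that base table contains the indexed expression verbatim, which the view's unioned WHERE clause never does. A sketch with illustrative names:

```sql
-- Expression index on one underlying invoice table.
CREATE INDEX iinvoice1 ON invoice1 (('CARO' || szam));

-- Uses the index: the predicate matches the indexed expression exactly.
SELECT * FROM invoice1 WHERE 'CARO' || szam = 'CARO20020';

-- Does not: after the UNION, code||inv_no is an opaque computed value.
SELECT * FROM v_invoice_browse WHERE code || inv_no = 'CARO20020';
```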
Re: [HACKERS] Interesting speed anomaly
Gavin Sherry wrote: On Thu, 15 Dec 2005, Gavin Sherry wrote: On Wed, 14 Dec 2005, Zoltan Boszormenyi wrote: Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: $ time echo select * from v_invoice_browse where code||inv_no = 'CARO20020' | dbaccess db Is there a way to speed this operation up? Make an expression index on code||inv_no, if you think this case is important enough to be worth maintaining an extra index for. (This is not on-topic for -hackers, IMHO. Try pgsql-perform.) regards, tom lane Thanks for both the hint and the pointer to the mailing list. My problem is, I can't see how I could create any index on a view. PostgreSQL refuses it: create index iinvbrowse1 on v_invoice_browse ((code||inv_no)); ERROR: v_invoice_browse is not a table Creating indexes on the 12 invoice tables, like this: create index iinvoice1 on invoice1 (('PREFIX'||id)); Are you creating the index on (code || id), or on the string 'PREFIX' or some other literal? Sorry, I sent this email instead of cancelling it. I take it 'code' is a string generated by the query, for example: select 'CAR' as code, * from cars ? If so, it seems strange that we do not use the expressional index. Could you send the output of explain analyze? Thanks, Gavin The VIEW is created like this (shorter example): create view v1 (code,num) as select 'AAA',id from table1 union select 'BBB',id from table2; I created the indexes on the individual tables as create index index1 on table1 (('AAA'||id)); create index index2 on table2 (('BBB'||id)); Each index uses the same literal its table is associated with in the VIEW. Here is the explain analyze output, on PostgreSQL 8.0.3. I can test the same on 8.1.1 tomorrow.
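The setup under discussion can be sketched as follows (a minimal sketch with illustrative table definitions; the real schema has 12 invoice tables). One detail worth noting: a plain UNION forces the Unique/Sort step that shows up in the plan, which sits between the filter and the per-table indexes; UNION ALL, which is safe here since each branch has a distinct literal, removes that step.

```sql
-- Two of the underlying invoice tables (illustrative definitions):
CREATE TABLE table1 (id serial PRIMARY KEY, amount numeric);
CREATE TABLE table2 (id serial PRIMARY KEY, amount numeric);

-- The view fakes a constant 'code' column per branch:
CREATE VIEW v1 (code, num) AS
    SELECT 'AAA'::text, id FROM table1
    UNION
    SELECT 'BBB'::text, id FROM table2;

-- Expression indexes matching each branch's literal:
CREATE INDEX index1 ON table1 (('AAA' || id));
CREATE INDEX index2 ON table2 (('BBB' || id));

-- The slow query: the filter is evaluated above the Unique/Sort
-- that the plain UNION requires, so the per-table indexes go unused.
SELECT * FROM v1 WHERE code || num = 'AAA123';
```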
# explain analyze select * from v_invoice_browse where code||inv_no = 'CARO20020';
                                  QUERY PLAN
------------------------------------------------------------------------------
 Subquery Scan v_invoice_browse  (cost=346661.81..356932.96 rows=403 width=680) (actual time=9184.529..9735.884 rows=1 loops=1)
   Filter: ((code || (inv_no)::text) = 'CARO20020'::text)
   ->  Unique  (cost=346661.81..355523.19 rows=80558 width=188) (actual time=9184.313..9602.540 rows=84693 loops=1)
         ->  Sort  (cost=346661.81..346863.21 rows=80558 width=188) (actual time=9184.310..9241.868 rows=84693 loops=1)
               Sort Key: ?column?, prefix, szam, divitem, ?column?, ?column?, partner, pmode, inv_date, ins_time, ship_date, pterm, netto, total, vat, decimal14_2_1, ?column?, ?column?, rectify, invoice_rect, status, acc_status, dpaym, dpaym_incl, netto_w_dpaym, vat_w_dpaym, acc_group, currency, car, ?column?, userid, bank_account, ?column?, ?column?, ?column?, ?column?, ?column?, ?column?, diff_tax, prcar, case, inv_no, ?column?
               ->  Append  (cost=0.00..321067.25 rows=80558 width=188) (actual time=0.149..4540.736 rows=84693 loops=1)
                     ->  Subquery Scan *SELECT* 1  (cost=0.00..148200.17 rows=23571 width=188) (actual time=0.148..735.239 rows=24508 loops=1)
                           ->  Nested Loop  (cost=0.00..147964.46 rows=23571 width=188) (actual time=0.120..408.176 rows=24508 loops=1)
                                 ->  Nested Loop  (cost=0.00..3.19 rows=1 width=15) (actual time=0.033..0.062 rows=1 loops=1)
                                       ->  Seq Scan on pssysinvoice p  (cost=0.00..2.17 rows=1 width=7) (actual time=0.022..0.034 rows=1 loops=1)
                                             Filter: ((code)::text = 'GENI'::text)
                                       ->  Seq Scan on psgenjointemp t  (cost=0.00..1.01 rows=1 width=8) (actual time=0.001..0.016 rows=1 loops=1)
                                 ->  Seq Scan on mminigeninvoice h  (cost=0.00..1637.49 rows=23571 width=173) (actual time=0.048..81.226 rows=24508 loops=1)
                                       Filter: ((status = 2) OR (status = 3) OR (status = 4) OR (status = 5) OR (status = 6))
                                       SubPlan
                                         ->  Index Scan using isercar1 on mmsercar sc  (cost=0.00..3.13 rows=1 width=21) (actual time=0.005..0.006 rows=1 loops=3405)
                                               Index Cond: (szam = $7)
                                         ->  Index Scan using icarprorder1 on mmcarprorder po
                                               (cost=0.00..3.06 rows=1 width=20) (actual time=0.024..0.025 rows
Re: [HACKERS] SERIAL type feature request
Hi, Zoltan Boszormenyi wrote: Jan Wieck wrote: On 12/5/2005 1:03 PM, Zoltan Boszormenyi wrote: Jan Wieck wrote: On 12/4/2005 5:10 PM, Zoltan Boszormenyi wrote: I found this in the SQL2003 draft: 4.14.7 Identity columns ... An identity column has a start value, an increment, a maximum value, a minimum value, and a cycle option. ... The exact properties of a sequence. It would be a good idea to be able to provide all these the same way PostgreSQL provides CREATE SEQUENCE. I think nobody would object to implementing support for the SQL2003 syntax. Most of that would be providing all the values that will get forwarded into the internal sequence generation during CREATE TABLE. The other thing needed is an extension to the default value mechanism that overrides any given value to implement GENERATED ALWAYS. Not too hard either. Where can I find this syntax? (PDF file name, page#) Thanks. I think I'll modify my feature request to the standard behaviour. It's all in the Foundation paper inside this zip: http://www.wiscorp.com/sql/sql_2003_standard.zip Thanks, I found it. It's GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY, isn't it? If I interpret it correctly, GENERATED ALWAYS AS IDENTITY means that no matter what I give in INSERT INTO MYTABLE (serial_id, ...) VALUES (N, ...), the sequence's next value will be inserted into the database. I am all for it; it's much stronger than just watching for the 0 value and would fit my needs. The other behaviour is GENERATED BY DEFAULT AS IDENTITY, which is what PostgreSQL currently provides.
Best regards, Zoltán Böszörményi To reiterate it, I would like the following added to the PostgreSQL 8.2 TODO (I may have got the optional parameters wrong...): - Extend the SERIAL type declaration and functionality with the SQL2003-compliant sequence generation options: SERIAL [ GENERATED { ALWAYS | BY DEFAULT } [ AS IDENTITY ( [ START WITH startvalue ] [ INCREMENT BY incrementvalue ] [ MAXVALUE maxvalue ] [ MINVALUE minvalue ] [ CYCLE | NO CYCLE ] ) ] ]
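For reference, the SQL2003 identity-column form being proposed looks like this (a sketch with illustrative table and column names; PostgreSQL eventually adopted this syntax, in version 10):

```sql
CREATE TABLE mytable (
    serial_id integer GENERATED ALWAYS AS IDENTITY
        (START WITH 100 INCREMENT BY 1 MINVALUE 100 MAXVALUE 999999 NO CYCLE),
    payload   text
);

-- With GENERATED ALWAYS, an explicitly supplied value is rejected:
INSERT INTO mytable (serial_id, payload) VALUES (5, 'x');   -- error

-- unless the user overrides it on purpose:
INSERT INTO mytable (serial_id, payload)
    OVERRIDING SYSTEM VALUE VALUES (5, 'x');
```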
Re: [HACKERS] SERIAL type feature request
Jan Wieck wrote: On 12/4/2005 5:10 PM, Zoltan Boszormenyi wrote: I found this in the SQL2003 draft: 4.14.7 Identity columns ... An identity column has a start value, an increment, a maximum value, a minimum value, and a cycle option. ... The exact properties of a sequence. It would be a good idea to be able to provide all these the same way PostgreSQL provides CREATE SEQUENCE. I think nobody would object to implementing support for the SQL2003 syntax. Most of that would be providing all the values that will get forwarded into the internal sequence generation during CREATE TABLE. The other thing needed is an extension to the default value mechanism that overrides any given value to implement GENERATED ALWAYS. Not too hard either. Where can I find this syntax? (PDF file name, page#) Thanks. I think I'll modify my feature request to the standard behaviour. Thanks and best regards, Zoltán Böszörményi
Re: [HACKERS] SERIAL type feature request
Jan Wieck wrote: On 12/5/2005 1:03 PM, Zoltan Boszormenyi wrote: Jan Wieck wrote: On 12/4/2005 5:10 PM, Zoltan Boszormenyi wrote: I found this in the SQL2003 draft: 4.14.7 Identity columns ... An identity column has a start value, an increment, a maximum value, a minimum value, and a cycle option. ... The exact properties of a sequence. It would be a good idea to be able to provide all these the same way PostgreSQL provides CREATE SEQUENCE. I think nobody would object to implementing support for the SQL2003 syntax. Most of that would be providing all the values that will get forwarded into the internal sequence generation during CREATE TABLE. The other thing needed is an extension to the default value mechanism that overrides any given value to implement GENERATED ALWAYS. Not too hard either. Where can I find this syntax? (PDF file name, page#) Thanks. I think I'll modify my feature request to the standard behaviour. It's all in the Foundation paper inside this zip: http://www.wiscorp.com/sql/sql_2003_standard.zip Thanks, I found it. It's GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY, isn't it? If I interpret it correctly, GENERATED ALWAYS AS IDENTITY means that no matter what I give in INSERT INTO MYTABLE (serial_id, ...) VALUES (N, ...), the sequence's next value will be inserted into the database. I am all for it; it's much stronger than just watching for the 0 value and would fit my needs. The other behaviour is GENERATED BY DEFAULT AS IDENTITY, which is what PostgreSQL currently provides. Best regards, Zoltán Böszörményi
Re: [HACKERS] SERIAL type feature request
Andrew Dunstan wrote: Zoltan Boszormenyi wrote: Hi! I would like to add an entry to PostgreSQL 8.2 TODO: - Extend SERIAL to a full-featured auto-incrementer type. I can't see this item on the TODO list at all. Where exactly did you find it? That's why I wanted it ADDed... ;-) Best regards
Re: [HACKERS] SERIAL type feature request
Peter Eisentraut wrote: Josh Berkus wrote: I believe that our SERIAL/SEQUENCE stuff is already in compliance with the SQL standard for sequences (in SQL03). Why would we change it? Because your belief is wrong, but Zoltan's proposal is not getting us closer. OK, what does the standard say on SERIAL for specifying the start value? And about this: the last serial value was e.g. 307 insert into mytable (serial_id, ...) values (500, ...); delete from mytable where serial_id = 500; In Informix, this is a way to start the next serial value at 501. Best regards, Zoltán
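In current PostgreSQL the Informix trick described above has to be reproduced by hand, since an explicit value never advances the sequence. A sketch (table and column names are illustrative):

```sql
-- Insert with an explicit value; the sequence is NOT advanced by this:
INSERT INTO mytable (serial_id) VALUES (500);

-- Bump the sequence manually so the next default becomes 501:
SELECT setval(pg_get_serial_sequence('mytable', 'serial_id'), 500);

-- The row can then be deleted; nextval() still yields 501:
DELETE FROM mytable WHERE serial_id = 500;
```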
Re: [HACKERS] SERIAL type feature request
OK, I admit I haven't read the SQL standards on this matter. Tino Wildenhain wrote: On Saturday, 03.12.2005 at 22:23 +0100, Zoltan Boszormenyi wrote: Hi! I would like to add an entry to PostgreSQL 8.2 TODO: - Extend SERIAL to a full-featured auto-incrementer type. To achieve this, the following three requirements should be fulfilled: 1. The statement parser should be able to handle this: create table x ( id serial(N), ... ); and behind the scenes this would translate into the create sequence ... start N before creating the table. Why isn't N max_id? Or increment? Sounds inconsistent. -1 on this. A SERIAL type has the assumption that its value starts at a low value (1) and is increasing. Or is there a type modifier keyword that makes it work backwards? A start value would also work here, decreasing from there. 2. Upon INSERTing to a serial column, an explicitly given 0 value, the 'default' keyword, or an omitted field (implicit default) should be interchangeable. default and omit are these. 0 would be an error. -1 on this too. Why? A sequence in PostgreSQL won't give you 0 even in wraparound mode. I just checked it:

db=# create sequence proba_seq maxvalue 5 cycle;
CREATE SEQUENCE
db=# select nextval('proba_seq');
 nextval
---------
       1
(1 row)

db=# select nextval('proba_seq');
 nextval
---------
       2
(1 row)

db=# select nextval('proba_seq');
 nextval
---------
       3
(1 row)

db=# select nextval('proba_seq');
 nextval
---------
       4
(1 row)

db=# select nextval('proba_seq');
 nextval
---------
       5
(1 row)

db=# select nextval('proba_seq');
 nextval
---------
       1
(1 row)

3. When a serial field value is given in an INSERT or UPDATE statement and the value is larger than the current value of the sequence, then the sequence should be modified accordingly. Side effects, race conditions. -1 on this. This event doesn't (shouldn't) occur often; e.g. you have an invoice table where the invoice No. contains the year, too. It's somewhat natural to handle it with the serial field, so it gives out 20051 ... values.
At the beginning of the next year, you modify the sequence to start at 20061. What I mean is that there may be two paths in the serial field handling: one where 'default' is used, which hopefully isn't racy, as this is the way it works now. The other is when the value is explicitly given; a little critical section may not hurt:

Lock sequence
Check the current value of the sequence
If the given value is higher
Then modify the sequence
Unlock sequence

This is the way Informix handles its serial type, although it doesn't seem to have a visible sequence bound to the serial column. Sounds like this Informix is seriously broken ;) Yes, and slow, too. :-( That's why I would like to port the company's software to PostgreSQL, but there are way too many places where Informixisms were used. Is it feasible in the 8.2 timeframe? I hope not ;)
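The critical section above could be approximated in current PostgreSQL with a trigger (a sketch; table, column, and sequence names are hypothetical, and the check-then-set below is still racy under concurrent inserts without locking the sequence, which is exactly the objection being raised):

```sql
CREATE OR REPLACE FUNCTION bump_serial() RETURNS trigger AS $$
BEGIN
    -- If the explicitly given value is higher than the sequence's
    -- current value, advance the sequence so later defaults don't collide.
    IF NEW.serial_id > (SELECT last_value FROM mytable_serial_id_seq) THEN
        PERFORM setval('mytable_serial_id_seq', NEW.serial_id);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_bump_serial
    BEFORE INSERT OR UPDATE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE bump_serial();
```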
Re: [HACKERS] SERIAL type feature request
Jan Wieck wrote: On 12/3/2005 4:23 PM, Zoltan Boszormenyi wrote: Hi! I would like to add an entry to PostgreSQL 8.2 TODO: - Extend SERIAL to a full-featured auto-incrementer type. To achieve this, the following three requirements should be fulfilled: 1. The statement parser should be able to handle this: create table x ( id serial(N), ... ); and behind the scenes this would translate into the create sequence ... start N before creating the table. Syntactic sugar with zero real value. A setval() after create table does exactly the same. Unless you extend your proposal to unambiguously specify any or all of the serial's properties (min, max, start, cache, cycle), this has to be rejected as incomplete. I found this in the SQL2003 draft: 4.14.7 Identity columns ... An identity column has a start value, an increment, a maximum value, a minimum value, and a cycle option. ... The exact properties of a sequence. It would be a good idea to be able to provide all these the same way PostgreSQL provides CREATE SEQUENCE. 2. Upon INSERTing to a serial column, an explicitly given 0 value, the 'default' keyword, or an omitted field (implicit default) should be interchangeable. Why exactly would we treat an explicit zero differently from any other explicit value? What you are asking for is to substitute an explicit, literal value presented by the user with something different. Sorry, but if Informix does THAT, then Informix is no better than MySQL. Thinking about it more, 0 is a special value that a sequence created with defaults (just like the ones created for SERIAL fields) will not produce. If PostgreSQL provides a way to specify the sequence parameters for a SERIAL, there may be other values, too, that a sequence created with the given parameters will not produce. At the extreme, they may be handled the same way. E.g. CREATE SEQUENCE seq1 INCREMENT 2 MINVALUE 2 MAXVALUE 100; won't produce 0, 1, any odd number between 3 and 99, or the numbers from 101 up to 2^63-1. 3.
When a serial field value is given in an INSERT or UPDATE statement and the value is larger than the current value of the sequence, then the sequence should be modified accordingly. How about negative increment values, cycling sequences, and max/minval? For descending sequences, a lower value should update the sequence. This is the way Informix handles its serial type, although it doesn't seem to have a visible sequence bound to the serial column. Have you considered asking Informix to do the reverse changes? Hm. Good idea. I'll try. But I guess they won't backport it to 9.21. :-) Best regards, Zoltán Böszörményi
[HACKERS] SERIAL type feature request
Hi! I would like to add an entry to PostgreSQL 8.2 TODO: - Extend SERIAL to a full-featured auto-incrementer type. To achieve this, the following three requirements should be fulfilled: 1. The statement parser should be able to handle this: create table x ( id serial(N), ... ); and behind the scenes this would translate into the create sequence ... start N before creating the table. 2. Upon INSERTing to a serial column, an explicitly given 0 value, the 'default' keyword, or an omitted field (implicit default) should be interchangeable. 3. When a serial field value is given in an INSERT or UPDATE statement and the value is larger than the current value of the sequence, then the sequence should be modified accordingly. This is the way Informix handles its serial type, although it doesn't seem to have a visible sequence bound to the serial column. Is it feasible in the 8.2 timeframe? Thanks in advance, Zoltán Böszörményi
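Point 1 above amounts to the following expansion (a sketch with illustrative names; this is what SERIAL already generates behind the scenes, with an explicit START value added):

```sql
-- create table x ( id serial(1000), ... );  would be shorthand for:
CREATE SEQUENCE x_id_seq START 1000;
CREATE TABLE x (
    id      integer NOT NULL DEFAULT nextval('x_id_seq'),
    payload text
);
-- SERIAL additionally records a dependency so the sequence's
-- lifetime is tied to the column.
```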