Re: [HACKERS] Fix dumping pre-10 DBs by pg_dump10 if table "name" exists
On 10/31/17, Tom Lane wrote:
> Yeah, there are quite a few unqualified casts in pg_dump, but AFAICS
> all the rest are OK because the search_path is just pg_catalog.
>
> But I did find psql's describe.c making a similar mistake :-(.
> Pushed that along with your fix.
>
> regards, tom lane

Oops. I missed it in "describe.c" because I grepped for the exact "::name" string.

Thank you very much!

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix dumping pre-10 DBs by pg_dump10 if table "name" exists
On 10/31/17, Tom Lane wrote:
> Vitaly Burovoy writes:
>> Recently my colleagues found a bug.
>
>> - "SELECT 'bigint'::name AS sequence_type, "
>> + "SELECT 'bigint'::pg_catalog.name AS sequence_type,
>
> Good catch, but I think we could simplify this by just omitting the cast
> altogether:
>
> - "SELECT 'bigint'::name AS sequence_type, "
> + "SELECT 'bigint' AS sequence_type,
>
> pg_dump doesn't particularly care whether the column comes back marked
> as 'name' or 'text' or 'unknown'.
>
> regards, tom lane

OK, just for convenience I'm attaching your version of the fix.

I left another "NULL::name AS rolname" at src/bin/pg_dump/pg_dump.c:2978 because I can't check it (remoteVersion < 9) and it runs under a strict "selectSourceSchema(fout, "pg_catalog");" schema setting.

--
Best regards,
Vitaly Burovoy

0001-Fix-dumping-schema-if-a-table-named-name-exists.ver2.patch
Description: Binary data
[HACKERS] Fix dumping pre-10 DBs by pg_dump10 if table "name" exists
Hello, hackers!

Recently my colleagues found a bug. They could not migrate from PG9.5 to PG10 due to an error during pg_upgrade (the same as in the "reproduce" part below).

An investigation showed that there is a table "name" in the same schema where the dumped sequence is located, and PostgreSQL tries to unpack the literal "bigint" as a composite type and fails.

The bug was introduced by two commits: one of them changes the search_path, the second one introduces "'bigint'::name" without specifying the schema:

da4d1c0c15ab9afdfeee8bad9a1a9989b6bd59b5 pg_dump: Fix some schema issues when dumping sequences
2ea5b06c7a7056dca0af1610aadebe608fbcca08 Add CREATE SEQUENCE AS clause

Steps to reproduce:

$ psql -h pre-10 postgres
Password for user postgres:
psql (11devel, server 9.5.8)
Type "help" for help.

postgres=# create table name(id serial);
CREATE TABLE
postgres=# \q

$ pg_dump -h pre-10 postgres
pg_dump: [archiver (db)] query failed: ERROR:  malformed record literal: "bigint"
LINE 1: SELECT 'bigint'::name AS sequence_type, start_value, increme...
               ^
DETAIL:  Missing left parenthesis.
pg_dump: [archiver (db)] query was: SELECT 'bigint'::name AS sequence_type, start_value, increment_by, max_value, min_value, cache_value, is_cycled FROM name_id_seq

I've implemented a little fix (attached); I don't think there is anything to be written to docs and tests.

--
Best regards,
Vitaly Burovoy

0001-Fix-dumping-schema-if-a-table-named-name-exists.patch
Description: Binary data
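[Editor's note] The failure above can be seen directly in psql, without pg_dump. The sketch below assumes a table in "public"; the point is that the table's row type shadows pg_catalog.name once its schema is listed before pg_catalog on search_path, which is what the dump code effectively did:

```sql
-- Creating a table also creates a composite type with the same name, so
-- "name" now shadows pg_catalog.name when public precedes pg_catalog.
CREATE TABLE name(id serial);

SET search_path = public, pg_catalog;

SELECT 'bigint'::name;             -- ERROR: malformed record literal: "bigint"
SELECT 'bigint'::pg_catalog.name;  -- OK: schema-qualified (the original fix)
SELECT 'bigint' AS sequence_type;  -- OK: the cast is unnecessary (Tom's version)
```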
Re: [HACKERS] identity columns
On 4/23/17, Robert Haas wrote:
> On Thu, Apr 20, 2017 at 12:05 AM, Vitaly Burovoy wrote:
>> OK. Let's go through it again.
>> IDENTITY is a property of a column. There are no syntax to change any
>> property of any DB object via the "ADD" syntax.
>> Yes, a structure (a sequence) is created. But in fact it cannot be
>> independent from the column at all (I remind you that according to the
>> standard it should be unnamed sequence and there are really no way to
>> do something with it but via the column's DDL).
>
> I agree that ADD is a little odd here, but it doesn't seem terrible.

Yes, it is not terrible, but why do we need to support an odd syntax if we can use a correct one? If we leave it as it was committed, we will have to support it for years.

> But why do we need it? Instead of:
>
> ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
> SET GENERATED { ALWAYS | BY DEFAULT }
> DROP IDENTITY [ IF EXISTS ]
>
> Why not just:
>
> SET GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY
> DROP IDENTITY [ IF EXISTS ]
>
> Surely the ALTER TABLE command can tell whether the column is already
> GENERATED, so the first form could make it generated if it's not and
> adjust the ALWAYS/BY DEFAULT property if it is.

I thought exactly that way, but Peter gave an explanation[1]. I had to search for a different way because no one else joined the discussion at that time.

One of Peter's reasons was to make "SET GENERATED" follow the standard (i.e. raise an error). I asked whether "IF NOT EXISTS" works for him instead of "ADD GENERATED". The answer[2] was "It could be done", but "it is very difficult to implement".

So I wonder why there is no wish to commit the adjustment patch.
>> It is even hard to detect which sequence (since they have names) is
>> owned by the column:
>>
>> postgres=# CREATE TABLE xxx(i int generated always as identity, j serial);
>> CREATE TABLE
>> postgres=# \d xxx*
>>                             Table "public.xxx"
>>  Column |  Type   | Collation | Nullable |            Default
>> --------+---------+-----------+----------+--------------------------------
>>  i      | integer |           | not null | generated always as identity
>>  j      | integer |           | not null | nextval('xxx_j_seq'::regclass)
>>
>>       Sequence "public.xxx_i_seq"
>>    Column   |  Type   | Value
>> ------------+---------+-------
>>  last_value | bigint  | 1
>>  log_cnt    | bigint  | 0
>>  is_called  | boolean | f
>>
>>       Sequence "public.xxx_j_seq"
>>    Column   |  Type   | Value
>> ------------+---------+-------
>>  last_value | bigint  | 1
>>  log_cnt    | bigint  | 0
>>  is_called  | boolean | f
>> Owned by: public.xxx.j
>>
>> I can only guess that "public.xxx_i_seq" is owned by "public.xxx.i",
>> nothing proves that. Whereas for a regular sequence there are two
>> pieces of evidence ("Default" and "Owned by").
>
> This seems like a psql deficiency that should be fixed.

It was part of the explanation of why IDENTITY is a property and therefore "SET" should be used instead of "ADD".

>> Also the created sequence cannot be deleted (only with the column) or
>> left after the column is deleted.
>
> This does not seem like a problem. In fact I'd say that's exactly the
> desirable behavior.

Again, it is not about a problem, it is part of the explanation.

>> The "[ NOT ] EXISTS" is a common Postgres syntax extension for
>> creating/updating objects in many places. That's why I think it should
>> be used instead of introducing the new "ADD" syntax which contradicts
>> the users' current experience.
>
> As noted above, I don't understand why we need either ADD or IF NOT
> EXISTS. Why can't SET just, eh, set the property to the desired
> state?
+1

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

[1] https://postgr.es/m/497c40af-bd7a-5cb3-d028-d91434639...@2ndquadrant.com
[2] https://postgr.es/m/59d8e32a-14de-6f45-c2b0-bb52e4a75...@2ndquadrant.com

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] identity columns
On 4/18/17, Peter Eisentraut wrote:
> On 4/7/17 01:26, Vitaly Burovoy wrote:
>> I've implemented SET GENERATED ... IF NOT EXISTS. It must be placed
>> before other SET options but fortunately it conforms to the standard.
>> Since that form always changes the sequence behind the column, I
>> decided to explicitly write "[NO] CACHE" in pg_dump.
>>
>> As a plus, it is now possible to rename the sequence behind the column
>> by specifying SEQUENCE NAME in SET GENERATED.
>>
>> I hope it is still possible to get rid of the "ADD GENERATED" syntax.
>
> I am still not fond of this change. There is precedent all over the
> place for having separate commands for creating a structure, changing a
> structure, and removing a structure. I don't understand what the
> problem with that is.
>
> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

OK. Let's go through it again.

IDENTITY is a property of a column. There is no syntax to change any property of any DB object via "ADD". Yes, a structure (a sequence) is created, but in fact it cannot be independent from the column at all (I remind you that according to the standard it should be an unnamed sequence, and there is really no way to do anything with it but via the column's DDL).
It is even hard to detect which sequence (since they have names) is owned by the column:

postgres=# CREATE TABLE xxx(i int generated always as identity, j serial);
CREATE TABLE
postgres=# \d xxx*
                            Table "public.xxx"
 Column |  Type   | Collation | Nullable |            Default
--------+---------+-----------+----------+--------------------------------
 i      | integer |           | not null | generated always as identity
 j      | integer |           | not null | nextval('xxx_j_seq'::regclass)

      Sequence "public.xxx_i_seq"
   Column   |  Type   | Value
------------+---------+-------
 last_value | bigint  | 1
 log_cnt    | bigint  | 0
 is_called  | boolean | f

      Sequence "public.xxx_j_seq"
   Column   |  Type   | Value
------------+---------+-------
 last_value | bigint  | 1
 log_cnt    | bigint  | 0
 is_called  | boolean | f
Owned by: public.xxx.j

I can only guess that "public.xxx_i_seq" is owned by "public.xxx.i", nothing proves that. Whereas for a regular sequence there are two pieces of evidence ("Default" and "Owned by").

Also the created sequence cannot be deleted (only with the column) or left after the column is deleted.

Everywhere else the "ADD" syntax is used where you can add more than one object to the altered one:

ALTER TABLE ... ADD COLUMN     /* there can be many added columns differing by name in the table */
ALTER TABLE ... ADD CONSTRAINT /* there can be many added constraints differing by name in the table */
ALTER TYPE ... ADD VALUE       /* many values in an enum differing by name in the enum */
ALTER TYPE ... ADD ATTRIBUTE   /* many attributes in a composite differing by name in the composite */
etc.

But what is "ALTER TABLE ... ALTER COLUMN ... ADD GENERATED" for? Is a property's name used to distinguish between several of them in a column? Is it possible to have more than one such property on the altered column?

The "SET GENERATED" (without "IF NOT EXISTS") syntax conforms to the standard for those who want it. The "SET GENERATED ... IF NOT EXISTS" syntax allows users to put the column into a required state (IDENTITY with set options) without paying attention to whether it is already set as IDENTITY or not.
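[Editor's note] The proposal can be condensed into a sketch. The "IF NOT EXISTS" form is the syntax proposed in this thread, not committed PostgreSQL grammar:

```sql
-- Proposed (hypothetical) form: works whether or not "id" is already an
-- identity column; IF NOT EXISTS must precede the other SET options.
ALTER TABLE t ALTER COLUMN id
    SET GENERATED BY DEFAULT IF NOT EXISTS SET START 2;

-- Standard-conforming form (no IF NOT EXISTS): raises an error if "id"
-- is not already an identity column.
ALTER TABLE t ALTER COLUMN id SET GENERATED BY DEFAULT;
```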
The "[ NOT ] EXISTS" clause is a common Postgres syntax extension for creating/updating objects in many places. That's why I think it should be used instead of introducing the new "ADD" syntax, which contradicts users' current experience.

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] sequence data type
On 4/4/17, Peter Eisentraut wrote:
> On 3/30/17 22:47, Vitaly Burovoy wrote:
>> It seemed not very hard to fix it.
>> Please find attached patch to be applied on top of your one.
>>
>> I've added more tests to cover different cases of changing bounds when
>> data type is changed.
>
> Committed all that. Thanks!

Thank you very much!

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] identity columns
On 4/6/17, Vitaly Burovoy wrote:
> On 4/6/17, Peter Eisentraut wrote:
>> As I tried to mention earlier, it is very difficult to implement the IF
>> NOT EXISTS behavior here, because we need to run the commands that create
>> the sequence before we know whether we will need it.
>
> In fact with the function "get_attidentity" it is not so hard. Please
> find the patch attached.
...
> I hope it is still possible to get rid of the "ADD GENERATED" syntax.

Hello hackers,

Is it possible to commit the patch from the previous letter?

It was sent before the feature freeze, introduces no new feature (only the syntax change discussed in this thread and not implemented due to the author's lack of time) and can be considered a fix to the committed patch (similar to the second round in "sequence data type").

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] identity columns
On 4/6/17, Peter Eisentraut wrote:
> On 4/4/17 22:53, Vitaly Burovoy wrote:
>> The next nitpickings to the last patch. I try to get places with
>> lacking of variables' initialization.
>> All other things seem good for me now. I'll continue to review the
>> patch while you're fixing the current notes.
>
> Committed with your changes (see below) as well as executor fix.

Thank you very much!

> As I tried to mention earlier, it is very difficult to implement the IF
> NOT EXISTS behavior here, because we need to run the commands that create
> the sequence before we know whether we will need it.

In fact, with the function "get_attidentity" it is not so hard. Please find the patch attached.

I've implemented SET GENERATED ... IF NOT EXISTS. It must be placed before other SET options, but fortunately that conforms to the standard. Since that form always changes the sequence behind the column, I decided to explicitly write "[NO] CACHE" in pg_dump.

As a plus, it is now possible to rename the sequence behind the column by specifying SEQUENCE NAME in SET GENERATED.

I hope it is still possible to get rid of the "ADD GENERATED" syntax.

--
Best regards,
Vitaly Burovoy

v7-0001-Implement-SET-IDENTITY-.-IF-NOT-EXISTS.patch
Description: Binary data
Re: [HACKERS] identity columns
On 4/4/17, Vitaly Burovoy wrote:
> On 4/3/17, Peter Eisentraut wrote:
>> On 3/30/17 22:57, Vitaly Burovoy wrote:
>>> Why do you still want to leave "ADD IF NOT EXISTS" instead of using
>>> "SET IF NOT EXISTS"?
>>> If someone wants to follow the standard he can simply not use "IF
>>> NOT EXISTS" at all. Without it an error should be raised.
>>
>> As I tried to mention earlier, it is very difficult to implement the IF
>> NOT EXISTS behavior here, because we need to run the commands that create
>> the sequence before we know whether we will need it.
>
> I understand. You did not mention the reason.
> But now you have the "get_attidentity" function to check whether the
> column is an identity or not.
> If it is not, create a sequence, otherwise skip the creation.
>
> I'm afraid after the feature freeze it is impossible to do "major
> reworking of things", but after the release we have to support the
> "ALTER COLUMN col ADD" grammar.

So it would be great if you had time to get rid of the "ALTER ... ADD GENERATED" syntax in favor of "ALTER ... SET GENERATED ... IF NOT EXISTS".

> 3.
> It would be better if tab-completion had the ability to complete the
> "RESTART" keyword like:
> ALTER TABLE x1 ALTER COLUMN i RESTART
> tab-complete.c:8898
> - COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "ADD", "DROP");
> + COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "RESTART", "ADD", "DROP");

Oops. Of course it should be:

+ COMPLETE_WITH_LIST6("TYPE", "SET", "RESET", "RESTART", "ADD", "DROP");

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] identity columns
On 4/3/17, Peter Eisentraut wrote:
> On 3/30/17 22:57, Vitaly Burovoy wrote:
>> Why do you still want to leave "ADD IF NOT EXISTS" instead of using
>> "SET IF NOT EXISTS"?
>> If someone wants to follow the standard he can simply not use "IF
>> NOT EXISTS" at all. Without it an error should be raised.
>
> As I tried to mention earlier, it is very difficult to implement the IF
> NOT EXISTS behavior here, because we need to run the commands that create
> the sequence before we know whether we will need it.

I understand. You did not mention the reason. But now you have the "get_attidentity" function to check whether the column is an identity or not. If it is not, create a sequence, otherwise skip the creation.

I'm afraid that after the feature freeze it is impossible to do "major reworking of things", but after the release we will have to support the "ALTER COLUMN col ADD" grammar.

Here are the next nitpickings on the last patch. I tried to catch places lacking variables' initialization. All other things seem good to me now. I'll continue to review the patch while you're fixing the current notes.

Unfortunately I don't know PG internals deeply enough to understand the correctness of the changes in the executor (what Tom and Andres are talking about).

0. There is a little conflict applying to the current master because of 60a0b2e.

1. First of all, I think the previous usage of the constant "ATTRIBUTE_IDENTITY_NONE" (for setting a value) is better than just '\0'. It is OK to leave the constant out of comparisons.

2. "SET IDENTITY" and "RESTART" are missing from the "Compatibility" chapter in "alter_table.sgml": "The forms ADD (without USING INDEX), DROP, SET DEFAULT, and SET DATA TYPE (without USING) conform with the SQL standard." Also, ADD IDENTITY is an extension (I hope a temporary one), but from the above sentence it looks like a standard feature.

3. It would be better if tab-completion had the ability to complete the "RESTART" keyword like:
ALTER TABLE x1 ALTER COLUMN i RESTART
tab-complete.c:8898
- COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "ADD", "DROP");
+ COMPLETE_WITH_LIST5("TYPE", "SET", "RESET", "RESTART", "ADD", "DROP");

4. The block in "gram.y":
/* ALTER TABLE ALTER [COLUMN] DROP IDENTITY */
does not contain "n->missing_ok = false;". I think it should.

5. In the file "pg_dump.c", in the "getTableAttrs" function at row 7959, "tbinfo->needs_override" is used (in the "||" operator) but it is nowhere initially set to "false".

6. And finally... a segfault when pre-9.6 databases are pg_dump-ed: the SQL query in "pg_dump.c" in the "getTables" function has the "is_identity_sequence" column only in the "if (fout->remoteVersion >= 90600)" block (row 5536), whereas 9.6 does not support it (but OK, for 9.6 it is always FALSE, no sense in creating a new "if" block); in the other blocks no ", FALSE as is_identity_sequence" is added. The same mistake is in the "getTableAttrs" function. Moreover, is "ORDER BY a.attrelid, a.attnum" really necessary there ("else if (fout->remoteVersion >= 90200)" has only "ORDER BY a.attnum")?

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] identity columns
On 3/29/17, Peter Eisentraut wrote:
> On 3/24/17 05:29, Vitaly Burovoy wrote:
>> It would be great if "DROP IDENTITY IF EXISTS" is in the current patch
>> since we don't have any disagreements about "DROP IDENTITY" behavior
>> and easiness of implementation of "IF EXISTS" there.
>
> Here is an updated patch that adds DROP IDENTITY IF EXISTS.
>
> Unfortunately, implementing ADD IF NOT EXISTS is much harder, so I
> haven't done that here.
>
> Additionally, this patch fixes a few minor issues that you had pointed
> out, and merges with the new expression evaluation system in the executor.
>
> I have also CC'ed you on a separate patch to improve the behavior when
> changing a sequence's data type.

Thank you a lot. I'll have a deep look by Sunday evening.

Why do you still want to leave "ADD IF NOT EXISTS" instead of using "SET IF NOT EXISTS"? If someone wants to follow the standard he can simply not use "IF NOT EXISTS" at all. Without it an error should be raised. But then we would not need to introduce a new grammar for a column property which is single and can't be added more than once.

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] sequence data type
On 3/30/17, Vitaly Burovoy wrote:
> On 3/29/17, Vitaly Burovoy wrote:
>> On 3/29/17, Michael Paquier wrote:
>>> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy wrote:
>>>> I think min_value and max_value should not be set to "1" or "-1" but
>>>> to real min/max of the type by default.
>>>
>>> This is the default behavior for ages, since e8647c45 to be exact. So
>>> you would change 20 years of history?
>>
>> ... is it a wrong way to keep historical minimum as "1" by
>> default: it is not a minimum of any of supported type.
>
> I've read the standard about "minvalue", "maxvalue" and "start".
> OK, I was wrong. Since "start" should be equal to "minvalue" unless
> defined explicitly, the only bug left from my first email here is
> resetting "minvalue" back to 1 when the data type changes and the value
> matches the bound of the old type (the last case there).
>
> P.S.: the same thing with "maxvalue" when "increment" is negative.

It seemed not very hard to fix. Please find the attached patch, to be applied on top of yours.

I've added more tests to cover different cases of changing bounds when the data type is changed.

--
Best regards,
Vitaly Burovoy

0002-Fix-overriding-max-min-value-of-a-sequence-to-1-on-A.patch
Description: Binary data
Re: [HACKERS] sequence data type
On 3/30/17, Vitaly Burovoy wrote:
> On 3/29/17, Vitaly Burovoy wrote:
>> On 3/29/17, Michael Paquier wrote:
>>> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy wrote:
>>>> I think min_value and max_value should not be set to "1" or "-1" but
>>>> to real min/max of the type by default.
>>>
>>> This is the default behavior for ages, since e8647c45 to be exact. So
>>> you would change 20 years of history?
>>
>> ... is it a wrong way to keep historical minimum as "1" by
>> default: it is not a minimum of any of supported type.
>
> I've read the standard about "minvalue", "maxvalue" and "start".
> OK, I was wrong. Since "start" should be equal to "minvalue" unless
> defined explicitly, the only bug left from my first email here is
> resetting "minvalue" back to 1 when the data type changes and the value
> matches the bound of the old type (the last case there).
>
> P.S.: the same thing with "maxvalue" when "increment" is negative.

It seemed not very hard to fix. Please find the attached patch.

I've added more tests to cover different cases of changing bounds when the data type is changed.

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] sequence data type
On 3/29/17, Vitaly Burovoy wrote:
> On 3/29/17, Michael Paquier wrote:
>> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy wrote:
>>> I think min_value and max_value should not be set to "1" or "-1" but
>>> to real min/max of the type by default.
>>
>> This is the default behavior for ages, since e8647c45 to be exact. So
>> you would change 20 years of history?
>
> ... is it a wrong way to keep historical minimum as "1" by
> default: it is not a minimum of any of supported type.

I've read the standard about "minvalue", "maxvalue" and "start". OK, I was wrong. Since "start" should be equal to "minvalue" unless defined explicitly, the only bug left from my first email here is resetting "minvalue" back to 1 when the data type changes and the value matches the bound of the old type (the last case there).

P.S.: the same thing happens with "maxvalue" when "increment" is negative.

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] sequence data type
On 3/29/17, Michael Paquier wrote:
> On Thu, Mar 30, 2017 at 11:18 AM, Vitaly Burovoy wrote:
>> I think min_value and max_value should not be set to "1" or "-1" but
>> to real min/max of the type by default.
>
> This is the default behavior for ages, since e8647c45 to be exact. So
> you would change 20 years of history?

Unfortunately yes, because the current patch appeared because of another patch with the "IDENTITY COLUMN" feature.

Since the sequences used for such columns are completely hidden (they do not appear in DEFAULTs like "serial" ones do) and a new option (the sequence data type) is supported together with altering it (which was absent previously), new errors appear because of that.

To be honest, I did not check the "countdown" case on the current release, but the most important part is the second one, because it is a direct analogue of what happens (in the parallel thread) on "ALTER TABLE tbl ALTER col TYPE new_type" where "col" is an identity column.

Since the commit 2ea5b06c7a7056dca0af1610aadebe608fbcca08 declares "The data type determines the default minimum and maximum values of the sequence.", it is a wrong way to keep the historical minimum of "1" by default: it is not the minimum of any supported type.

I don't think anything will break if min_value is set lower than before, but it will decrease the astonishment of users in the cases I wrote about in the previous email.

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] sequence data type
On 3/29/17, Peter Eisentraut wrote:
> Over at
> <https://www.postgresql.org/message-id/cakoswnnxmm6ybxnzgnxtzqmpjdgjf+a3wx53wzmrq5wqdyr...@mail.gmail.com>
> it is being discussed that maybe the behavior when altering the sequence
> type isn't so great, because it currently doesn't update the min/max
> values of the sequence at all. So here is a patch to update the min/max
> values when the old min/max values were the min/max values of the data
> type.

It seems almost good to me except for a single thing (I'm sorry, I missed the previous discussion).

Why is min_value set to 1 (or -1 for negative INCREMENTs) by default for all sequence types? With the committed patch it leads to the extra "MINVALUE" option besides the "START" one; and that is not the worst thing. It leads to a strange error for countdown sequences:

postgres=# CREATE SEQUENCE abc AS smallint MINVALUE 0 START 2 INCREMENT -1;
ERROR:  MINVALUE (0) must be less than MAXVALUE (-1)
postgres=# CREATE SEQUENCE abc AS smallint MINVALUE 0 START 2 INCREMENT -1 NO MAXVALUE; -- does not help
ERROR:  MINVALUE (0) must be less than MAXVALUE (-1)

With the proposed patch users can run into the next error:

postgres=# CREATE TABLE tbl(i smallserial);
CREATE TABLE
postgres=# SELECT * FROM pg_sequences;
 schemaname | sequencename | sequenceowner | data_type | start_value | min_value | max_value | increment_by | cycle | cache_size | last_value
------------+--------------+---------------+-----------+-------------+-----------+-----------+--------------+-------+------------+------------
 public     | tbl_i_seq    | burovoy_va    | smallint  |           1 |         1 |     32767 |            1 | f     |          1 |
(1 row)

postgres=# -- min_value for smallint is "1"? OK, I want to use the whole range:
postgres=# ALTER SEQUENCE tbl_i_seq MINVALUE -32768 START -32768 RESTART -32768;
ALTER SEQUENCE
postgres=# -- after a while I realized the range is not enough.
Try to enlarge it:

postgres=# ALTER SEQUENCE tbl_i_seq AS integer;
ERROR:  START value (-32768) cannot be less than MINVALUE (1)

It is not the expected behavior.

I think min_value and max_value should not be set to "1" or "-1" but to the real min/max of the type by default.

I recommend adding an explicit phrase to the docs saying that the START value is not changed even if it matches the bound of the original type.

Also it would be good to have regression tests like:

CREATE SEQUENCE sequence_test10 AS smallint MINVALUE -1000 MAXVALUE 1000;
ALTER SEQUENCE sequence_test10 AS int NO MINVALUE NO MAXVALUE INCREMENT 1;
ALTER SEQUENCE sequence_test10 AS bigint NO MINVALUE NO MAXVALUE INCREMENT -1;
CREATE SEQUENCE sequence_test11 AS smallint MINVALUE -32768 MAXVALUE 32767;
ALTER SEQUENCE sequence_test11 AS int NO MINVALUE NO MAXVALUE INCREMENT 1;
ALTER SEQUENCE sequence_test11 AS int NO MINVALUE NO MAXVALUE INCREMENT -1;

--
Best regards,
Vitaly Burovoy
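[Editor's note] The behavior the patch aims for (updating a bound only when it equals the old type's bound) can be sketched as follows; this is a reading of the patch description above, not tested output:

```sql
-- Sketch, assuming the proposed patch: default bounds track the type.
CREATE SEQUENCE s AS smallint;   -- default max_value is 32767
ALTER SEQUENCE s AS integer;     -- old max_value equalled smallint's max,
                                 -- so it is updated to integer's max
ALTER SEQUENCE s MAXVALUE 1000;  -- an explicitly chosen bound...
ALTER SEQUENCE s AS bigint;      -- ...is kept: max_value stays 1000
```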
Re: [HACKERS] identity columns
On 3/23/17, Peter Eisentraut wrote:
> On 3/23/17 06:09, Vitaly Burovoy wrote:
>> I think we'll end up with "DROP IDENTITY IF EXISTS" to avoid raising
>> an exception and "ADD OR SET" if your grammar remains.
>
> That sounds reasonable to me.

It would be great if "DROP IDENTITY IF EXISTS" were in the current patch, since we don't have any disagreements about "DROP IDENTITY" behavior and the easiness of implementing "IF EXISTS" there.

>> Right. From that PoV IDENTITY also changes a default value: "SET (ADD
>> ... AS?) IDENTITY" works as setting a default to "nextval(...)"
>> whereas "DROP IDENTITY" works as setting it back to NULL.
>
> But dropping and re-adding an identity destroys state, so it's not quite
> the same.

How does dropping and re-adding affect the choice between "ADD" and "SET"? If you drop the identity property, you get the column's state destroyed whatever grammar variation you are using.

I have an idea. What about the next version of the DDL:

DROP IDENTITY [ IF EXISTS ]
SET GENERATED { ALWAYS | BY DEFAULT } [ IF NOT EXISTS ] | SET ...

That version:
1. does not introduce a new "ADD" variation;
2. without "IF NOT EXISTS" follows the standard;
3. with "IF NOT EXISTS" sets a column's identity property or alters it (works like "CREATE OR REPLACE" for functions).

--
Best regards,
Vitaly Burovoy
Re: [HACKERS] identity columns
On 3/22/17, Peter Eisentraut wrote:
> On 3/22/17 03:59, Vitaly Burovoy wrote:
>> Column's IDENTITY behavior is very similar to a DEFAULT one. We write
>> "SET DEFAULT" and don't care whether it was set before or not, because
>> we can't have many of them for a single column. Why should we do that
>> for IDENTITY?
>
> One indication is that the SQL standard requires that DROP IDENTITY only
> succeed if the column is currently an identity column. That is
> different from how DEFAULT works.

I think we'll end up with "DROP IDENTITY IF EXISTS" to avoid raising an exception and "ADD OR SET" if your grammar remains.

> Another difference is that there is no such thing as "no default",
> because in absence of an explicit default, it is NULL. So all you are
> doing with SET DEFAULT or DROP DEFAULT is changing the default. You are
> not actually adding or removing it.

Right. From that PoV IDENTITY also changes a default value: "SET (ADD ... AS?) IDENTITY" works as setting a default to "nextval(...)" whereas "DROP IDENTITY" works as setting it back to NULL.

> Therefore, the final effect of SET DEFAULT is the same no matter whether
> another default was there before or not. For ADD/SET IDENTITY, you get
> different behaviors. For example:
>
> ADD .. AS IDENTITY (START 2)
>
> creates a new sequence that starts at 2 and uses default parameters
> otherwise. But
>
> SET (START 2)
>
> alters the start parameter of an existing sequence. So depending on
> whether you already have an identity sequence, these commands do
> completely different things.

If you use "SET START 2" on a non-identity column, you should get an exception (there is no information about an identity type: neither "by default" nor "always"). The correct example is:

ADD GENERATED BY DEFAULT AS IDENTITY (START 2)

and

SET GENERATED BY DEFAULT SET START 2

Note that creating a sequence is internal machinery hidden from users. Try to see it from the user's PoV: the goal is to have a column with an autoincrement.
If the column is already autoincremented, there is no reason to create a
sequence (it is already present) and no reason to restart with 2 (there
can already be rows with such numbers). "... SET START 2" is for the
next "RESTART" DDL, and if a user insists on starting with 2, it is
still possible:
SET GENERATED BY DEFAULT SET START 2 RESTART 2

I still think that introducing "ADD" for a property which cannot be used
more than once (compare with "ADD CHECK": you can use it multiple times,
even with the same expression) is not a good idea. I think there should
be a consensus in the community on the grammar.

> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] identity columns
On 3/21/17, Peter Eisentraut wrote:
> On 3/21/17 16:11, Vitaly Burovoy wrote:
>> My argument is consistency.
>> Since IDENTITY is a property of a column (similar to DEFAULT, NOT
>> NULL, attributes, STORAGE, etc.), it follows a different rule: it is
>> either set or not set. If it was not set before, the "SET" DDL "adds"
>> it; if the property is already present, the DDL replaces it.
>> There is no "ADD" clause in DDLs like "...ALTER table ALTER column..."
>> (only "SET", "RESET" and "DROP")[2].
>> Your patch introduces the single DDL version with "...ALTER column
>> ADD..." for a property.
>
> But it creates a sequence, so it creates state.

Right. But that is an internal mechanism. The DDL is not about creating
a sequence, it is about changing a property.

> So mistakes could easily be masked. With my patch, if you do ADD twice, you
> get an error.

Agreed. But what is that for? Are those parameters incompatible (so that
they can't be changed later)?

> With your proposal, you'd have to use SET, and you could overwrite
> existing sequence state without realizing it.

I can't overwrite its state (the current value), only its settings like
start, maxval, etc.

In fact, when I write a DDL I want to change a schema. For
non-properties it is natural to write "CREATE" (schema, table) and "ADD"
(column, constraints), because there can be many of them (with different
names) in a single object: many schemas in a DB, many tables in a
schema, many columns in a table, and even many constraints in a table.
So ADD is used for adding named objects to some container (DB, schema,
table).

That is not true for the IDENTITY property. A table can have many
identity columns, but a single column cannot have more than one
identity. A column's IDENTITY behavior is very similar to a DEFAULT one.
We write "SET DEFAULT" and don't care whether it was set before or not,
because we can't have many of them for a single column. Why should we do
that for IDENTITY?
Whether I write "ADD" or "SET", I want to end up with a column that has
some behavior, and I don't care what behavior it had before unless it is
incompatible with what I want (e.g. it has a DEFAULT, but I want
IDENTITY, or vice versa).

>>> It does change the type, but changing the type doesn't change the
>>> limits. That is a property of how ALTER SEQUENCE works, which was
>>> separately discussed.
>>
>> Are you referring to the thread[1]? If so, I'd say the current
>> behavior is not good.
>> I sent an example of a bad experience for users who know nothing
>> about sequences (because they deal with identity columns).
>> Would it be better to change the bounds of the sequence to the bounds
>> of the new type when they match the bounds of the old type?
>
> That's an idea, but that's for a separate patch.

It would be good to have one in Postgres 10. I'm afraid that otherwise
we'll be hit with many bug reports similar to my example.

> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] identity columns
I haven't looked at the patch yet (I'll do that later), just a few notes:

On 3/21/17, Peter Eisentraut wrote:
>>>> 3. Strange error (not about absence of a column; but see pp.5 and 8):
>>>> test=# ALTER TABLE idnt ALTER COLUMN o ADD GENERATED ALWAYS AS
>>>> IDENTITY;
>>>> ERROR: identity column type must be smallint, integer, or bigint
>>>
>>> What's wrong with that?
>>
>> The column mentioned there does not exist. Therefore the error should
>> be about that, not about the type of an absent column:
>
> This was already fixed in the previous version.

I've just checked and I still get an error about the type, not about the
absence of the column:

test=# CREATE TABLE itest3 (a int generated by default as identity, b text);
CREATE TABLE
test=# ALTER TABLE itest3 ALTER COLUMN o ADD GENERATED ALWAYS AS IDENTITY;
ERROR: identity column type must be smallint, integer, or bigint

>> Since we extend the spec by "ADD GENERATED...", "SEQUENCE NAME" and
>> allow more than one identity column, why can't we extend it by
>> allowing "SET GENERATED" for non-identity columns and drop "ADD
>> GENERATED..."?
>
> SQL commands generally don't work that way. Either they create or they
> alter. There are "OR REPLACE" options when you do both.

I'd agree with you if we were talking about database objects like
tables, functions, columns, etc.

> So I think it
> is better to keep these two things separate. Also, while you argue that
> we *could* do it the way you propose, I don't really see an argument why
> it would be better.

My argument is consistency.
Since IDENTITY is a property of a column (similar to DEFAULT, NOT NULL,
attributes, STORAGE, etc.), it follows a different rule: it is either
set or not set. If it was not set before, the "SET" DDL "adds" it; if
the property is already present, the DDL replaces it.
There is no "ADD" clause in DDLs like "...ALTER table ALTER column..."
(only "SET", "RESET" and "DROP")[2].
Your patch introduces the single DDL version with "...ALTER column
ADD..." for a property.

>> 16.
changing a type does not change an underlying sequence type (its
limits):

> It does change the type, but changing the type doesn't change the
> limits. That is a property of how ALTER SEQUENCE works, which was
> separately discussed.

Are you referring to the thread[1]? If so, I'd say the current behavior
is not good.
I sent an example of a bad experience for users who know nothing about
sequences (because they deal with identity columns).
Would it be better to change the bounds of the sequence to the bounds of
the new type when they match the bounds of the old type?

> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

[1] https://www.postgresql.org/message-id/flat/898deb94-5265-37cf-f199-4f79ef864...@2ndquadrant.com
[2] https://www.postgresql.org/docs/current/static/sql-altertable.html

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] identity columns
On 2/28/17, Peter Eisentraut wrote: > New patch that fixes everything. ;-) Great work! > On 1/4/17 19:34, Vitaly Burovoy wrote: >> 1. The fact COPY ignores GENERATED ALWAYS constraint (treats as >> GENERATED BY DEFAULT) should be mentioned as well as rules. > > fixed (documentation added) > > What do you mean for rules? I meant the phrase "However, it will not invoke rules." mentioned there. For rewrite rules this behavior is mentioned on the COPY page, not on the "CREATE RULE" page. I think it would be better if that behavior is placed on both CREATE TABLE and COPY pages. >> 2. Usually error message for identical columns (with LIKE clause) >> looks like this: >> test=# CREATE TABLE def(i serial, n1 int NOT NULL, n2 int); >> CREATE TABLE >> test=# CREATE TABLE test(i serial, LIKE def INCLUDING ALL); >> ERROR: column "i" specified more than once >> >> but for generated columns it is disorienting enough: >> test=# CREATE TABLE idnt(i int GENERATED BY DEFAULT AS IDENTITY); >> CREATE TABLE >> test=# CREATE TABLE test(i int GENERATED BY DEFAULT AS IDENTITY, LIKE >> idnt INCLUDING ALL); >> ERROR: relation "test_i_seq" already exists > > Yeah, this is kind of hard to fix though, because the sequence names > are decided during parse analysis, when we don't see that the same > sequence name is also used by another new column. We could maybe > patch around it in an ugly way, but it doesn't seem worth it for this > marginal case. OK >> 3. Strange error (not about absence of a column; but see pp.5 and 8): >> test=# ALTER TABLE idnt ALTER COLUMN o ADD GENERATED ALWAYS AS IDENTITY; >> ERROR: identity column type must be smallint, integer, or bigint > > What's wrong with that? The column mentioned there does not exist. 
Therefore the error should be about that, not about the type of an
absent column:

test=# CREATE TABLE idnt(i int GENERATED BY DEFAULT AS IDENTITY, t TEXT);
CREATE TABLE
test=# ALTER TABLE idnt ALTER COLUMN o TYPE int; -- expected error message
ERROR: 42703: column "o" of relation "idnt" does not exist
test=# -- compare with:
test=# ALTER TABLE idnt ALTER COLUMN o ADD GENERATED ALWAYS AS IDENTITY; -- strange error message
ERROR: identity column type must be smallint, integer, or bigint

>> 5. An attempt to make a column an identity fails:
>> test=# ALTER TABLE idnt ALTER COLUMN n1 SET GENERATED BY DEFAULT;
>> ERROR: column "n1" of relation "idnt" is not an identity column
>>
>> As I understand from the Spec, chapter 11.20 General Rule 2)b) speaks
>> about the final state of a column without mentioning the initial
>> state.
>> Therefore even if the column has the initial state "not generated",
>> after the command it changes the state to either "GENERATED ALWAYS" or
>> "GENERATED BY DEFAULT".
>
> 11.12 states that "If specification> or is specified, then C
> shall be an identity column." So this clause is only to alter an
> existing identity column, not make a new one.

Ouch. Right. The rule was at the upper level.
Nevertheless, even now we don't follow the spec's rules exactly. For
example, we allow "SET NOT NULL" and "TYPE" for such columns, which the
spec restricts.
Since we extend the spec by "ADD GENERATED...", "SEQUENCE NAME" and
allow more than one identity column, why can't we extend it by allowing
"SET GENERATED" for non-identity columns and drop "ADD GENERATED..."?

>> 6. The docs mention a syntax:
>> ALTER [ COLUMN ] column_name { SET GENERATED { ALWAYS | BY DEFAULT } |
>> SET sequence_option | RESET } [...]
>>
>> but the command fails:
>> test=# ALTER TABLE idnt ALTER COLUMN i RESET;
>> ERROR: syntax error at or near ";"
>> LINE 1: ALTER TABLE idnt ALTER COLUMN i RESET;
>
> This works for me. Check again please.
I'm sorry, it still does not work for me (perhaps you mixed it up with
"RESTART"):

test=# ALTER TABLE idnt ALTER COLUMN i RESET;
ERROR: syntax error at or near ";"
LINE 1: ALTER TABLE idnt ALTER COLUMN i RESET;

>> 7. (the same line of the doc)
>> usually an ellipsis appears inside parentheses when clauses can be
>> repeated; in other words it should be written as:
>> ALTER column_name ( { SET GENERATED { ALWAYS |
>> BY DEFAULT } | } [...] )
>
> But there are no parentheses in that syntax. I think the syntax
> synopsis as written is correct.

I was wrong. Your version is correct.

>> 9. The command fails:
>> test=# ALTER TABLE idnt ALTER n2 ADD GENERATED ALWAYS A
Re: [HACKERS] identity columns
On 3/15/17, Peter Eisentraut wrote:
> Vitaly, will you be able to review this again?
>
> --
> Peter Eisentraut http://www.2ndQuadrant.com/

I apologize for the delay. Yes, I'm going to do it by Sunday.

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS][REVIEW] macaddr 64 bit (EUI-64) datatype support
Hello,

I've reviewed the patch[1]. Result of testing:

make installcheck-world: tested, passed
Implements feature:      tested, passed
Spec compliant:          tested, passed
Documentation:           tested, passed

The patch introduces a new type macaddr8 for EUI-64 addresses[2]
(assuming the OUI field is 24 bits wide) with EUI-48 (the existing
"macaddr" type) interoperability. It is mostly a copy-pasted macaddr
implementation with the changes necessary for the wider range.
Consensus on this implementation was reached earlier in this thread.

There are two patch files for convenient reviewing: the base macaddr8
implementation and its support in the btree-gin and btree-gist indexes.

The patch:
* cleanly applies to the current master (6af8b89adba16f97bee0d3b01256861e10d0e4f1);
* passes tests;
* looks fine and follows the PostgreSQL style guide;
* has documentation changes;
* has tests.

All notes and requirements were taken into account and the patch was
changed accordingly. I have no suggestions for improving it.

The new status of this patch is: Ready for Committer

P.S.:
1. The docs and error/hint messages should be proofread by a native speaker.
2. A committer should bump catversion. It is not bumped in the patch
because it is unclear when it will be committed.

[1] https://postgr.es/m/CAJrrPGeT8zrGPMcRVk_wRvYD-ETcgUz6WRrc2C=inubmrkr...@mail.gmail.com
[2] http://standards.ieee.org/develop/regauth/tut/eui64.pdf

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On 1/27/17, Haribabu Kommi wrote:
> Updated patches are attached.

Hello,

I'm almost ready to mark it as Ready for Committer. The final round.

1.
>+DATA(insert OID = 774 ( macaddr8 ...
>+#define MACADDR8OID 972
What does this number (972) mean? I think it should be 774, the same
number as was used in the type definition.

Since editing is required anyway, please fix the whitespace as well:

2.
>+DATA(insert OID = 3371 ( 403 macaddr8_ops PGNSP PGUID ));
>+DATA(insert OID = 3372 ( 405 macaddr8_ops PGNSP PGUID ));
Only one tab (not three) should be used before "PGNSP" (see the other
rows around yours: "OID = 1983", "OID = 1991" etc).

3.
>diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
Some new rows have two tabs instead of eight spaces.

4. Some other files need to be fixed by pgindent (it helps future
maintenance):
contrib/btree_gist/btree_macaddr8.c (a lot of rows)
src/include/utils/inet.h (I have no idea why it changes the indentation
to tabs for macaddr8 whereas it leaves spaces for macaddr)

5. src/backend/utils/adt/network.c
pgindent makes it uglier. I've just found how to write the line so it
survives:
res *= ((double) 256) * 256 * 256 * 256;

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On 1/25/17, Haribabu Kommi wrote:
> On Wed, Jan 25, 2017 at 6:43 PM, Vitaly Burovoy wrote:
>
>> On 1/23/17, Haribabu Kommi wrote:
>> > The patch is split into two parts.
>> > 1. Macaddr8 datatype support
>> > 2. Contrib module support.
>>
>> Hello,
>>
>> I'm sorry for the delay.
>> The patch is almost done, but I have two requests since the last review.
>
> Thanks for the review.
>
>> 1.
>> src/backend/utils/adt/mac8.c:
>> + int a,
>> + b,
>> + c,
>> + d = 0,
>> + e = 0,
>> ...
>>
>> There is no reason to set them to 0. For EUI-48 they will be
>> reassigned in the "if (count != 8)" block, for EUI-64 -- in one of the
>> sscanf calls.
>> They could be set to "d = 0xFF, e = 0xFE," to avoid the "if" block
>> mentioned above, but that makes the code much less readable.
>
> Changed accordingly.

I'm going to do (I hope) a final review tonight. Please remove the
initialization of the variables "d" and "e", since there is really no
reason to keep them zero even briefly.

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
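The EUI-48-to-EUI-64 mapping being discussed (stuffing 0xFF/0xFE into the two middle octets when only six octets were parsed) can be sketched in C. The struct layout and the function name here are assumptions for illustration, not the actual mac8.c code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of macaddr8's eight per-octet fields. */
typedef struct
{
    uint8_t a, b, c, d, e, f, g, h;
} macaddr8;

/*
 * Build a macaddr8 from six parsed octets (EUI-48 input) or eight
 * (EUI-64 input).  For six octets, the two middle octets are set to
 * 0xFF and 0xFE, as EUI-64 prescribes for addresses derived from
 * EUI-48; this is the single "count == 6" branch the review asks for,
 * so d and e never need a dummy zero initialization.
 */
static void
macaddr8_set(macaddr8 *result, const uint8_t *octets, int count)
{
    result->a = octets[0];
    result->b = octets[1];
    result->c = octets[2];
    if (count == 6)
    {
        result->d = 0xFF;
        result->e = 0xFE;
        result->f = octets[3];
        result->g = octets[4];
        result->h = octets[5];
    }
    else
    {
        result->d = octets[3];
        result->e = octets[4];
        result->f = octets[5];
        result->g = octets[6];
        result->h = octets[7];
    }
}
```
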
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On 1/23/17, Haribabu Kommi wrote:
> The patch is split into two parts.
> 1. Macaddr8 datatype support
> 2. Contrib module support.

Hello,

I'm sorry for the delay.
The patch is almost done, but I have two requests since the last review.

1.
src/backend/utils/adt/mac8.c:
+ int a,
+ b,
+ c,
+ d = 0,
+ e = 0,
...

There is no reason to set them to 0. For EUI-48 they will be reassigned
in the "if (count != 8)" block, for EUI-64 -- in one of the sscanf
calls.
They could be set to "d = 0xFF, e = 0xFE," to avoid the "if" block
mentioned above, but that makes the code much less readable.

Oh. I see. In the current version they must be assigned, because for
EUI-48 the uninitialized variables could hold values <0 or >255.
It is better to avoid the initialization by merging two parts of the
code:

+ if (count != 6)
+ {
+ /* May be a 8-byte MAC address */
...
+ if (count != 8)
+ {
+ d = 0xFF;
+ e = 0xFE;
+ }

into a single one:

+ if (count == 6)
+ {
+ d = 0xFF;
+ e = 0xFE;
+ }
+ else
+ {
+ /* May be a 8-byte MAC address */
...

2.
src/backend/utils/adt/network.c:
+ res = (mac->a << 24) | (mac->b << 16) | (mac->c << 8) | (mac->d);
+ res = (double)((uint64)res << 32);
+ res += (mac->e << 24) | (mac->f << 16) | (mac->g << 8) | (mac->h);

Hmm... I trust that modern compilers can do a lot of optimizations, but
to me this looks terrible because of the needless conversions. The
reason earlier versions had two "res *= 256 * 256" lines was integer
overflow with four int multipliers, but that can be solved by making the
first multiplier a double:

+ res = (mac->a << 24) | (mac->b << 16) | (mac->c << 8) | (mac->d);
+ res *= (double)256 * 256 * 256 * 256;
+ res += (mac->e << 24) | (mac->f << 16) | (mac->g << 8) | (mac->h);

In this case the right-hand side of the "*=" operator is computed at
compile time as a single constant. The second line could also be written
as "res *= 256. * 256 * 256 * 256;" (note the dot in the first
multiplier), but that is not readable at all (and produces the same
code).

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
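The idea above — pack each half of the 64-bit address into 32 bits and scale the high half by a constant that the compiler folds at compile time — can be sketched as follows. The struct and function name are assumed for illustration; unlike the snippets in the mail, the shifts here go through uint32_t casts so no bit is ever shifted into an int's sign bit:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of macaddr8's eight per-octet fields. */
typedef struct
{
    uint8_t a, b, c, d, e, f, g, h;
} macaddr8;

/*
 * Convert an 8-byte MAC address to double without 64-bit integer
 * arithmetic: pack each half into a uint32, then scale the high half
 * by 2^32.  "(double) 256 * 256 * 256 * 256" is folded into a single
 * double constant at compile time.
 */
static double
macaddr8_to_double(const macaddr8 *mac)
{
    uint32_t hi = ((uint32_t) mac->a << 24) | ((uint32_t) mac->b << 16) |
                  ((uint32_t) mac->c << 8) | mac->d;
    uint32_t lo = ((uint32_t) mac->e << 24) | ((uint32_t) mac->f << 16) |
                  ((uint32_t) mac->g << 8) | mac->h;
    double res = hi;

    res *= (double) 256 * 256 * 256 * 256;      /* 2^32 */
    res += lo;
    return res;
}
```
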
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On 1/4/17, Haribabu Kommi wrote:
> On Tue, Nov 29, 2016 at 8:36 PM, Haribabu Kommi wrote:
>> Updated patch attached with added cast function from macaddr8 to
>> macaddr.
>>
>> Currently there is no support for cross operators. Is this required
>> in this patch, or can it be handled later if needed?
>
> Updated patch attached to address the duplicate OID problem.
> There are no other functional changes to the previous patch.

Hello, several thoughts about the patch:

Documentation:

1.
+ The remaining six input formats are not part of any standard.
Which ones (which formats are the remaining six)?

2.
+ trunc(macaddr8) returns a MAC
+ address with the last 3 bytes set to zero.
It is a misprint or a copy-paste error. The implementation and the
standard say that the last five bytes are set to zero and the first
three are left as is.

3.
+ for lexicographical ordering
I'm not a native English speaker, but I'd say just "for ordering", since
there are no words; it is just a big number with a special input/output
format.

The code:

4.
+ if ((a == 0) && (b == 0) && (c == 0) && (d == 0)
+ && (e == 0) && (f == 0) && (g == 0) && (h == 0))
...
+ if ((a == 255) && (b == 255) && (c == 255) && (d == 255)
+ && (e == 255) && (f == 255) && (g == 255) && (h == 255))
The standard forbids these values in real hardware; there is no reason
to block them as input data. Moreover, these values can be stored as a
result of binary operations, so users could dump them but not restore
them.

5.
+ if (!eight_byte_address)
+ {
+ result->d = 0xFF;
...
Don't you think the following version is simpler (all sscanf calls for
macaddr6 skip "d" and "e")?
+ count = sscanf(str, "%x:%x:%x:%x:%x:%x%1s",
+ &a, &b, &c, &f, &g, &h, junk);
...
+ if (!eight_byte_address)
+ {
+ d = 0xFF;
+ e = 0xFE;
+ }
+ result->a = a;
+ result->b = b;
+ result->c = c;
+ result->d = d;
+ result->e = e;
+ result->f = f;
+ result->g = g;
+ result->h = h;

Also:
+ if (buf->len == 6)
+ {
+ addr->d = 0xFF;
+ addr->e = 0xFE;
+ addr->f = pq_getmsgbyte(buf);
+ addr->g = pq_getmsgbyte(buf);
+ addr->h = pq_getmsgbyte(buf);
+ }
+ else
+ {
+ addr->d = pq_getmsgbyte(buf);
+ addr->e = pq_getmsgbyte(buf);
+ addr->f = pq_getmsgbyte(buf);
+ addr->g = pq_getmsgbyte(buf);
+ addr->h = pq_getmsgbyte(buf);
+ }
can be written as:
+ if (buf->len == 6)
+ {
+ addr->d = 0xFF;
+ addr->e = 0xFE;
+ }
+ else
+ {
+ addr->d = pq_getmsgbyte(buf);
+ addr->e = pq_getmsgbyte(buf);
+ }
+ addr->f = pq_getmsgbyte(buf);
+ addr->g = pq_getmsgbyte(buf);
+ addr->h = pq_getmsgbyte(buf);
but that is only my point of view (no need to pay close attention that
only those two bytes are written differently, not the last three); it is
not an error.

6.
+errmsg("macaddr8 out of range to convert to macaddr")));
I think a hint should be added saying which values can be converted to
"macaddr".

7.
+DATA(insert ( 829 774 4123 i f ));
+DATA(insert ( 774 829 4124 i f ));
It is nitpicking, but try to use tabs as in the surrounding code (there
is a tab between "774" and "829" and a space instead of a tab between
"829" and "4124").

8.
+#define hibits(addr) \
+ ((unsigned long)(((addr)->a<<24)|((addr)->b<<16)|((addr)->c<<8)))
+
+#define lobits(addr) \
+ ((unsigned long)(((addr)->d<<16)|((addr)->e<<8)|((addr)->f)))
+
+#define lobits_extra(addr) \
+ ((unsigned long)((addr)->g<<8)|((addr)->h))
I previously said that fitting all 4 bytes was a wrong idea[1]:
> The macros "hibits" should be 3 octets long, not 4; ...
but now I do not think so (there is no UB[2] because source and
destination are not signed). Moreover, you have already filled the
topmost byte of "hibits" by shifting by 24. Using those two macros
("hibits" and "lobits") avoids two extra comparisons in
macaddr8_cmp_internal.
Version from the "macaddr64_poc.patch" is correct. [1]https://www.postgresql.org/message-id/CAKOSWNng9_+=fvo6oz4tgv1kkhmom11ankihbokpuzki1ca...@mail.gmail.com [2]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1817.htm -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
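The four-bytes-per-half layout argued for above can be sketched like this. The helper names follow the patch's macros, but the struct layout and the comparison function body are illustrative assumptions, written as functions with explicit uint32_t casts rather than the patch's macros:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of macaddr8's eight per-octet fields. */
typedef struct
{
    uint8_t a, b, c, d, e, f, g, h;
} macaddr8;

/* Pack four octets per half so a comparison needs only two uint32
 * comparisons, with no third lobits_extra part. */
static uint32_t
hibits(const macaddr8 *addr)
{
    return ((uint32_t) addr->a << 24) | ((uint32_t) addr->b << 16) |
           ((uint32_t) addr->c << 8) | addr->d;
}

static uint32_t
lobits(const macaddr8 *addr)
{
    return ((uint32_t) addr->e << 24) | ((uint32_t) addr->f << 16) |
           ((uint32_t) addr->g << 8) | addr->h;
}

/* Lexicographic byte-order comparison via the two packed halves. */
static int
macaddr8_cmp_internal(const macaddr8 *a1, const macaddr8 *a2)
{
    if (hibits(a1) < hibits(a2))
        return -1;
    if (hibits(a1) > hibits(a2))
        return 1;
    if (lobits(a1) < lobits(a2))
        return -1;
    if (lobits(a1) > lobits(a2))
        return 1;
    return 0;
}
```
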
Re: [HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints
On 1/5/17, Tom Lane wrote:
> Vitaly Burovoy writes:
>> On 1/5/17, Tom Lane wrote:
>>> My point is that ideally, any value that can physically fit into struct
>>> Interval ought to be considered valid. The fact that interval_out can't
>>> cope is a bug in interval_out, which ideally we would fix without
>>> artificially restricting the range of the datatype.
>
>> Am I correct that we are still limited by ECPG which is limited by the
>> system "tm" struct?
>
> I'm not really that concerned about whether ECPG can deal with enormous
> intervals. If it bothers you, and you want to go fix it, more power to
> you --- but I think fixing the backend is much higher priority.

I really do not think that is possible, since it uses the system struct.
My point is that ECPG is part of Postgres; we can't fix the server side
and leave the rest.

>> Also users who use a binary protocol can also use the "tm" struct and
>> can not expect overflow.
>
> If they store an enormous interval value, its really up to them not to
> choke on it when they read it back. Not our problem.

Those who store values and those who read them back can be different
groups of people. For example, the second group includes library
writers.
My point is that the interval type is big enough anyway, and limiting it
can protect people from errors. Because in the end:
* either they have an error in their data, and the change will not break
their result (since it is already wrong because of wrong source data);
* or they still get an overflow and have to find a solution, with this
patch or without it;
* or their values fall into the ~16% range between what fits as hours
into an "int" and what fits as seconds into PG_INT64, and they get a
silently corrupted result because of ECPG or a library.
A solution for the third case can be the same as for the second one,
because those groups can be the same (just with different data).

>> The docs say[1] the highest value of the interval type is 178_000_000
>> years which is
>
> ... irrelevant really. That's talking about the largest possible value of
> the "months" field, viz (2^31-1)/12. Perhaps we ought to document the
> other field limits, but right now there's nothing there about how large
> the hours field can get.

No, that was a bad explanation of my point: currently nothing is
documented about the "seconds" part of the "interval" type, so we can
simply declare a limit. Users don't care about the reason for a
limitation; they will just plan a workaround if their data does not fit,
whatever the limits are.

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints
On 1/5/17, Tom Lane wrote:
> Vitaly Burovoy writes:
>> On 1/5/17, Tom Lane wrote:
>>> We could think about replacing interval2tm's output format with some
>>> other struct that uses a TimeOffset for hours and so cannot overflow.
>>> I'm not sure though how far the effects would propagate; it might be
>>> more work than we want to put into this.
>
>> If values with overflow are already in a database, what do you expect
>> a new output function should fix?
>
> My point is that ideally, any value that can physically fit into struct
> Interval ought to be considered valid. The fact that interval_out can't
> cope is a bug in interval_out, which ideally we would fix without
> artificially restricting the range of the datatype.
>
> Now, the problem with that of course is that it's not only interval_out
> but multiple other places. But your proposed patch also requires touching
> nearly everything interval-related, so I'm not sure it has any advantage
> that way over the less restrictive answer.

OK. I'm trying to limit the values from what fits into PG_INT64_MAX
microseconds (about 292471.2 years) down to what fits as an integer
number of hours (about 245146.5 years), i.e. to cut about 47325 years,
or ~16% -- but only for the interval's part measured in seconds.

Am I correct that we are still limited by ECPG, which is limited by the
system "tm" struct?
Users of the binary protocol can also use the "tm" struct and do not
expect overflow.

The docs say[1] the highest value of the interval type is 178000000
years, which is significantly bigger than both values (current and
limited) measured in seconds. I don't think applying that limitation is
a big deal.
I tried to find the functions that should be touched, and there are not
so many of them:

-- internal functions:
interval_check_overflow(Interval *i)
tm2interval(struct pg_tm * tm, fsec_t fsec, Interval *span)
interval2tm(Interval span, struct pg_tm * tm, fsec_t *fsec)
intervaltypmodleastfield(int32 typmod)

-- output functions:
interval_out(PG_FUNCTION_ARGS) -- skip, output
interval_send(PG_FUNCTION_ARGS) -- skip, output

-- main input/computing functions:
interval_in(PG_FUNCTION_ARGS) -- forgot to cover
interval_recv(PG_FUNCTION_ARGS) -- covered
make_interval(PG_FUNCTION_ARGS) -- covered
AdjustIntervalForTypmod(Interval *interval, int32 typmod) -- can lead to overflow
interval_justify_interval(PG_FUNCTION_ARGS) -- can lead to overflow
interval_justify_hours(PG_FUNCTION_ARGS) -- can lead to overflow
interval_justify_days(PG_FUNCTION_ARGS) -- can lead to overflow
interval_um(PG_FUNCTION_ARGS) -- covered
interval_pl(PG_FUNCTION_ARGS) -- covered
interval_mi(PG_FUNCTION_ARGS) -- covered
interval_mul(PG_FUNCTION_ARGS) -- covered

-- cannot lead to overflow:
interval_transform(PG_FUNCTION_ARGS)
interval_div(PG_FUNCTION_ARGS)
interval_trunc(PG_FUNCTION_ARGS)
interval_part(PG_FUNCTION_ARGS)

-- based on other functions:
interval_scale(PG_FUNCTION_ARGS) -- based on AdjustIntervalForTypmod
interval_accum(PG_FUNCTION_ARGS) -- based on interval_pl
interval_accum_inv(PG_FUNCTION_ARGS) -- based on interval_mi
interval_avg(PG_FUNCTION_ARGS) -- based on interval_pl
interval_combine(PG_FUNCTION_ARGS) -- based on interval_pl
mul_d_interval(PG_FUNCTION_ARGS) -- based on interval_mul

-- checking, not changing:
interval_finite(PG_FUNCTION_ARGS)
interval_smaller(PG_FUNCTION_ARGS)
interval_larger(PG_FUNCTION_ARGS)
interval_cmp_value(const Interval *interval)
interval_cmp_internal(Interval *interval1, Interval *interval2)
interval_eq(PG_FUNCTION_ARGS)
interval_ne(PG_FUNCTION_ARGS)
interval_lt(PG_FUNCTION_ARGS)
interval_gt(PG_FUNCTION_ARGS)
interval_le(PG_FUNCTION_ARGS)
interval_ge(PG_FUNCTION_ARGS)
interval_cmp(PG_FUNCTION_ARGS)
interval_hash(PG_FUNCTION_ARGS)

-- not applicable:
intervaltypmodin(PG_FUNCTION_ARGS)
intervaltypmodout(PG_FUNCTION_ARGS)
timestamp_pl_interval(PG_FUNCTION_ARGS)
timestamp_mi_interval(PG_FUNCTION_ARGS)
timestamptz_pl_interval(PG_FUNCTION_ARGS)
timestamptz_mi_interval(PG_FUNCTION_ARGS)

In fact, only six of them (not "touching nearly everything
interval-related") should be covered:
* interval_in
* interval_out (yes, the SAMESIGN there is useless)
* AdjustIntervalForTypmod
* interval_justify_interval
* interval_justify_hours
* interval_justify_days

[1] https://www.postgresql.org/docs/current/static/datatype-datetime.html

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
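A quick sanity check of the ranges discussed in this subthread, using 365-day years. The cap is expressed here as "the hours field must fit into a 32-bit int", which is an assumption about the proposed limit, not a statement of the committed behavior:

```c
#include <assert.h>
#include <stdint.h>

#define USECS_PER_HOUR 3600000000.0

/*
 * Interval's time field is an int64 count of microseconds; convert a
 * microsecond count to (365-day) years to compare the full int64
 * range against the proposed "hours fit into int32" cap.
 */
static double
usecs_to_years(double usecs)
{
    return usecs / USECS_PER_HOUR / 24.0 / 365.0;
}
```

Running the numbers this way reproduces the figures in the mail: the full int64 microsecond range is about 292471.2 years, the int32-hours cap is about 245146.5 years, and the difference is roughly 16%.
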
Re: [HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints
On 1/5/17, Tom Lane wrote:
> Vitaly Burovoy writes:
>>> I've written a patch which fixes that bug (in attachment).
>>> Should it be registered in the CF?
>
>> Oops. Forgot to attach the patch. Fixed.
>
> I suspect that many of these SAMESIGN() tests you've added are not
> actually adequate/useful. That's only sufficient when the output could be
> at most a factor of 2 out-of-range. If it could overflow past the sign
> bit then you need to work harder.

I disagree. Those SAMESIGN() tests cover additions, where the result
cannot be more than a factor of two out of range. The multiplications
are checked just before them.

> (By the same token, the existing SAMESIGN test in interval2tm is
> wrong.)
>
> Possibly we should consider alternatives before plowing ahead in this
> direction, since adding guards to interval_in and interval computations
> doesn't help with oversize values that are already stored in a database.

We can do nothing about values that are already stored in a database,
but we can prevent new ones from appearing later.

> We could think about replacing interval2tm's output format with some
> other struct that uses a TimeOffset for hours and so cannot overflow.
> I'm not sure though how far the effects would propagate; it might be
> more work than we want to put into this.

If values with overflow are already in a database, what do you expect a
new output function to fix?

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
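The adequacy argument for addition can be illustrated directly: when two same-sign int64 values are added, the mathematical sum exceeds the representable range by less than a factor of two, so a wrapped result always lands on the opposite sign. A sketch, with SAMESIGN written as in the backend's timestamp.c but the helper function name made up for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SAMESIGN(a, b) (((a) < 0) == ((b) < 0))

/*
 * Overflow-checked int64 addition.  The addition is performed in
 * unsigned arithmetic (wrap-around is well defined there, and the
 * conversion back relies on the usual two's-complement behavior),
 * then the SAMESIGN test detects overflow: if both inputs share a
 * sign but the wrapped result does not, the addition overflowed.
 * This works for addition precisely because the true sum can be at
 * most a factor of two out of range.
 */
static bool
add_overflows(int64_t a, int64_t b, int64_t *result)
{
    *result = (int64_t) ((uint64_t) a + (uint64_t) b);
    return SAMESIGN(a, b) && !SAMESIGN(*result, a);
}
```
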
[HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints
On 1/5/17, Vitaly Burovoy wrote: > On 1/4/17, Pantelis Theodosiou wrote: >> On Wed, Jan 4, 2017 at 3:03 PM, wrote: >> >>> The following bug has been logged on the website: >>> >>> Bug reference: 14486 >>> Logged by: Per Modin >>> Email address: web+postgre...@modin.io >>> PostgreSQL version: 9.6.1 >>> Operating system: Linux 303a92173594 4.8.15-moby #1 SMP Sat Dec 17 0 >>> Description: >>> >>> Found this bug in 9.4.8, tried it in docker towards psql 9.6.1 and it's >>> in >>> there as well. A minimum working example would be as follows: >>> >>> ``` >>> postgres=# CREATE TABLE tbl AS SELECT 9223372036854 * interval '1 >>> second' >>> col; TABLE tbl; >>> SELECT 1 >>> ERROR: interval out of range >>> ``` >>> >>> ``` >>> postgres=# SELECT count(*) FROM tbl; >>> count >>> --- >>> 1 >>> (1 row) >>> ``` >>> >>> It seems that inserting and retrieving data have different constraints. >>> As >>> you >>> can see from query 2, the data still gets inserted. >>> >>> ``` >>> postgres=# select version(); >>> version >>> >>> -- >>> PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian >>> 4.9.2-10) >>> 4.9.2, 64-bit >>> (1 row) >>> ``` >>> >>> Best regards, >>> Per Modin >>> >>> >> And these work without error: >> >> postgres=# select col - 9223372036854 * interval '1 second' from tbl ; >> ?column? >> -- >> 00:00:00 >> (1 row) >> >> postgres=# select col from xx where col < interval '10 year' ; >> col >> - >> (0 rows) >> > > Yes, it is a bug, but it is not a constraint, it is just different > internal checks. > Moreover even if a function does not raise an error, output could be wrong > (pay attention to the duplicated '-' sign) > postgres=# SELECT '-2147483647:59:59'::interval - '1s'::interval; > ?column? > > --2147483648:00:00 > (1 row) > > I've written a patch which fixes that bug (in attachment). > Should it be registered in the CF? Oops. Forgot to attach the patch. Fixed. 
-- Best regards, Vitaly Burovoy 0001-Add-check-for-overflow-to-interval-functions.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [BUGS][PATCH] BUG #14486: Inserting and selecting interval have different constraints
On 1/4/17, Pantelis Theodosiou wrote: > On Wed, Jan 4, 2017 at 3:03 PM, wrote: > >> The following bug has been logged on the website: >> >> Bug reference: 14486 >> Logged by: Per Modin >> Email address: web+postgre...@modin.io >> PostgreSQL version: 9.6.1 >> Operating system: Linux 303a92173594 4.8.15-moby #1 SMP Sat Dec 17 0 >> Description: >> >> Found this bug in 9.4.8, tried it in docker towards psql 9.6.1 and it's >> in >> there as well. A minimum working example would be as follows: >> >> ``` >> postgres=# CREATE TABLE tbl AS SELECT 9223372036854 * interval '1 second' >> col; TABLE tbl; >> SELECT 1 >> ERROR: interval out of range >> ``` >> >> ``` >> postgres=# SELECT count(*) FROM tbl; >> count >> --- >> 1 >> (1 row) >> ``` >> >> It seems that inserting and retrieving data have different constraints. >> As >> you >> can see from query 2, the data still gets inserted. >> >> ``` >> postgres=# select version(); >> version >> >> -- >> PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian >> 4.9.2-10) >> 4.9.2, 64-bit >> (1 row) >> ``` >> >> Best regards, >> Per Modin >> >> > And these work without error: > > postgres=# select col - 9223372036854 * interval '1 second' from tbl ; > ?column? > -- > 00:00:00 > (1 row) > > postgres=# select col from xx where col < interval '10 year' ; > col > - > (0 rows) > Yes, it is a bug, but it is not a constraint, it is just different internal checks. Moreover even if a function does not raise an error, output could be wrong (pay attention to the duplicated '-' sign) postgres=# SELECT '-2147483647:59:59'::interval - '1s'::interval; ?column? --2147483648:00:00 (1 row) I've written a patch which fixes that bug (in attachment). Should it be registered in the CF? -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
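For what it's worth, the per-field overflow check the patch needs can be sketched in C. This is an illustrative stand-in, not the attached patch: the struct is a simplified copy of PostgreSQL's Interval layout, the function name is made up, and it assumes the GCC/Clang __builtin_sub_overflow builtins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's Interval struct */
typedef struct Interval
{
    int64_t time;   /* microseconds */
    int32_t day;
    int32_t month;
} Interval;

/*
 * Subtract two intervals, reporting overflow instead of wrapping.
 * Returns true on success, false on overflow -- the case where the
 * unpatched code silently produces a wrong value such as
 * "--2147483648:00:00" above.
 */
static bool
interval_mi_safe(const Interval *a, const Interval *b, Interval *result)
{
    if (__builtin_sub_overflow(a->time, b->time, &result->time))
        return false;
    if (__builtin_sub_overflow(a->day, b->day, &result->day))
        return false;
    if (__builtin_sub_overflow(a->month, b->month, &result->month))
        return false;
    return true;
}
```

The same pattern applies to addition and multiplication; the point is that every arithmetic step gets a check, so insertion and retrieval can no longer disagree.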
Re: [HACKERS] identity columns
R VALUES FROM ('2016-07-01') TO ('2016-08-01'); ERROR: cannot inherit from table with identity column test=# INSERT INTO measurement(logdate) VALUES('2016-07-10'); ERROR: no partition of relation "measurement" found for row DETAIL: Failing row contains (1, 2016-07-10, null, null). 12. You've introduced a new parameter for a sequence declaration ("SEQUENCE NAME") which is not mentioned in the docs and not supported: a2=# CREATE SEQUENCE s SEQUENCE NAME s; ERROR: option "sequence_name" not recognized I think the error should look as: a2=# CREATE SEQUENCE s SEQUENCE__ NAME s; ERROR: syntax error at or near "SEQUENCE__" LINE 1: CREATE SEQUENCE s SEQUENCE__ NAME s; ^ 13. Also strange error message: test=# CREATE SCHEMA sch; test=# ALTER TABLE idnt ALTER COLUMN n1 ADD GENERATED BY DEFAULT AS IDENTITY (SEQUENCE NAME sch.s START 1); ERROR: relation "sch.idnt" does not exist But if a table sch.idnt exists, it leads to a silent inconsistency: test=# CREATE TABLE sch.idnt(n1 int); CREATE TABLE test=# ALTER TABLE idnt ALTER COLUMN n1 ADD GENERATED BY DEFAULT AS IDENTITY (SEQUENCE NAME sch.s START 1); ALTER TABLE test=# DROP TABLE sch.idnt; ERROR: could not find tuple for attrdef 0 Also dump/restore fails for it. Note that by default sequences have different error message: test=# CREATE SEQUENCE sch.s1 OWNED BY idnt.n1; ERROR: sequence must be in same schema as table it is linked to 14. Wrong hint message: test=# DROP SEQUENCE idnt_i_seq; ERROR: cannot drop sequence idnt_i_seq because default for table idnt column i requires it HINT: You can drop default for table idnt column i instead. but according to DDL there is no "default" which can be dropped: test=# ALTER TABLE idnt ALTER COLUMN i DROP DEFAULT; ERROR: column "i" of relation "idnt" is an identity column 15. And finally. The command: test=# CREATE TABLE t(i int GENERATED BY DEFAULT AS IDENTITY (SEQUENCE NAME s)); leads to a core dump. It happens when no sequence parameter (like "START") is set. 
[1]https://www.postgresql.org/docs/devel/static/sql-createtable.html -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On 10/12/16, Tom Lane wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> Vitaly Burovoy writes:
>>>> P.S.: I still think it is a good idea to change storage format,
>>> I'm not sure which part of "no" you didn't understand,

I just paid attention to the words "likelihood" (mixed up with "likeliness"), "we wanted" and "probably". Also there was a note that it "would also break send/recv", but that behavior can be preserved. And after your letter Julien Rouhaud wrote about a mapping from MAC-48 to EUI-64, which removes the need for a bit indicating the version of a stored value. That can be considered new information.

>>> but we're
>>> not breaking on-disk compatibility of existing macaddr columns.

Can I ask why? It would not become a varlena (typstorage would not change); it would just change typlen to 8 and typalign to 'd'. For every major release 9.0, 9.1, 9.2 .. 9.6 the docs say "A dump/restore using pg_dumpall, or use of pg_upgrade, is required". Both handle changes in a storage format, don't they?

>>> Breaking the on-the-wire binary I/O representation seems like a
>>> nonstarter as well.

I wrote that for EUI-48 (MAC-48) values the binary I/O representation can be preserved. The binary format (in the DataRow message) includes the length of the column value, which is reflected in PGresAttValue.len in libpq. A client that works with the binary format must consult the length of the data. So as long as the client works with (and columns hold) MAC-48 values, nothing hurts it, and PGresAttValue.len stays "6" as now.
Only if the new type sends xx:xx:xx:FF:FF:xx:xx:xx instead of the usual (expected) 6 octets. Again, that case in my proposal is similar (in I/O behavior) to "just change 'macaddr' to keep and accept both MAC-48 and MAC-64", but it allows using the "-k" option of pg_upgrade to avoid rewriting possibly huge (for instance, 'log') tables (although users unexpectedly end up with "macaddr6" in their column and function names after the upgrade, which looks strange enough).

> The larger picture here is that we got very little thanks when we squeezed
> IPv6 into the pre-existing inet datatype;

Without any sarcasm, I am very grateful to all the people involved in it, because it spares me (and many other people) from having to distinguish IPv4 and IPv6 at the app level. I write apps and just save the remote address of clients to an "inet" column named "remote_ip" without thinking "what if we start serving clients via IPv6?"; or I have a column named "allowed_ip" with IPs or subnets and just save a client's IPv4 or IPv6 as a white list (and use "allowed_ip >>= $1"). It just works.

> there's a large number of people
> who just said "no thanks" and started using the add-on ip4r type instead.

I found a repository[1] at github. From the description it is understandable why people used ip4r in those days (2005). The reason "Furthermore, they are variable length types (to support ipv6) with non-trivial overheads" is mentioned last in its README. When you deal with IPv4 in 99.999% of cases, storing it in TOAST tables leads to a big penalty, but the new version of macaddr is not that wide, so it does not cause a similar slowdown (it will be stored in place).

> So I'm not sure why we want to complicate our lives in order to make
> macaddr follow the same path.
Because according to the Wiki[3], MAC addresses now "are formed according to the rules of one of three numbering name spaces ...: MAC-48, EUI-48, and EUI-64.", so the IEEE extended the range of allowed values from 48 to 64 bits, and since Postgres claims to support "mac addresses", I (as a developer who still uses PG as a primary database) expect support for any kind of MAC address, not a limited one. I expect it to just work. I refuse to imagine what I would have to do if I had a column of type "macaddr" and unexpectedly had to deal with input of the EUI-64 type. Add a new column or change the column's type? In the first case, what to do with stored procedures? Duplicate an input parameter to pass the new macaddr8 column (if macaddr was passed before)? Duplicate the stored procedure? Also I would have to support two columns at the application level. Why? I just want
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On 10/12/16, Tom Lane wrote: > Vitaly Burovoy writes: >> I'm sorry for the offtopic, but does anyone know a reason why a >> condition in mac.c > >>> if ((a < 0) || (a > 255) || (b < 0) || (b > 255) || >>> (c < 0) || (c > 255) || (d < 0) || (d > 255) || >>> (e < 0) || (e > 255) || (f < 0) || (f > 255)) > >> can not be rewritten as: > >>> if (((a | b | c | d | e | f) < 0) || >>> ((a | b | c | d | e | f) > 255)) > > Well, it's ugly and > it adds a bunch of assumptions about arithmetic > behavior that we don't particularly need to make. It explains the reason, thank you. I'm just not familiar with other architectures where it is not the same as in X86/X86-64. > If this were some > amazingly hot hot-spot then maybe it would be worth making the code > unreadable to save a few nanoseconds, but I doubt that it is. > (Anyway, you've not shown that there actually is any benefit ...) I don't think it has a speed benefit, especially comparing with the sscanf call. But personally for me the second variant does not seem ugly, just "no one bit in all variables is out of a byte" (looks better with comparison with "0xff" as sscanf operates with "%2x"). Sorry for my bad taste and for a noise. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
I'm sorry for the offtopic, but does anyone know a reason why a condition in mac.c > if ((a < 0) || (a > 255) || (b < 0) || (b > 255) || > (c < 0) || (c > 255) || (d < 0) || (d > 255) || > (e < 0) || (e > 255) || (f < 0) || (f > 255)) can not be rewritten as: > if (((a | b | c | d | e | f) < 0) || > ((a | b | c | d | e | f) > 255)) It seems more compact and a compiler can optimize it to keep a result of a binary OR for the comparison with 255... -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
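For comparison, the two forms can be put side by side and checked for agreement. This is illustrative code, not mac.c itself; the compact variant relies on two's-complement sign propagation through bitwise OR, which is exactly the kind of extra arithmetic assumption Tom objects to upthread.

```c
#include <assert.h>
#include <stdbool.h>

/* The explicit range check as written in mac.c: every field in [0, 255]. */
static bool
in_range_explicit(int a, int b, int c, int d, int e, int f)
{
    return !((a < 0) || (a > 255) || (b < 0) || (b > 255) ||
             (c < 0) || (c > 255) || (d < 0) || (d > 255) ||
             (e < 0) || (e > 255) || (f < 0) || (f > 255));
}

/*
 * The proposed compact form: OR all fields together.  On two's-complement
 * machines the result is negative iff some field had its sign bit set,
 * and exceeds 255 iff some field had a bit above the low byte set --
 * so the two functions agree, but only under that representation
 * assumption.
 */
static bool
in_range_ored(int a, int b, int c, int d, int e, int f)
{
    int m = a | b | c | d | e | f;

    return !((m < 0) || (m > 255));
}
```

On common platforms both return the same answer for every input; the readability and portability argument, not correctness on x86, is what decides against the compact form.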
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On 10/12/16, Vitaly Burovoy wrote: > On 10/12/16, Alvaro Herrera wrote: >> Julien Rouhaud wrote: >>> On 12/10/2016 14:32, Alvaro Herrera wrote: >>> > Julien Rouhaud wrote: >>> > >>> >> and you can instead make macaddr64 support both format, and provide a >>> >> macaddr::macaddr64 cast >>> > >>> > Having macaddr64 support both formats sounds nice, but how does it >>> > work? >>> > Will we have to reserve one additional bit to select the >>> > representation? >>> > That would make the type be 65 bits which is a clear loser IMO. >>> > >>> > Is it allowed to just leave 16 bits as zeroes which would indicate >>> > that >>> > the address is EUI48? I wouldn't think so ... >>> >>> From what I read, you can indicate it's an EUI-48 address by storing >>> FF:FF (or FF:FE for MAC-48) in 4th and 5th bytes of the EUI-64 address. >> >> That seems reasonable at first glance; so the new type would be able to >> store both 48-bit and 64-bit values, and there would be assignment casts >> in both directions > > I think either "macaddr" should be renamed to "macaddr6" (saved its > oid), a new type "macaddr8" is added with introducing a new alias > "macaddr" or the current "macaddr" should accept both versions as the > "inet" type does. > > The function "macaddr_recv" can distinguish them by the > StringInfoData.len member, "macaddr_in" - by number of parts split by > ":". > The "macaddr_out" and "macaddr_send" can out 6 octets if the stored > value is mapped to MAC-48. > Storing can be done always as 8 bytes using the rule above. > > In the other case there is hard from user's PoV to detect at the > design stage when it is necessary to define column as macaddr and when > to macaddr8. > If users need to update a column type (a new hardware appears with > EUI-64 address), they should keep in mind that all things are changed > for the new column type - stored procedure's parameters, application > code interacting with the DB etc.). 
> I don't agree with Tom's proposal to introduce a new type, it would be
> inconvenient for users.
>
> We have types "int2", "int4", "int8" and an alias "int" for a type
> "int4", at least psql does not show it:
> postgres=# \dT+ int
> List of data types
> Schema | Name | Internal name | Size | Elements | Owner | Access
> privileges | Description
> +--+---+--+--+---+---+-
> (0 rows)
>
> It is understandable to have 3 types for integers because most space
> of the DB occupied by them and in the real life we just don't use big
> numbers, but cases for "inet" and "macaddr" are different.
>
>> and a suite of operators to enable interoperability
>> of 48-bit values in macaddr8 with values in type macaddr. Right?
>>
>> (The cast function from macaddr8 to macaddr would raise error if the
>> 4th and 5th bytes are not either FF:FF or FF:FE -- I don't think we can
>> in practice distinguish EUI-48 from MAC-48 in this context.
>
> The wikipedia says[1] they are the same things but MAC-48 is an
> obsolete term for a special case, so there is no necessary to
> distinguish them.
>
>> The cast in the other direction would have no restriction and should
>> probably always use FF:FE).
>
> [1]
> https://en.wikipedia.org/wiki/Organizationally_unique_identifier#48-bit_Media_Access_Control_Identifier_.28MAC-48.29

Haribabu Kommi, have you read enough about EUI-64? Your function "macaddr64_trunc" sets the 4 lower bytes to 0, whereas (according to Wikipedia, but I can look up the standard) 3 bytes still define the OUI (organizationally unique identifier), so truncation should zero the 5, not 4, lower octets. The macro "hibits" should cover 3 octets, not 4; "lobits" --- 5 bytes, not 4. Otherwise your comparisons fail. What document did you use when writing the patch? Are the short-form formats correct in macaddr64_in?
P.S.: I still think it is a good idea to change storage format, macaddr_{in,out,send,recv}, fill 4th and 5th bytes if necessary; change "lobits" macros and add new fields to bit operation functions. It avoids a new type, casting, comparison functions between macaddr6 and macaddr8 etc. Proposed patch is mostly copy-paste of mac.c -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
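The P.S. above suggests having macaddr_{in,out,send,recv} handle both widths. On the receive side, the dispatch could look like the following sketch; the struct and function names are hypothetical (not the real mac.c code), and it uses the EUI-48 -> EUI-64 mapping (inserting FF:FE in the middle octets) discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte storage for a MAC address (not the real macaddr struct). */
typedef struct MacAddr8
{
    uint8_t b[8];
} MacAddr8;

/*
 * Sketch of a recv-style routine: a 6-octet payload (EUI-48) is widened
 * to EUI-64 by inserting FF:FE as the 4th and 5th octets; an 8-octet
 * payload is stored as-is.  Any other length is rejected.  This is how
 * the "distinguish by StringInfoData.len" idea works on the wire:
 * existing clients keep sending and receiving 6 bytes untouched.
 */
static bool
mac_recv_sketch(const uint8_t *payload, size_t len, MacAddr8 *out)
{
    if (len == 6)
    {
        memcpy(out->b, payload, 3);
        out->b[3] = 0xFF;
        out->b[4] = 0xFE;
        memcpy(out->b + 5, payload + 3, 3);
        return true;
    }
    if (len == 8)
    {
        memcpy(out->b, payload, 8);
        return true;
    }
    return false;
}
```

A matching send routine would emit 6 octets whenever bytes 4 and 5 are FF:FE, preserving the old on-the-wire representation for MAC-48 values.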
Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
On 10/12/16, Alvaro Herrera wrote:
> Julien Rouhaud wrote:
>> On 12/10/2016 14:32, Alvaro Herrera wrote:
>> > Julien Rouhaud wrote:
>> >
>> >> and you can instead make macaddr64 support both format, and provide a
>> >> macaddr::macaddr64 cast
>> >
>> > Having macaddr64 support both formats sounds nice, but how does it work?
>> > Will we have to reserve one additional bit to select the representation?
>> > That would make the type be 65 bits which is a clear loser IMO.
>> >
>> > Is it allowed to just leave 16 bits as zeroes which would indicate that
>> > the address is EUI48? I wouldn't think so ...
>>
>> From what I read, you can indicate it's an EUI-48 address by storing
>> FF:FF (or FF:FE for MAC-48) in 4th and 5th bytes of the EUI-64 address.
>
> That seems reasonable at first glance; so the new type would be able to
> store both 48-bit and 64-bit values, and there would be assignment casts
> in both directions

I think either "macaddr" should be renamed to "macaddr6" (keeping its OID) and a new type "macaddr8" added with a new alias "macaddr", or the current "macaddr" should accept both versions the way the "inet" type does. The function "macaddr_recv" can distinguish them by the StringInfoData.len member, and "macaddr_in" by the number of parts split by ":". "macaddr_out" and "macaddr_send" can output 6 octets if the stored value maps to MAC-48. Storage can always be 8 bytes using the rule above. Otherwise it is hard, from the user's PoV, to decide at the design stage when a column must be defined as macaddr and when as macaddr8. If users need to change a column's type (new hardware appears with an EUI-64 address), they have to keep in mind that everything changes along with the new column type: stored procedure parameters, application code interacting with the DB, etc. I don't agree with Tom's proposal to introduce a new type; it would be inconvenient for users.
We have types "int2", "int4", "int8" and an alias "int" for a type "int4", at least psql does not show it: postgres=# \dT+ int List of data types Schema | Name | Internal name | Size | Elements | Owner | Access privileges | Description +--+---+--+--+---+---+- (0 rows) It is understandable to have 3 types for integers because most space of the DB occupied by them and in the real life we just don't use big numbers, but cases for "inet" and "macaddr" are different. > and a suite of operators to enable interoperability > of 48-bit values in macaddr8 with values in type macaddr. Right? > > (The cast function from macaddr8 to macaddr would raise error if the > 4th and 5th bytes are not either FF:FF or FF:FE -- I don't think we can > in practice distinguish EUI-48 from MAC-48 in this context. The wikipedia says[1] they are the same things but MAC-48 is an obsolete term for a special case, so there is no necessary to distinguish them. > The cast in the other direction would have no restriction and should > probably always use FF:FE). [1] https://en.wikipedia.org/wiki/Organizationally_unique_identifier#48-bit_Media_Access_Control_Identifier_.28MAC-48.29 -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/6/16, Serge Rielau wrote: >> On Oct 6, 2016, at 9:20 AM, Tom Lane wrote: >> Vitaly Burovoy writes: >>> But what I discover for myself is that we have pg_attrdef separately >>> from the pg_attribute. Why? >> >> The core reason for that is that the default expression needs to be >> a separate object from the column for purposes of dependency analysis. >> For example, if you have a column whose default is "foo()", then the >> default expression depends on the function foo(), but the column should >> not: if you drop the function, only the default expression ought to >> be dropped, not the column. >> >> Because of this, the default expression needs to have its own OID >> (to be stored in pg_depend) and it's convenient to store it in a >> separate catalog so that the classoid can identify it as being a >> default expression rather than some other kind of object. > > Good to know. > >> If we were going to allow these missing_values or creation_defaults >> or whatever they're called to be general expressions, then they would >> need >> to have their own OIDs for dependency purposes. That would lead me to >> think that the best representation is to put them in their own rows in >> pg_attrdef, perhaps adding a boolean column to pg_attrdef to distinguish >> regular defaults from these things. Or maybe they even need their own >> catalog, depending on whether you think dependency analysis would want >> to distinguish them from regular defaults using just the classed. >> >> Now, as I just pointed out in another mail, realistically we're probably >> going to restrict the feature to simple constants, which'd mean they will >> depend only on the column's type and can never need any dependencies of >> their own. So we could take the shortcut of just storing them in a new >> column in pg_attribute. I agree with you. >> But maybe that's shortsighted and we'll >> eventually wish we'd done them as full-fledged separate objects. I don't think so. 
If we try to implement non-blocking adding columns with volatile defaults (and for instance update old rows in the background), we can end up with the next situation: CREATE TABLE a(i bigint PRIMARY KEY); INSERT INTO a SELECT generate_series(1,100); ALTER TABLE a ADD COLUMN b bigserial CHECK (b BETWEEN 1 AND 100); For indexes (even unique) created concurrently similar troubles are solved with a "not valid" mark, but what to do with a heap if we try to do it in the background? >> But on the third hand ... once one of these is in place, how could you >> drop it separately from the column? That would amount to a change in the >> column's stored data, which is not what one would expect from dropping >> a separate object. So maybe it's senseless to think that these things >> could ever be distinct objects. But that definitely leads to the >> conclusion that they're constants and nothing else. > I cannot follow this reasoning. > Let’s look past what PG does today: > For each row (whether that’s necessary or not) we evaluate the expression, > compute the value and > store it in the rewritten table. > We do not record dependencies on the “pedigree” of the value. > It happened to originate from the DEFAULT expression provided with the ADD > COLUMN, > but that is not remembered anywhere. > All we remember is the value - in each row. > So the only change that is proposed here - when it comes right down to it - > is to remember the value once only (IFF it is provably the same for each row) > and thus avoid the need to rewrite the table. > So I see no reason to impose any restriction other than “evaluated value is > provably the same for every row”. Tom says the same thing. The expression at the end should be a value if it allows to avoid rewriting table. > Regarding the location of storage. > I did start of using pg_attrdef, but ran into some snags. > My approach was to add the value as an extra column (rather than an extra > row). 
> That caused trouble since a SET DEFAULT operation is decomposed into a DROP
> and a SET and preserving the value across such operations did not come naturally.

I'm sorry for confusing you. The best way is to use an extra column in pg_attribute to store the serialized value.

> If we were to use extra rows instead that issue would be solved, assuming we
> add a "default kind" sort of column.
> It would dictate the storage format though which may be considered overkill
> for a constant.

-- Best regards, Vitaly Burovoy
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/6/16, Vitaly Burovoy wrote:
> Ough. I made a mistake about pg_attribute because I forgot about the
> pg_attrdef.
> If we do not merge these tables, the pg_attrdef is the best place to
> store evaluated expression as a constant the same way defaults are
> stored in adbin.

Oops. While I was writing the previous email, Tom explained the necessity of pg_attrdef. With that explanation it is obvious that I was wrong, and the value for pre-existing rows should go into a new column in pg_attribute. All the other thoughts from my previous email still stand.

-- Best regards, Vitaly Burovoy
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/6/16, Serge Rielau wrote: >> On Oct 6, 2016, at 9:01 AM, Tom Lane wrote: >> BTW, it also occurs to me that there are going to be good implementation >> reasons for restricting it to be a hard constant, not any sort of >> expression. We are likely to need to be able to insert the value in >> low-level code where general expression evaluation is impractical. >> > Yes, the padding must happen primarily in the getAttr() routines. > Clearly we do not want to evaluate an expression there. > But what speaks against evaluating the expression before we store it? > After all we seem to all agree that this only works if the expression > computes to the same constant all the time. > > If we do not want to store an “untyped” datum straight in pg_attribute as a > BYTEA (my current approach) Ough. I made a mistake about pg_attribute because I forgot about the pg_attrdef. If we do not merge these tables, the pg_attrdef is the best place to store evaluated expression as a constant the same way defaults are stored in adbin. > we could store the pretty printed version of the constant It is a wrong way. It ruins commands like "ALTER COLUMN ... TYPE ... USING" > and evaluate that when we build the tuple descriptor. > This happens when we load the relation into the relcache. > > Anyway, I’m jumping ahead and it’s perhaps best to let the code speak for > itself once I have the WIP patch ready so we have something concrete to > discuss -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/6/16, Tom Lane wrote:
> Serge Rielau writes:
>>> On Oct 6, 2016, at 5:25 AM, Vitaly Burovoy wrote:
>>>> Which makes me think we should call this missing_value or absent_value

To be honest, Simon Riggs wrote those words.

>>>> so its clear that it is not a "default" it is the value we use for
>>>> rows that do not have any value stored for them.
>
>> I like Tom's "creation default". Another one could be "initial default".
>> But that, too, can be misread.
>
> Something based on missing_value/absent_value could work for me too.
>
> If we name it something involving "default", that definitely increases
> the possibility for confusion with the regular user-settable default.
>
> Also worth thinking about here is that the regular default expression
> affects what will be put into future inserted rows, whereas this thing
> affects the interpretation of past rows. So it's really quite a different
> animal. That's kind of leading me away from calling it creation_default.
>
> BTW, it also occurs to me that there are going to be good implementation
> reasons for restricting it to be a hard constant, not any sort of
> expression. We are likely to need to be able to insert the value in
> low-level code where general expression evaluation is impractical.

Yes, I mentioned that it should be evaluated and stored as a value because user functions can be changed (besides the speed reason); that's why I like the "value" in its name. "Default" is usually associated with expressions, not values (which are particular cases of expressions). Serge mentioned the phrase "pre-existing rows", which makes me think of something like "pre_existing_value".

-- Best regards, Vitaly Burovoy
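The reason a hard constant suffices is that the substitution happens while deforming a stored tuple: a row written before the ADD COLUMN carries a smaller attribute count, and the reader fills in the remembered value instead of NULL. A toy sketch of that lookup follows; all names here are hypothetical and the "values" are plain ints, nothing like PostgreSQL's actual heap code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy per-column descriptor: a remembered "missing value" for rows
 * written before the column existed (hypothetical, illustrative only).
 */
typedef struct ColDesc
{
    bool has_missing;
    int  missing_value;     /* a hard constant: no expression evaluation */
} ColDesc;

typedef struct ToyTuple
{
    int        natts;       /* how many columns this row actually stores */
    const int *values;
} ToyTuple;

/*
 * Fetch column attno (0-based).  For attributes beyond what the tuple
 * stores, return the remembered constant.  This is why the feature must
 * be restricted to values rather than expressions: this code runs in a
 * context where no expression evaluator is available.
 */
static int
toy_getattr(const ToyTuple *tup, const ColDesc *cols, int attno, bool *isnull)
{
    if (attno < tup->natts)
    {
        *isnull = false;
        return tup->values[attno];
    }
    if (cols[attno].has_missing)
    {
        *isnull = false;
        return cols[attno].missing_value;
    }
    *isnull = true;
    return 0;
}
```

Plain ADD COLUMN without a default already works exactly this way with NULL; the proposal only replaces that implicit NULL with a stored constant.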
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/6/16, Simon Riggs wrote: > On 6 October 2016 at 04:43, Serge Rielau wrote: >>>> Or should I compose some sort of a design document? > > Having read this thread, I'm a little unclear as to what you're > writing now, though there's definitely good ideas here. > > I think it would be beneficial to write up a single coherent > description of this, including behaviour and a small sketch of > implementation, just so everyone knows what this is. No design doc, > but a summary. At the moment I think it can also be a good idea to post the current patch as a Proposal or a WIP to get initial feedback. > It would be very useful to be able to do this... > ALTER TABLE foo ADD last_updated_timestamp timestamp default > current_timestamp > so that it generates a constant value and stores that for all prior > rows, but then generates a new value for future rows. Yes, it works for stable "now()" but does not work for volatile functions like "random()", "uuid_generate_v4()" or default for serial columns. The only possible way I can see is to check an expression has only "T_Const"s, static and stable functions. In such case the expression can be evaluated and the result be saved as a value for absented attributes of a tuple. In the other case save NULL there and rewrite the table. > Which makes me think we should call this missing_value or absent_value > so its clear that it is not a "default" it is the value we use for > rows that do not have any value stored for them. It is definitely a default for a user, it is not a regular default internally. I'm not a native speaker, "absent_value" can be mixed up with a NULL. As for me the best phrase is "pre-add-column-default", but it is impossible to use it as a column name. :-( It is still an open question. (I remember funny versions in a discussion[1] when people tried to choose a name for a function reversed to pg_size_pretty...) 
[1] https://www.postgresql.org/message-id/flat/cafj8prd-tgodknxdygecza4on01_urqprwf-8ldkse-6bdh...@mail.gmail.com -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/5/16, Serge Rielau wrote:
> On Wed, Oct 5, 2016 at 4:19 PM, Vitaly Burovoy wrote:
>> But what I discover for myself is that we have pg_attrdef separately
>> from the pg_attribute. Why?
>> Is it time to join them? For not presented defaults it would be only
>> one bit per row (if we avoid "adsrc" as it is recommended), but for a
>> separate table it is 11 columns with two indexes now...
>
> In terms of footprint we may be able to remove pg_attrdef.
> I would consider that orthogonal to the proposed feature though.

It was a question mostly to the community rather than to you.

> The internal representation of defaults in the tuple descriptor still needs
> to be a map of sorts.
>
> To comment on Pantelis SQL Server Reference:
> Other vendors such as Oracle and DB2 also support this feature.
>
> The listed restriction made me loop back to Vitaly's original serial example:
> ALTER TABLE t ADD COLUMN c2 serial;
> and rethink Tom's struct restriction to constants.

I'm sorry, the correct example with "now" should be:

ALTER TABLE t ADD COLUMN c1 timestamptz NOT NULL DEFAULT 'now'::text::timestamptz;

I wanted to show that with a non-immutable (even stable) expression as the default, selecting old tuples gives a different result each time. Your implementation evaluates it before saving it as a "pre-add-column" default, which breaks the current behavior in a different way.

> In PG the proposed feature would also have to be limited to immutable(?)
> default expressions to comply with existing behavior, which matches SQL
> Servers.
>
> My current patch does not restrict that and thusly falsely "fills in" the
> same value for all rows.

If you change the implementation (just making it "faster"), you should preserve the current behavior. The best approach is to determine whether a "fast" change is possible and fill the "pre-add-column" default parameter, or otherwise fall back to the current "long" behavior.
I assure you it is common enough to add columns with a default value that fills the existing rows with non-static values. One of them is adding a serial column (which is a macro for "CREATE SEQUENCE + SET NOT NULL + SET DEFAULT nextval(just_created_sequence_relation)"); others are "uuid_generate_vX(...)" and "random()". On 10/5/16, Serge Rielau wrote: > I want to point out as a minor "extension" that there is no need for the > default to be immutable. It is merely required that the default is evaluated > at the time of ADD COLUMN and then we remember the actual value for the exist > default, rather than the parsed expression as we do for the "current" > default. I don't think it will be accepted. On 10/5/16, Serge Rielau wrote: > Thanks. > This would be my first contribution. > I take it I would post a patch based on a recent PG 9.6 master for review? Your patch must be based on the current "master" branch. If you haven't seen wiki pages [1], [2] and [3], now is the time to read them (and related ones). > Or should I compose some sort of a design document? Since Tom Lane, Andres Freund and other people agreed "it does work", you may post a patch to a new thread and write a short (but clear enough) description with a link to the current thread. Examples can be seen via links from the CF[4]. Nevertheless, it is important to honor all thoughts mentioned here, because if your patch breaks the current behavior (with no significant reason) it is senseless (even if it matches the behavior of other RDBMSes) and will not be committed. Of course, feel free to ask. [1] https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F [2] https://wiki.postgresql.org/wiki/Developer_FAQ [3] https://wiki.postgresql.org/wiki/Submitting_a_Patch [4] https://commitfest.postgresql.org/11/ -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
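The serial "macro" mentioned above can be sketched as roughly equivalent DDL (the sequence name is illustrative; the server derives it automatically, and it performs the backfill as part of a single table rewrite rather than a separate UPDATE):

```sql
-- Rough expansion of: ALTER TABLE t ADD COLUMN c2 serial;
CREATE SEQUENCE t_c2_seq;
ALTER TABLE t ADD COLUMN c2 integer;
ALTER TABLE t ALTER COLUMN c2 SET DEFAULT nextval('t_c2_seq');
UPDATE t SET c2 = nextval('t_c2_seq');   -- conceptually: fill pre-existing rows
ALTER TABLE t ALTER COLUMN c2 SET NOT NULL;
ALTER SEQUENCE t_c2_seq OWNED BY t.c2;
-- Each pre-existing row must receive a DISTINCT nextval() result,
-- which is why a single precomputed "exist default" cannot cover this case.
```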
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/5/16, Tom Lane wrote: > I wrote: >> Need a better name for the concept, since evidently this name isn't >> conveying the idea. > > Maybe "creation default" would work better? Point being it's the > default value at the time of column creation. Hmm... Personally, for me the original topic name is good enough. But what I discover for myself is that we have pg_attrdef separately from the pg_attribute. Why? Is it time to join them? For absent defaults it would be only one bit per row (if we avoid "adsrc" as it is recommended), but for a separate table it is 11 columns with two indexes now... -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/5/16, Andres Freund wrote: > On 2016-10-05 15:44:56 -0700, Jeff Janes wrote: >> On Wed, Oct 5, 2016 at 3:29 PM, Andres Freund wrote: >> > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; >> > INSERT id = 1; >> > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 1; >> > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; >> > INSERT id = 2; >> > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 2; >> > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; >> > INSERT id = 3; >> > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 3; >> > >> > The result here would be that there's three rows with a default value >> > for foo that's the same as their id. None of them has that column >> > present in the row. >> > >> >> My understanding is that all of those would be materialized. > > But that'd require a table rewrite, as none of the above INSERTs were > done when a default was in place. Since they did not have the default value, those tuples are written with the actual TupleDesc.natts, where att_isnull for the "withdefault" column is set (actually the column has no default for the inserted tuples in your case). > But each has a different "applicable" default value. No, their values are constructed "from scratch", not fetched from the heap, so the "pre-alter-add-column" default is not applicable to them. >> The only >> default that isn't materialized is the one in effect in the same >> statement >> in which that column was added. Since a column can only be added once, >> the >> default in effect at the time the column was added can never change, no >> matter what you do to the default later on. > > DROP DEFAULT pretty much does that, because it allows multiple (set of) > rows with no value (or a NULL) for a specific column, but with differing > applicable default values. DROP DEFAULT is for "post-alter-add-column" tuples; it does not affect "pre-alter-add-column" ones. 
-- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
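A minimal sketch of the distinction discussed above (column names follow Andres's example): a row inserted while no default is in effect stores an explicit NULL that is physically present in the tuple, so later SET DEFAULT cannot retroactively change it; only a column added by ALTER TABLE ... ADD COLUMN can be physically missing from old tuples, and those are the ones an "exist default" would fill:

```sql
CREATE TABLE foo (id int, withdefault int DEFAULT 0);
ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT;
INSERT INTO foo (id) VALUES (1);   -- withdefault stored as NULL, present in the tuple
ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 1;
SELECT * FROM foo;                 -- still (1, NULL): the new default does not apply
```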
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/5/16, Vitaly Burovoy wrote: > On 10/5/16, Andres Freund wrote: >> On 2016-10-05 15:23:05 -0700, Vitaly Burovoy wrote: >>> On 10/5/16, Andres Freund wrote: >>> > On 2016-10-05 11:58:33 -0700, Serge Rielau wrote: >>> >> Dear Hackers, >>> >> I’m working on a patch that expands PG’s ability to add columns to a >>> >> table >>> >> without a table rewrite (i.e. at O(1) cost) from the >>> >> nullable-without-default to a more general case. >>> > >>> > If I understand this proposal correctly, altering a column default >>> > will >>> > still have trigger a rewrite unless there's previous default? >>> >>> No, "a second “exist default"" was mentioned, i.e. it is an additional >>> column in a system table (pg_attribute) as default column values of >>> the "pre-alter" era. It solves changing of the default expression of >>> the same column later. >> >> Don't think that actually solves the issue. The default might be unset >> for a while, for example. Essentially you'd need to be able to associate >> arbitrary number of default values with an arbitrary set of rows. >> >> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; >> INSERT id = 1; >> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 1; >> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; >> INSERT id = 2; >> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 2; >> ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; >> INSERT id = 3; >> ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 3; >> >> The result here would be that there's three rows with a default value >> for foo that's the same as their id. None of them has that column >> present in the row. > > I'm sorry, while I was writting "pre-alter" I meant > "pre-alter-add-column" era (not "pre-alter-set-default"), all later > default changes "current" default, whereas "pre-alter-add-column" adds > value if current column number < TupleDesc.natts. > > All your DDL are in the "post-alter-add-column" era. I'm so sorry, I was in a hurry. 
Of course, - if current column number < TupleDesc.natts. + if current column number > TupleDesc.natts. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/5/16, Andres Freund wrote: > On 2016-10-05 15:23:05 -0700, Vitaly Burovoy wrote: >> On 10/5/16, Andres Freund wrote: >> > On 2016-10-05 11:58:33 -0700, Serge Rielau wrote: >> >> Dear Hackers, >> >> I’m working on a patch that expands PG’s ability to add columns to a >> >> table >> >> without a table rewrite (i.e. at O(1) cost) from the >> >> nullable-without-default to a more general case. >> > >> > If I understand this proposal correctly, altering a column default will >> > still have trigger a rewrite unless there's previous default? >> >> No, "a second “exist default"" was mentioned, i.e. it is an additional >> column in a system table (pg_attribute) as default column values of >> the "pre-alter" era. It solves changing of the default expression of >> the same column later. > > Don't think that actually solves the issue. The default might be unset > for a while, for example. Essentially you'd need to be able to associate > arbitrary number of default values with an arbitrary set of rows. > > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; > INSERT id = 1; > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 1; > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; > INSERT id = 2; > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 2; > ALTER TABLE foo ALTER COLUMN withdefault DROP DEFAULT; > INSERT id = 3; > ALTER TABLE foo ALTER COLUMN withdefault SET DEFAULT 3; > > The result here would be that there's three rows with a default value > for foo that's the same as their id. None of them has that column > present in the row. I'm sorry, while I was writing "pre-alter" I meant the "pre-alter-add-column" era (not "pre-alter-set-default"); all later default changes affect the "current" default, whereas "pre-alter-add-column" adds a value if the current column number < TupleDesc.natts. All your DDL statements are in the "post-alter-add-column" era. 
-- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/5/16, Andres Freund wrote: > On 2016-10-05 11:58:33 -0700, Serge Rielau wrote: >> Dear Hackers, >> I’m working on a patch that expands PG’s ability to add columns to a table >> without a table rewrite (i.e. at O(1) cost) from the >> nullable-without-default to a more general case. > > If I understand this proposal correctly, altering a column default will > still have trigger a rewrite unless there's previous default? No, "a second “exist default"" was mentioned, i.e. it is an additional column in a system table (pg_attribute) as default column values of the "pre-alter" era. It solves changing of the default expression of the same column later. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fast AT ADD COLUMN with DEFAULTs
On 10/5/16, Serge Rielau wrote: > Dear Hackers, > > I’m working on a patch that expands PG’s ability to add columns to a table > without a table rewrite (i.e. at O(1) cost) from the > nullable-without-default to a more general case. E.g. > > CREATE TABLE T(pk INT NOT NULL PRIMARY KEY); > INSERT INTO T VALUES (1), (2), (3); > ALTER TABLE T ADD COLUMN c1 INTEGER NOT NULL DEFAULT 5; > INSERT INTO T VALUES (4, DEFAULT); > ALTER TABLE T ALTER COLUMN c1 SET DEFAULT 6; > INSERT INTO T VALUES (5, DEFAULT); > SELECT * FROM T ORDER BY pk; > => > (1, 5), > (2, 5), > (3, 5), > (4, 5), > (5, 6); > > Rows 1-3 have never been updated, yet they know that their value of c1 is > 5. > > The requirement is driven by large tables for which add column takes too > much time and/or produces too large a transaction for comfort. > > In simplified terms: > > * a second “exist default” is computed and stored in > the catalogs at time of AT ADD COLUMN > > * The exist default is cached in the tuple descriptor (e.g in attrdef) > > * When one of the getAttr or copytuple related routines is invoked > the exist default is filled in instead of simply NULL padding if the > tuple is shorter than the requested attribute number. > > Is there an interest in principle in the community for this functionality? Wow! I think it would be great! It also avoids huge vacuuming after rewriting the table(s). Just pay attention to corner cases like indexes, statistics and speed. But I'd like to see a solution for more important cases like: CREATE TABLE t (pk INT NOT NULL PRIMARY KEY); INSERT INTO t VALUES (1), (2), (3); ALTER TABLE t ADD COLUMN c1 timestamptz NOT NULL DEFAULT 'now'; SELECT * FROM t ORDER BY pk; ALTER TABLE t ADD COLUMN c2 serial; SELECT * FROM t ORDER BY pk; INSERT INTO t(pk) VALUES (4); SELECT * FROM t ORDER BY pk; P.S.: I really think it is a good idea; it just needs some research and coverage of the corner cases... 
-- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
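The corner cases asked for above come down to volatility: a single precomputed "exist default" freezes one value for all pre-existing rows, which matches today's rewrite behavior only for immutable defaults. A sketch of the difference (today's behavior rewrites the table, evaluating the default per row):

```sql
CREATE TABLE t (pk int NOT NULL PRIMARY KEY);
INSERT INTO t VALUES (1), (2), (3);
-- 'now' is coerced to a constant timestamp when the ALTER runs,
-- so one shared value for the old rows is arguably fine here:
ALTER TABLE t ADD COLUMN c1 timestamptz NOT NULL DEFAULT 'now';
-- random() is volatile: a per-row rewrite gives three distinct values,
-- while a single stored "exist default" would give one repeated value:
ALTER TABLE t ADD COLUMN c2 float8 NOT NULL DEFAULT random();
SELECT count(DISTINCT c2) FROM t;  -- 3 with the per-row rewrite
```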
[HACKERS] Small doc fix
Hello, hackers, I've just noticed an extra word in a sentence in the docs in the "parallel.sgml". It seems the sentence was constructed one way and changed later with the extra word left. Please, find the fix attached. -- Best regards, Vitaly Burovoy pg-docs-fix.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: ON UPDATE REMOVE foreign key action
On 10/4/16, Kirill Berezin wrote: > Disclaimer: sorry, i dont understand, should i reply to each of you > personally, or just answer to channel. Some feedbacks were sended in > personal, and some include channel copy. Usually discussions happen on the list, therefore you should use "reply to all" (see [1]). An exception is when a sender notes "Off the list". > Thanks for responses, you understand it correctly. > > When i said "anybody", i mean inclusive owner himself. For example cookie > poisoning. > There is no "another" session, technically. They similar to the server, > they even can have same IP. > Yes, we can't prevent it with CSRF cookies, but it is not the point of > current conversation. > > I can make business logic outside table: make extra query. Good decision. Your case needs exactly what you've just written. > Im just dont like how it looks from perspective of encapsulation. > Each table should describe itself, like object in OOP language. > With SQL constructions or triggers/constraits. SQL is not OOP. There is no "encapsulation". > Second part of my use case is data cache. Hmm. Using an RDBMS as a cache, with its overhead for Isolation and Durability (from ACID)? Really? To me it is a bad idea in most cases. > When user update > version(generation), cache should be flushed. As easy example: imagine i am > fetching currency value. And till end of the day, i am honor current > course. (Or any other information, that has certain origin checkpoints). > When i am updating origin state (current day, server version, ip address, > neural network generation), i am have to invalidate all previous data. It is a bad example. Companies working with currency exchange rates always keep their values as historical data. > Like i am calculating digits of the square root, of some number. The more i > spend time, the closer my approx value to irrational result. But when > original value has changed - all previous data does not make sense. 
I am > flushing it and starting from digit 1. Why do you "update" the original value instead of deleting the old one and inserting a new one? > This is allegorical examples to my real-world cases. I may try imagine some > hypothetical situations, when this functionality more welcomed. But, i am > respect reasons why do not apply this proposal. If my update didn't shift > the balance, its ok. on update trigger is not such painful. All your cases (except the exchange rate one) can be done using two queries: delete the original row (which deletes other linked data via "ON DELETE CASCADE") and insert a new one. You don't even have to use transactions! If your business logic is so "OOP", you can use stored procedures, but introducing new grammar specially for a concrete task is a bad idea. Of course, at first sight there is a meaningless sequence "ON UPDATE SET (NULL|DEFAULT)", but the meaning of SET NULL and SET DEFAULT for both ON UPDATE and ON DELETE is to use them for "unlinking" data from the referenced row. It is similar to "NO ACTION" but explicitly marks rows as no longer connected to the referenced row (by the referencing column list). Also, your proposal is not consistent: ON UPDATE REMOVE (DELETE?), but ON DELETE - what? Again remove/delete? [1] https://wiki.postgresql.org/wiki/Mailing_Lists#Using_the_discussion_lists -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
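The two-query replacement suggested above can be sketched with a hypothetical schema (table and column names are illustrative, not from the original poster's application):

```sql
CREATE TABLE account (id uuid PRIMARY KEY);
CREATE TABLE access_link (
    token      uuid PRIMARY KEY,
    account_id uuid NOT NULL REFERENCES account(id) ON DELETE CASCADE
);
-- Instead of "UPDATE account SET id = <new uuid>" relying on the proposed
-- "ON UPDATE ... DELETE" action, delete and re-insert:
DELETE FROM account WHERE id = $1;     -- cascades, removing all access links
INSERT INTO account (id) VALUES ($2);  -- the freshly generated uuid
```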
Re: [HACKERS] Proposal: ON UPDATE REMOVE foreign key action
On 10/3/16, Kirill Berezin wrote: > *One-line Summary:* On foreign key update we unable to remove all depended > records. Currently we have "ON REMOVE CASCADE DELETE", but no "ON UPDATE > CASCADE DELETE". We can only update field to NULL or DEFAULT. I think there are three reasons why it is not implemented. The first one is that there is no such grammar in the SQL spec (your version is also wrong: the SQL spec has "ON UPDATE CASCADE" as well as "ON DELETE CASCADE" [or any other action instead of "CASCADE"]). The second one is that in almost all cases there is no reason to delete rows because of updating the referenced row. If these rows are still connected, they should be updated; if not --- left as is ("NO ACTION") or with the reference link cleared ("SET NULL" or "SET DEFAULT"). These rows have data; that's why they are still in the tables. They can be deleted (by reference) if and only if the "parent" or "linked" data (all data, not just the referenced key) is deleted. > *Business Use-case:* Cache expiration on hash/version update. Revoke all > access on account id update. > In my case i met this situation: I am using access links to share user > account. Account owner can give private link to somebody, and its session > become mirrored. (Owner access to account granted). And the third reason is avoiding bad design. If you have to give access to anyone and you know the access will be revoked sooner or later, it is wise to give a private link with a different identifier which can be easily found and removed by a grantor id (your id). > You cant imagine facebook desktop and mobile sessions. Which, of course, have different session ids. You can revoke a session without renaming your own. > It's just shortcut for > entering credentials. Now i am implementing "revoke all but me". Its done > simple, since each user is uuid indexed, i am just generate new uuid for > current account. Old uuid become invalid to other sessions - since no > record is found in base. 
> I want to remove any pending access links, prevent bad guy restore access. > I can possibly set linked account to NULL, Why not just delete them when the grantor revokes access? > and then clear record on > expiration, but i feel that automatically removing on update event is more > rational. I personally don't see the necessity to introduce new non-spec grammar. If you think I have not understood you, send an example with a schema --- what you have now and how you expect it should be. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
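The "different identifier, removable by grantor id" design suggested above amounts to something like this (hypothetical schema; names are illustrative):

```sql
CREATE TABLE access_link (
    token      uuid PRIMARY KEY,  -- the private link handed out
    grantor_id uuid NOT NULL,     -- who issued it
    account_id uuid NOT NULL      -- which account it mirrors
);
-- Revoking everything a grantor handed out is then a plain DELETE;
-- the account's own identifier never has to change:
DELETE FROM access_link WHERE grantor_id = $1;
```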
Re: [HACKERS] pg_hba_file_settings view patch
On 10/2/16, Michael Paquier wrote: > On Mon, Oct 3, 2016 at 3:25 PM, Vitaly Burovoy > wrote: >> I guess for the ability to use filtering like: >> >> SELECT * FROM pg_hba_rules WHERE options->>'radiusserver' LIKE >> '%.example.com'; >> >> I think it would be harder if options were an array of strings... > > With unnest() and a matching pattern, not that hard but.. I'm not a user of that feature, and I don't know what pg_hba files look like in really big companies... But for me filtering is more complicated than just a single comparison. What about more complex filtering --- several radius servers and user(s): WHERE options->>'radiusserver' = ANY(ARRAY['a.example.com', 'g.example.com']) AND options->>'radiusidentifier' = ANY(ARRAY['ID_a', 'ID_b', 'ID_c', 'ID_d', 'ID_e']) -- or even a subquery Again, I don't know whether it will be widely used, but in case of multiple param_name->param_value settings (the "options" column) I'd like to see a native key-value store rather than an array of strings (according to POLA). I guess you're expecting the "key=value" format as they are written in the pg_hba file (and described in the doc), but sometimes they can be parsed and the output differs from the exact pg_hba records (for instance, look at the "ldapurl" parameter). -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_hba_file_settings view patch
On 10/2/16, Michael Paquier wrote: > + push_jsonb_string_key(&parseState, "map"); > + push_jsonb_string_value(&parseState, hba->usermap); > [...] > + > + options > + jsonb > + Configuration options set for authentication method > + > Why is it an advantage to use jsonb here instead of a simple array > made of name=value? If they were nested I'd see a case for it but it > seems to me that as presented this is just an overkill. I guess for the ability to use filtering like: SELECT * FROM pg_hba_rules WHERE options->>'radiusserver' LIKE '%.example.com'; I think it would be harder if options were an array of strings... -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
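To illustrate the point (the pg_hba_rules view and its "options" column are from the patch under discussion; everything else here is an assumed sketch), the same filter written against jsonb options and against a hypothetical text[] of "key=value" strings:

```sql
-- jsonb: a single key lookup
SELECT * FROM pg_hba_rules
WHERE options->>'radiusserver' LIKE '%.example.com';

-- text[] of "key=value" strings: the same filter needs unnest()
SELECT r.* FROM pg_hba_rules r
WHERE EXISTS (
    SELECT 1 FROM unnest(r.options) AS o
    WHERE o LIKE 'radiusserver=%.example.com'
);
```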
Re: [HACKERS] Question / requests.
On 9/30/16, Francisco Olarte wrote: > Hello everyone. Hello, Francisco! > Also, although I feel confident in my coding I have zero knowledge of > developing for postgres, It is easy enough, and all the important steps are documented in the wiki. Also some interesting things can be found in hackers' presentations about how to hack PostgreSQL. > and I am not confident in my git or testing habilities. > I can develop it, as it is just modifying a single libpq > client program and only in the easy part of the string lists and may > be emitting a new error (as this can introduce a new failure mode of > 'no tables to vacuum'), I can test it locally and produce a patch for > that file, but I'm not confident on integrating it, making git patchs > or going further, so I would like to know if doing that would be > enough and then I can give the code to someone to review or integrate > it. Do your best and send a patch. No one is good enough at understanding the whole code base at once. There are lots of people who know different parts of the code and who have the ability to test patches in different ways. You can be sure you will get feedback; your code will not be merged into the code base without a deep review and independent testing. Just be ready to improve your patch according to the feedback, and be ready that it usually takes several rounds of review before a patch is committed. Also you can follow a discussion of one of the simple patches in a commitfest to become familiar with the process. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16, Vitaly Burovoy wrote: > On 9/27/16, Tom Lane wrote: >> (The other thing I'd want here is a --target-version option so that >> you could get the same output alterations in pg_dump or pg_restore to >> text. Otherwise it's nigh undebuggable, and certainly much harder >> to test than it needs to be.) > > I thought the same way. I'm ready to introduce that parameter, but again, > I see now it will influence only SET parameters. Is it worth it? The only reason I have not implemented it was to avoid confusing users who might think that the output of pg_dump (we need it there for the plain text output) or pg_restore can be converted for a target version and restored without the new features (which is wrong). -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16, Tom Lane wrote: > Vitaly Burovoy writes: >> On 9/27/16, Tom Lane wrote: >>> I'm not exactly convinced that you did. There's only one copy of >>> Archive->remoteVersion, and you're overwriting it long before the >>> dump process is over. > >> It does not seem that I'm "overwriting it long before the dump process >> is over"... > > There's a lot that happens during RestoreArchive. Even if none of it > inspects remoteVersion today, I do not think that's a safe assumption to > make going forward. And... Khm... Note that even _now_ AHX->remoteVersion is set to the version of the database pg_restore connects to... So all the code has it during the restore process... > The easiest counterexample is that this very bit of > code you want to add does so. The only change I've made is to set remoteVersion to the maximum allowed when the output is the plain text format. > I really do not want to get into a design > that says "remoteVersion means the source server version until we reach > RestoreArchive, and the target version afterwards". That way madness lies. That is only so if you think of "remoteVersion" as "sourceServerVersion", but even now it is not so. Moreover, RestoreArchive is a delimiter only for pg_dump and only when the output is plain text. For the other modes of pg_dump, RestoreArchive is not called at all. > If we're going to try altering the emitted SQL based on target version, > let's first create a separation between those concepts; I've just found that there is _archiveHandle.archiveRemoteVersion. Is it the parameter you were searching for? The pg_restore code does not use it (just as it does not use remoteVersion), but it could if necessary. > otherwise I will bet that we add more bugs than we remove. > > (The other thing I'd want here is a --target-version option so that > you could get the same output alterations in pg_dump or pg_restore to > text. Otherwise it's nigh undebuggable, and certainly much harder > to test than it needs to be.) I thought the same way. 
I'm ready to introduce that parameter, but again, I see now it will influence only SET parameters. Is it worth it? -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16, Tom Lane wrote: > Vitaly Burovoy writes: >> On 9/27/16, Tom Lane wrote: >>> The general policy has always been that pg_dump output is only expected >>> to >>> restore without errors into a server that's the same or newer version as >>> pg_dump (regardless of the server version being dumped from). > >> Why can't I use it if pg_dump92/pg_restore92 have the same behavior as >> pg_dump96/pg_restore96 except the SET block? > > That's a pretty large "if", and not one I'm prepared to commit to. > Before you assert that it's true, you might want to reflect on the > size of the diff between 9.2 and 9.6 pg_dump (hint: it's over 20K > lines in -u format). > >>> Either that, or the patch is overwriting pg_dump's idea of what the >>> source server version is at the start of the output phase, which is >>> likely to break all kinds of stuff when dumping from an older server.) > >> I agree, that's why I left current behavior as is for the plain text >> output. > > I'm not exactly convinced that you did. There's only one copy of > Archive->remoteVersion, and you're overwriting it long before the > dump process is over. I'm sorry, I'm not so familiar with the code base, but I see that my patch changes Archive->remoteVersion in "RestoreArchive", which is called after the whole schema is fetched into pg_dump's structs and just before the output is written; moreover, there is a comment that it is the final stage (9.2 has the same block of code): ... /* * And finally we can do the actual output. *... */ if (plainText) RestoreArchive(fout); CloseArchive(fout); exit_nicely(0); } It does not seem that I'm "overwriting it long before the dump process is over"... Also pg_dump -v proves that the change of remoteVersion happens after all "pg_dump: finding *", "pg_dump: reading *" and "pg_dump: saving *". > That'd break anything that consulted it to > find the source server's version after RestoreArchive starts. 
I'm sorry again, could you or anyone else point me to the part of the code that uses Archive->remoteVersion after the schema is read? I set a breakpoint at the line AHX->remoteVersion = 99; and ran pg_dump with plain text output to a file (via the "-f" option). When the line was reached I set breakpoints at all lines I could find with a script: egrep -n -r --include '*.c' 'remoteVersion [><]' src/bin/pg_dump/ | awk -F: '{print "b "$1":"$2}' (there were 217 of them) and continued debugging. All three breakpoints that fired were, as expected, in _doSetFixedOutputState (at the lines where my patch introduced them), after which the program exited normally without stopping. Also I doubt that the "restoring" process consults Archive->remoteVersion, because currently when pg_restore outputs to plain text, remoteVersion is zero; that would mean the restore process outputs the schema for the oldest supported version, but that is not so. > I could get behind a patch that split remoteVersion into sourceVersion > and targetVersion and made sure that all the existing references to > remoteVersion were updated to the correct one of those. After that > we could do something like what you have here in a less shaky fashion. > As Robert noted, there'd probably still be a long way to go before it'd > really work to use a newer pg_dump with an older target version, but at > least we'd not be building on quicksand. That means support for restoring from a newer version to an older one. What for? If new features are used, it is impossible to restore (port) them to an older version; if they are not, the restore process succeeds (I guess) in 99% of cases. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detect supported SET parameters when pg_restore is run
On 9/27/16, Tom Lane wrote: > Robert Haas writes: >> On Mon, Sep 26, 2016 at 9:56 PM, Vitaly Burovoy >> wrote: >>> We do dump/restore schemas/data via custom/dir formats and we have to >>> keep several client versions for 9.2, 9.4 and 9.5 versions on local >>> workstations because after pg_restore95 connects to 9.2, it fails when >>> it sets run-time parameters via SET: > >> I think our policy is that a newer pg_dump needs to work with an older >> version of the database, but I don't think we make any similar >> guarantee for other tools, such as pg_restore. It's an interesting >> question whether we should try a little harder to do that, but I >> suspect it might require more changes than what you've got in this >> patch Well... I'm not inclined to add support for restoring from a higher major version to a lower one, because it can lead to security issues (e.g. with RLS). But my observation is that all new features supported by pg_dump are "incremental", e.g. the latest "parallel" feature for pg_dump/pg_restore --- "PARALLEL UNSAFE" (which is the default) is omitted in 9.6 dumps just as it is absent from pre-9.6 ones. That behavior allows newer versions of pg_restore to use dumps from databases of older versions, because the grammar of the new features is simply absent. With the patch I'm able to use pg_dump96/pg_restore96 for our 9.2 database (dump from 9.2 and restore to 9.2). > The general policy has always been that pg_dump output is only expected to > restore without errors into a server that's the same or newer version as > pg_dump (regardless of the server version being dumped from). If you try > to restore into an older server version, you may get some errors, which we > try to minimize the scope of when possible. But it will never be possible > to promise none at all. I think the correct advice here is simply "don't > use pg_restore -e -1 when trying to restore into an older server version". 
Why can't I use it if pg_dump92/pg_restore92 have the same behavior as pg_dump96/pg_restore96 except for the SET block? The patch does not guarantee a successful restoration; it just avoids setting unsupported parameters in pg_restore, the same way pg_dump does. The other issues are left to the user who wants to restore to a DB of an older version.

> Taking a step back, there are any number of ways that you might get
> errors during a pg_restore into a DB that's not set up exactly as pg_dump
> expected. Missing roles or tablespaces, for example.

That does not depend on the pg_dump/pg_restore version and can be solved by using command line options: --no-tablespaces --no-owner --no-privileges

> Again, the dump output is constructed so that you can survive those problems
> and bull ahead ... but not with "-e -1".

I think "-e -1" was invented specifically for that --- stop restoring if something goes wrong. Wasn't it?

> I don't see a very good reason why
> older-server-version shouldn't be considered the same type of problem.
>
> The patch as given seems rather broken anyway --- won't it change text
> output from pg_dump as well as on-line output from pg_restore?

My opinion --- no (and I wrote it in the initial letter): because it is impossible to know what version of a database will consume that plain text output. Users who output to plain text are able to use sed (or something else) to delete unwanted rows.

> (That is,
> it looks to me like the SETs emitted by pg_dump to text format would
> depend on the source server version, which they absolutely should not.
> Either that, or the patch is overwriting pg_dump's idea of what the
> source server version is at the start of the output phase, which is
> likely to break all kinds of stuff when dumping from an older server.)

I agree, that's why I left the current behavior as is for plain text output.
> It's possible that we could alleviate this specific symptom by arranging
> for the BEGIN caused by "-1" to come out after the initial SETs, but
> I'm not sure how complicated that would be or whether it's worth the
> trouble.

That would turn ERRORs into WARNINGs, but the current behavior discussed above would stay the same. Why not just avoid sending SET parameters when we know for sure they are not supported by the server pg_restore is connected to?

-- Best regards, Vitaly Burovoy
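The gating proposed in this thread can be modeled roughly as follows. This is an illustrative Python sketch only --- the real change lives in C, in _doSetFixedOutputState --- and the parameter/version table here is my assumption (versions in PQserverVersion's integer format):

```python
# Toy model of version-gated SET emission for pg_restore.
# The GUC names and the server versions that introduced them are
# assumptions for this sketch, not the patch's actual table.
MIN_SERVER_VERSION = {
    "statement_timeout": 70300,
    "lock_timeout": 90300,                          # added in 9.3
    "idle_in_transaction_session_timeout": 90600,   # added in 9.6
}

def set_block(server_version):
    """Emit only the SET statements the target server understands.

    server_version == 0 models plain-text output (no live connection),
    where the current behavior is kept: emit everything."""
    return [f"SET {guc} = 0;"
            for guc, min_ver in MIN_SERVER_VERSION.items()
            if server_version == 0 or server_version >= min_ver]
```

Restoring to a 9.2 server (90200) would then skip "SET lock_timeout", which is exactly the statement that fails today.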
[HACKERS] Detect supported SET parameters when pg_restore is run
Hackers, At work we use several major versions of PostgreSQL, and developers use non-local clusters for developing and debugging. We do dump/restore schemas/data via custom/dir formats, and we have to keep several client versions for 9.2, 9.4 and 9.5 on local workstations, because after pg_restore95 connects to 9.2, it fails when it sets run-time parameters via SET:

vitaly@work 01:21:26 ~ $ pg_restore95 --host DEV_HOST_9_2 -d DBNAME --data-only -e -1 -Fc arhcivefile
Password:
pg_restore95: [archiver (db)] Error while INITIALIZING:
pg_restore95: [archiver (db)] could not execute query: ERROR: unrecognized configuration parameter "lock_timeout"
Command was: SET lock_timeout = 0;

Of course, it can be fixed by avoiding "--single-transaction", but if there is an inconsistent schema (or stricter constraints), part of the schema/data is already changed/inserted and a lot of errors are generated on the next pg_restore run. pg_dump has checks in the "setup_connection" function to detect what to send after the connection is established for dumping, but there are no such checks in _doSetFixedOutputState for restoring. With such checks it would be possible to use a single version pg_dump96/pg_restore96 to dump/restore, for example, 9.2->9.2 as well as 9.4->9.4 and so on. The only trouble we have is in the "SET" block, and after some research I discovered it is possible not to send unsupported SET options to the database. Please find the attached simple patch.

For restoring to stdout (or dumping to a plain SQL file) I left the current behavior: all options in the SET block are written. Also I left "SET row_security = on;" emitted when "enable_row_security" is set, so that restoring to a DB of a non-supporting version breaks.

-- Best regards, Vitaly Burovoy

detect_supported_set_parameters_for_pgrestore.001.patch Description: Binary data
Re: [HACKERS] identity columns
On 9/12/16, Peter Eisentraut wrote:
> Thank you for this extensive testing. I will work on getting the bugs
> fixed. Just a couple of comments on some of your points:
>
> On 9/9/16 11:45 PM, Vitaly Burovoy wrote:
>> It compiles and passes "make check" tests, but fails with "make
>> check-world" at:
>> test foreign_data ... FAILED
>
> I do not see that. Can you show the diffs?

I can't reproduce it; it is my fault, maybe I did not clean the build dir.

>> 1. The standard requires "... ALTER COLUMN ... SET GENERATED { ALWAYS
>> | BY DEFAULT }" (9075-2:2011 subcl 11.20), but the patch implements it
>> as "... ALTER COLUMN ... ADD GENERATED { ALWAYS | BY DEFAULT } AS
>> IDENTITY"
>
> SET and ADD are two different things. The SET command just changes the
> parameters of the underlying sequence.

Well... As for me, ADD is used when you can add more than one property of the same kind to a relation (e.g. a column or a constraint), but SET is used when you change something and it replaces the previous state (e.g. NOT NULL, DEFAULT, STORAGE, SCHEMA, TABLESPACE etc.). You can't ADD more than one IDENTITY to a column, so it should be "SET".

> This can be implemented later and doesn't seem so important now.

Hmm. Right now you're passing params to CreateSeqStmt because they are the same. Is it hard to pass them to AlterSeqStmt (if there is no "SET GENERATED")?

> The ADD command is not in the standard, but I needed it for pg_dump, mainly.
> I will need to document this.

First, why introduce new grammar that differs from the standard instead of following the standard? Second, I see no trouble in following the standard:

src/bin/pg_dump/pg_dump.c:
- "ALTER COLUMN %s ADD GENERATED ",
+ "ALTER COLUMN %s SET GENERATED ",

src/backend/parser/gram.y:
- /* ALTER TABLE ALTER [COLUMN] ADD GENERATED ... AS IDENTITY ... */
- | ALTER opt_column ColId ADD_P GENERATED generated_when AS IDENTITY_P OptParenthesizedSeqOptList
- c->options = $9;
+ /* ALTER TABLE ALTER [COLUMN] SET GENERATED ...
*/
+ | ALTER opt_column ColId SET GENERATED generated_when OptSeqOptList
- c->options = $7;

I guess "ALTER opt_column ColId SET OptSeqOptList" is easy to implement; after some research, "ALTER opt_column ColId RESTART [WITH ...]" can also be added (and reflected in the docs).

>> 14. It would be fine if psql has support of new clauses.
>
> What do you mean by that? Tab completion?

Yes, I mean tab completion. Or is tab completion usually developed later?

>> 16. I think it is a good idea to not raise exceptions for "SET
>> GENERATED/DROP IDENTITY" if a column has the same type of identity/not
>> an identity. To be consistent with "SET/DROP NOT NULL".
>
> These behaviors are per SQL standard.

Can you point to the concrete rule(s) in the standard? I could not find it in ISO/IEC 9075-2 subclauses 11.20 "" and 11.21 "". Only subclause 4.15.11 "Identity columns" says "The columns of a base table BT can optionally include not more than one identity column." (which you don't follow). For instance, subclause 11.42 , General Rules p.1 speaks explicitly about an exception. Or (for columns): 11.4 , General Rules p.3: "The in the SHALL NOT be equivalent to the of any other column of T."

> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Several additional thoughts:

1. I think it is wise to add the ability to set the name of a sequence (as PG's extension of the standard) in SET GENERATED, or in GENERATED in a relation definition (something like CONSTRAINT names); without it, it is hard to fix conflicts with other sequences (e.g. from the serial pseudo-type) and manual changes to the sequence ("alter seq rename", "alter seq set schema" etc.).

2. Is it useful to rename the sequence linked with an identity constraint when the table is renamed (the same way the sequence moves to another schema following the linked table)?

3. You're setting OWNER on the sequence, but what about the USAGE privilege for roles that have INSERT/UPDATE privileges on the table?
For security reasons the table is owned by a role different from the roles which use the table (to prevent changing its definition).

-- Best regards, Vitaly Burovoy
Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...
-+--+-+-+---
 db1 | fred | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 db2 | bob  | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(2 rows)

postgres=# set role bob;
SET
postgres=> CREATE DATABASE ss TEMPLATE db   -- shows both db1 db2
postgres=> CREATE DATABASE ss TEMPLATE db2;
ERROR: permission denied to create database
postgres=>

So a check for the CREATEDB privilege should be done when deciding whether to offer completions for CREATE DATABASE or not. But if a user has the privileges, Tom's version works fine.

-- Best regards, Vitaly Burovoy
Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...
On 9/11/16, Tom Lane wrote:
> Vitaly Burovoy writes:
>> On 9/11/16, Kevin Grittner wrote:
>>> I was able to find cases during test which were not handled
>>> correctly with either version, so I tweaked the query a little.
>
>> Hmm. Which one? Attempt to "SET ROLE "?
>> Unfortunately, I after reading your letter I realized that I missed a
>> case (it is not working even with your version):
>
> I wasn't aware that this patch was doing anything nontrivial ...
>
> After looking at it I think it's basically uninformed about how to test
> for ownership. An explicit join against pg_roles is almost never the
> right way for SQL queries to do that. Lose the join and write it more
> like this:
>
> +"SELECT pg_catalog.quote_ident(d.datname) "\
> +" FROM pg_catalog.pg_database d "\
> +" WHERE substring(pg_catalog.quote_ident(d.datname),1,%d)='%s' "\
> +" AND (d.datistemplate OR pg_catalog.pg_has_role(d.datdba, 'USAGE'))"
>
> See the information_schema views for precedent.
>
> regards, tom lane

Wow! I had not paid enough attention to the description of "pg_has_role". Your version works for all my tests. Thank you.

-- Best regards, Vitaly Burovoy
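For what it's worth, the membership check that pg_catalog.pg_has_role(d.datdba, 'USAGE') performs is transitive, which is why no hand-written recursive query is needed in tab-complete.c. A toy Python model of that resolution (the names and data structure are illustrative only, not the server's implementation):

```python
# Toy model of transitive role membership, as pg_has_role resolves it.
# grants maps a role to the set of roles granted to it (GRANT g TO r).
def has_role_usage(role, target, grants):
    seen, stack = set(), [role]
    while stack:
        r = stack.pop()
        if r == target:
            return True
        if r in seen:
            continue
        seen.add(r)
        stack.extend(grants.get(r, ()))
    return False

# The scenario from this thread: GRANT g2 TO g1; GRANT g1 TO u1;
# ALTER DATABASE db_tpl OWNER TO g2;
grants = {"u1": {"g1"}, "g1": {"g2"}}
```

So u1 reaches the owner g2 through g1, and db_tpl is offered as a template for u1.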
Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...
On 9/11/16, Kevin Grittner wrote:
> On Sat, Sep 10, 2016 at 12:26 AM, Vitaly Burovoy
>> Mark it as "Ready for committer".
>>
>> P.S.: While I was reviewing I simplified SQL query: improved version
>> only 2 seqscans instead of 3 seqscans with an inner loop in an
>> original one.
>> Please find a file "tab-complete-create-database-improved.patch"
>> attached.
>
> I was able to find cases during test which were not handled
> correctly with either version, so I tweaked the query a little.

Hmm. Which one? An attempt to "SET ROLE "? Unfortunately, after reading your letter I realized that I missed a case (it is not working even with your version):

postgres=# CREATE GROUP g1;
CREATE ROLE
postgres=# CREATE GROUP g2;
CREATE ROLE
postgres=# GRANT g2 TO g1;
GRANT ROLE
postgres=# CREATE USER u1 CREATEDB;
CREATE ROLE
postgres=# GRANT g1 TO u1;
GRANT ROLE
postgres=# CREATE DATABASE db_tpl;
CREATE DATABASE
postgres=# ALTER DATABASE db_tpl OWNER TO g2;
ALTER DATABASE
postgres=# SET ROLE g1;
SET
postgres=> CREATE DATABASE db1 TEMPLATE db   -- shows nothing
postgres=> CREATE DATABASE db1 TEMPLATE db^C -- shows nothing
postgres=> RESET ROLE;
RESET
postgres=# SET ROLE u1;
SET
postgres=> CREATE DATABASE db1 TEMPLATE db   -- shows nothing
postgres=> CREATE DATABASE db1 TEMPLATE db^C -- shows nothing
postgres=> CREATE DATABASE db1 TEMPLATE db_tpl;
CREATE DATABASE

It seems that if a user has the CREATEDB privilege and is a member of a group (role) which owns a database, he can use the database as a template. But to support that, recursive queries would have to be used. Is it useless, or should it be fixed?

> Also, for future reference, please try to use white-space that
> matches surrounding code -- it make scanning through code less
> "jarring".

I tried to. Should "FROM" be on the same line as the sub-"SELECT"? Or should there be something else (I'm only asking about white-space formatting)? The surrounding code looks similar enough to me... =(

> Thanks all for the patch and the reviews!

Thank you!
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

-- Best regards, Vitaly Burovoy
Re: [HACKERS] sequence data type
On 9/10/16, Jim Nasby wrote:
> On 9/3/16 6:01 PM, Gavin Flower wrote:
>> I am curious as to the use cases for other possibilities.
>
> A hex or base64 type might be interesting, should those ever get created...
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX

Hex and base64 are not data types; they are just different textual representations of byte sequences. Even for bigints these representations are produced only after the numbers are written out as byte sequences.

-- Best regards, Vitaly Burovoy
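The point can be made concrete with a short sketch (Python here purely for illustration): both encodings are derived from the same byte sequence, which itself is derived from the number.

```python
import base64

# One number, one byte sequence, two textual representations of it.
n = 0xDEADBEEF
raw = n.to_bytes(4, "big")               # the number written as bytes first
as_hex = raw.hex()                       # hex rendering of those bytes
as_b64 = base64.b64encode(raw).decode()  # base64 rendering of the same bytes

# Decoding either representation recovers the identical bytes (and number).
assert base64.b64decode(as_b64) == bytes.fromhex(as_hex) == raw
```

Neither rendering carries any type information of its own, which is why "hex" or "base64" would be output formats rather than new data types.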
Re: [HACKERS][REVIEW] Tab Completion for CREATE DATABASE ... TEMPLATE ...
Hello Sehrope Sarkuni, I have reviewed the patch. It is very simple (only an SQL query in the "psql" application changed). It applies on top of master. It implements completion of database names ("") for commands like "CREATE DATABASE ... TEMPLATE ". According to the documentation, from 9.2 through devel a database can be used as a template if it has the "datistemplate" mark, or by superusers, or by its owner. The previous implementation checked only the "datistemplate" mark. Tested manually on versions 9.2 and 10devel; I hope it can be back-patched to all supported versions. No documentation needed. Marking it as "Ready for committer".

P.S.: While reviewing I simplified the SQL query: the improved version does only 2 seqscans instead of 3 seqscans with an inner loop in the original one. Please find the file "tab-complete-create-database-improved.patch" attached.

-- Best regards, Vitaly Burovoy

tab-complete-create-database-improved.patch Description: Binary data
Re: [HACKERS] identity columns
it is against the SQL spec and the implementation:

postgres=# CREATE TABLE itest10 (a int GENERATED BY DEFAULT AS IDENTITY, b text);
CREATE TABLE
postgres=# INSERT INTO itest10 DEFAULT VALUES;
INSERT 0 1
postgres=# INSERT INTO itest10 OVERRIDING USER VALUE VALUES(1,2);
INSERT 0 1
postgres=# INSERT INTO itest10 OVERRIDING USER VALUE DEFAULT VALUES;
ERROR: syntax error at or near "DEFAULT" at character 43

---
12. Dump/restore is broken for some cases:

postgres=# CREATE SEQUENCE itest1_a_seq;
CREATE SEQUENCE
postgres=# CREATE TABLE itest1 (a int generated by default as identity, b text);
CREATE TABLE
postgres=# DROP SEQUENCE itest1_a_seq;
DROP SEQUENCE
postgres=# CREATE DATABASE a;
CREATE DATABASE
postgres=# \q
comp ~ $ pg_dump postgres | psql a
SET
SET
SET
SET
SET
SET
SET
SET
COMMENT
CREATE EXTENSION
COMMENT
SET
SET
SET
CREATE TABLE
ALTER TABLE
ALTER TABLE
COPY 0
ERROR: relation "itest1_a_seq1" does not exist
LINE 1: SELECT pg_catalog.setval('itest1_a_seq1', 2, true);

---
13. doc/src/sgml/ref/create_table.sgml (5th chunk) has "TODO". Why?
---
14. It would be fine if psql had support for the new clauses.
===
Also several notes:
15. Initializing attidentity is ' ' in most places, but makefuncs.c has "n->identity = 0;". Is that correct?
---
16. I think it is a good idea to not raise exceptions for "SET GENERATED/DROP IDENTITY" if a column has the same type of identity/is not an identity, to be consistent with "SET/DROP NOT NULL".

-- Best regards, Vitaly Burovoy
Re: [HACKERS] identity columns
Hello, A first look at the patch:

On 8/30/16, Peter Eisentraut wrote:
> Here is another attempt to implement identity columns. This is a
> standard-conforming variant of PostgreSQL's serial columns.
>
> ...
>
> Some comments on the implementation, and where it differs from previous
> patches:
>
> - The new attidentity column stores whether a column is an identity
> column and when it is generated (always/by default). I kept this
> independent from atthasdef mainly for the benefit of existing (client)
> code querying those catalogs, but I kept confusing myself by this, so
> I'm not sure about that choice. Basically we need to store four
> distinct states (nothing, default, identity always, identity by default)
> somehow. I don't have a strong opinion which way is preferred.

I think if the community is not against it, it can be left as is.

> ...
> - I did not implement the restriction of one identity column per table.
> That didn't seem necessary.

I think it should be mentioned in the docs' "Compatibility" part as a PG extension (similar to "Zero-column Tables").

> ...
>
> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Questions:

1. Does your patch implement the T174 feature? Should a corresponding line be changed in src/backend/catalog/sql_features.txt?

2. Initializing attidentity is ' ' in most places, but makefuncs.c has "n->identity = 0;". Is that correct?

3. doc/src/sgml/ref/create_table.sgml (5th chunk) has "TODO". Why?

4. There is "ADD GENERATED", but the standard says it should be "SET GENERATED" (ISO/IEC 9075-2 Subcl. 11.20)

5. In ATExecAddIdentity: is it a good idea to check whether "attTup->attidentity" is the same as the one given in "(ADD) SET GENERATED" and do nothing (except changing the sequence's options), in addition to the strict check for "unset" (" ")?

6.
In ATExecDropIdentity: is it a good idea to do nothing if the column is already not an identity (the same behavior as DROP NOT NULL/DROP DEFAULT)?

7. Is there any reason to insert CREATE_TABLE_LIKE_IDENTITY before CREATE_TABLE_LIKE_INDEXES rather than at the end? Why do you change catversion.h? It can lead to conflicts when other patches that touch it are committed...

I'll have a closer look soon.

-- Best regards, Vitaly Burovoy
Re: [HACKERS] Why we lost Uber as a user
On 7/28/16, Geoff Winkless wrote:
> On 27 July 2016 at 17:04, Bruce Momjian wrote:
>
>> Well, their big complaint about binary replication is that a bug can
>> spread from a master to all slaves, which doesn't happen with statement
>> level replication.
>
>
> I'm not sure that that makes sense to me. If there's a database bug that
> occurs when you run a statement on the master, it seems there's a decent
> chance that that same bug is going to occur when you run the same statement
> on the slave.
>
> Obviously it depends on the type of bug and how identical the slave is, but
> statement-level replication certainly doesn't preclude such a bug from
> propagating.
>
> Geoff

Please read the article first! The bug is about wrong visibility of tuples after applying WAL on slaves. For example, you can see two different records when selecting from a table by primary key (moreover, their PKs are the same, but the other columns differ). From the article (emphasis is mine):

The following query illustrates how this bug would affect our users table example: SELECT * FROM users WHERE id = 4; This query would return *TWO* records: ...

And it affected slaves, not the master. Slaves exist to decrease the load on the master; if you ran all queries (even RO ones) on the master, why would you (or anyone) have so many slaves?

-- Best regards, Vitaly Burovoy
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
On 6/15/16, Michael Paquier wrote:
> On Thu, Jun 16, 2016 at 12:10 PM, Vitaly Burovoy
> wrote:
>> In the initial letter[1] I posted a digest from the SQL-2011 standard
>> and a conclusion as a design of a new patch.
>> Now I have more free time and I'm hacking it that way. The new patch
>> is at the very early stage, full of WIPs and TODOs. I hope it'll be
>> ready to be shown in a month (may be two).
>
> I have just read both your patch and the one of Alvaro, but yours does
> not touch pg_constraint in any way. Isn't that unexpected?

The consensus was reached to use CHECK constraints instead of a new type of constraint. Alvaro made an attempt[1] to write a PoC in 2012, but by Jan 2016 it failed to apply on top of master (and after resolving conflicts it led to core dumps). I just rebased Alvaro's patch to understand how he wanted to solve the issue and to run tests and queries. In the end I sent the rebased working patch for anyone who wants to read it and try it without core dumps. I have not published my patch for NOT NULLs yet.

Alvaro, the correct patch[2] is in the second message[3] of the thread. What's wrong with it (I got the source from the previous[1] thread)?

[1] https://www.postgresql.org/message-id/flat/1343682669-sup-2...@alvh.no-ip.org
[2] https://www.postgresql.org/message-id/attachment/41886/catalog-notnull-2-c477e84_cleaned.patch
[3] https://www.postgresql.org/message-id/CAKOSWNnXbOY4pEiwN9wePOx8J%2BB44yTj40BQ8RVo-eWkdiGDFQ%40mail.gmail.com

-- Best regards, Vitaly Burovoy
Re: [HACKERS] Prevent ALTER TABLE DROP NOT NULL on child tables if parent column has it
On 6/15/16, Tom Lane wrote:
> Michael Paquier writes:
>> To put it short, it should not be possible to drop a NOT NULL
>> constraint on a child relation when its parent table is using it, this
>> should only be possible from the parent. Attached is a patch handling
>> this problem by adding a new function in pg_inherits.c to fetch the
>> list of parent relations for a given relation OID, and did some
>> refactoring to stick with what is done when scanning child relations.
>
> This doesn't sound like the right approach; in particular, it won't really
> help for deciding whether to propagate a DROP NOT NULL on a parent rel to
> its children. What we've discussed in the past is to store NOT NULL
> constraints in pg_constraint, much like CHECK constraints are already, and
> use use-count logic identical to the CHECK case to keep track of whether
> NOT NULL constraints are inherited or not. My feeling is that we'd keep
> the pg_attribute.attnotnull field and continue to drive actual enforcement
> off that, but it would just reflect a summary of the pg_constraint state.
>
> IIRC, Alvaro posted a WIP patch for that awhile back. Not sure what the
> current state is.
>
> regards, tom lane

The last thread about NOT NULLs as constraints is accessible via the link[1]. I rebased[2] Alvaro's patch onto the then-current master, but I have not repeated it since then. In the initial letter[1] I posted a digest from the SQL-2011 standard and a conclusion as a design for a new patch. Now I have more free time and I'm hacking it that way. The new patch is at a very early stage, full of WIPs and TODOs. I hope it'll be ready to be shown in a month (maybe two). But it already forbids dropping NOT NULLs if they were set as a result of inheritance.
[1] https://www.postgresql.org/message-id/flat/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/attachment/41886/catalog-notnull-2-c477e84_cleaned.patch

-- Best regards, Vitaly Burovoy
Re: [HACKERS] 10.0
On 5/13/16, Robert Haas wrote:
> On Fri, May 13, 2016 at 2:49 PM, Tom Lane wrote:
>> So I think we should solve these problems at a stroke, and save ourselves
>> lots of breath in the future, by getting rid of the whole "major major"
>> idea and going over to a two-part version numbering scheme. To be
>> specific:
>>
>> * This year's major release will be 9.6.0, with minor updates 9.6.1,
>> 9.6.2, etc. It's too late to do otherwise for this release cycle.
>>
>> * Next year's major release will be 10.0, with minor updates 10.1,
>> 10.2, etc.
>>
>> * The year after, 11.0. Et cetera.
>>
>> No confusion, no surprises, no debate ever again about what the next
>> version number is.
>>
>> This is by no means a new idea, but I think its time has come.
>
> Man, I hate version number inflation.

+1

> I'm running Firefox 45.0.2, and I think that's crazy. It hit 1.0 when we were at version 7.4! Granted, this wouldn't be that bad, but

+1. Big version numbers don't indicate big (or even equal amounts of) change. How do you compare 7.0…8.0 with 8.0…9.0 and 9.0…9.6(10.0?)? Is the difference between Firefox 10.0…20.0 the same as between Firefox 20.0…30.0?

> I have always thought that burning through a first digit a few times a decade
> is much more sensible than doing it every year.
> We just have to remember to bump the first digit occasionally.

+1. Better once a decade. =)

> If we don't want to stick with the current practice of debating when
> to bump the same digit, then let's agree that 10.0 will follow 9.6 and
> after that we'll bump the first digit after X.4, as we did with 7.X
> and 8.X.

Why don't we just agree that, since each release has good features and we can't release multiple backward-incompatible features in a single release, there is no sense in bumping the first digit before the second one reaches "9"? Pretend the current version is not 9.5 but 95 (PostgreSQL 95), the current beta not 9.6 but 96, etc., just written divided by 10.
In fact it is similar to Tom's proposal but avoids the proposed bump to 10.0 and keeps increasing by 0.1 instead of 1.0:

current  Tom's proposal
 9.5.X    9.5.X
 9.6.X    9.6.X
 9.7.X   10.X
 9.8.X   11.X
 9.9.X   12.X
10.0.X   13.X
10.1.X   14.X
10.2.X   15.X
...
10.6.X   19.X
10.7.X   20.X
10.8.X   21.X
10.9.X   22.X
11.0.X   23.X
11.1.X   24.X
Etc.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

-- Best regards, Vitaly Burovoy
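The correspondence in the table above reduces to simple arithmetic: under Tom's scheme the first release after 9.6 is 10, and each further current-scheme 0.1 step becomes a full major number. A quick sketch (the function name is mine, for illustration):

```python
# Maps a current-scheme version (after 9.6) to the single major number
# it would carry under Tom's proposed two-part scheme.
def proposed_major(major, minor):
    steps_after_9_6 = (major - 9) * 10 + (minor - 6)
    return 9 + steps_after_9_6  # 9.7 -> 10, 9.8 -> 11, ...
```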
Re: [HACKERS] 10.0
On 5/13/16, Bruce Momjian wrote:
> On Fri, May 13, 2016 at 11:05:23AM -0400, Robert Haas wrote:
>> The major arguments advanced in favor of 10.0 are:
>>
>> - There are a lot of exciting features in this release.

If I'm not mistaken, each release introduced exciting features. In my opinion:

9.5 - kind of UPSERT & RLS;
9.4 - reduced locks for ALTER TABLE, concurrent matview updates and JSONB(!!!);
9.3 - autoupdatable views & LATERAL joins & event triggers;
9.2 - RANGE data types & index-only scans & planner improvements;
9.1 - FDW & extensions & true SERIALIZABLE isolation level & data modification in CTE.

Each version is a candidate to be 10.0 for the same reasons people give now. Since each version has quite big changes and you can't predict how big the changes will be in the next release, why not just increase the second digit up to 9? As for me, version 9.9 is beautiful enough (and version 9.9.9 is even more beautiful). =)

>> - Even if you aren't super-excited by the features in this release,
>> PostgreSQL 9.6/10.0 is a world away from 10.0, and therefore it makes
>
> I think you meant "a world away from 9.0".
>
> Actually, I don't see the distance from 9.0 as a valid argument as 9.5
> was probably also a world away from 9.0.
>
> I prefer calling 9.7 as 10.0 because there will be near-zero-downtime
> major upgrades with pg_logical (?),

I dream of a time of zero downtime, when admins instantiate a new version of PostgreSQL, make it a slave, wait until it finishes synchronization (preparing for multimaster), run both of them as multimaster online (without downtime), then reconnect clients to the new instance, then leave the new instance as master and shut down the old version. If it appears in the release after the next one (which is expected to be 10.1 or 10.0 or 9.8, i.e.
after the next year), will we bump the major number again (11.0), or leave the numbering as is (it would be 10.2/10.1/9.9), or is that the real reason to bump 9.8->10.0 (though in this particular case I'd leave 9.9)?

> and parallelism will cover more cases.
> Built-in logical replication in 9.7 would be big too, and
> another reason to do 9.7 as 10.0.
>
> On the other hand, the _start_ of parallelism in 9.6 could be enough of
> a reason to call it 10.0, with the idea that the 10-series is
> increasingly parallel-aware. You could argue that parallelism is a much
> bigger deal than near-zero-downtime upgrades.

I think each developer/DBA looks for a different benefit in each release. E.g. parallelism is not so important for OLTP developers, and near-zero-downtime is mostly for system administrators/DBAs.

> I think the fundamental issue is whether we want to lead the 10.0 branch
> with parallelism, or wait for an administrative change like
> near-zero-downtime major upgrades and built-in logical replication.
>
> Bruce Momjian http://momjian.us
> EnterpriseDB http://enterprisedb.com

I'd vote for 9.6 up to 9.9 then 10.0…

-- Best regards, Vitaly Burovoy
Re: [HACKERS] Initial release notes created for 9.6
On 5/5/16, Tom Lane wrote:
> Please review and comment before Monday, if you can.
>
> regards, tom lane

1. "YUriy Zhuravlev" should be "Yury Zhuravlev". Previously[1] he had the first version in his signature, but I guess it was a misconfiguration; now[2] he has the second version.

2. I'd add Dean Rasheed as a co-author of "Add pg_size_bytes()", since, being a committer, he made more than cosmetic changes[3] but was humble enough not to mention himself in the commit message.

[1] http://www.postgresql.org/message-id/2723429.ZaCixaFn1x@dinodell
[2] http://www.postgresql.org/message-id/dd701b62-008d-4048-882e-20df0e4b5...@postgrespro.ru
[3] http://www.postgresql.org/message-id/caezatcxhz5ggfrfcf9_yw5h6wdxr68qdfiwhvmgfr3ascnq...@mail.gmail.com

-- Best regards, Vitaly Burovoy
Re: [HACKERS] Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011
On 5/3/16, Tom Lane wrote: > Vitaly Burovoy writes: >> On 4/27/16, Alvaro Herrera wrote: >>> Point 2 is where things differ from what I remember; my (possibly >>> flawed) understanding was that there's no difference between those >>> things. Many (maybe all) of the things from this point on are probably >>> fallout from that one change. > >> It is just mentioning that CHECK constraints have influence on >> nullability characteristic, but it differs from NNC. >> NNC creates CHECK constraint, but not vice versa. You can create >> several CHECK "col IS NOT NULL" constraints, but only one NNC (several >> ones by inheritance only?). And DROP NOT NULL should drop only those >> CHECK that is linked with NNC (and inherited), but no more (full >> explanation is in my initial letter). > > This seems to me to be a most curious reading of the standard. > SQL:2011 11.4 syntax rule 17a says > >If a is specified that contains >the NOT NULL, then it is equivalent to the >following : > > CND CHECK ( C IS NOT NULL ) CA > > As a rule, when the SQL spec says "equivalent", they do not mean "it's > sort of like this", they mean the effects are indistinguishable. In > particular, I see nothing whatsoever saying that you're not allowed to > write more than one per column. 1. SQL:2011 4.13 : — If C is a column of a base table, then an indication of whether it is defined as NOT NULL and, if so, the constraint name of the associated table constraint definition. NOTE 41 — This indication and the associated constraint name exist for definitional purposes only and are not exposed through the COLUMNS view in the Information Schema. There is only "constraint name", not "constraint names". 2. SQL:2011 11.15 General Rule 1: ... If the column descriptor of C does not contain an indication that C is defined as NOT NULL, then: And there is no rule 2. I.e. if the column is already set as NOT NULL you can't specify it as NOT NULL again. 3. 
SQL:2011 11.15 General Rule 1.d: The following is executed without further Access Rule checking: ALTER TABLE TN ADD CONSTRAINT IDCN CHECK ( CN IS NOT NULL ) > So I don't like the proposal to add an attnotnullid column to > pg_attribute. Why and where to place it? > What we'd talked about earlier was converting attnotnull > into, effectively, a hint flag saying that there's at least one NOT NULL > constraint attached to the column. That still seems like a good approach > to me. Ok. But not only NOT NULL constraint, but also non-deferrable PK, CHECK, domains, may be the strictest FK. > When we're actually ready to throw an error for a null value, > we could root through the table's constraint list for a not-null > constraint name to report. attnotnullid is not for reporting, it is for DROP NOT NULL and recreating "CREATE TABLE" statements via pg_dump. > It doesn't matter which one we select, because > constraint application order has never been promised to be deterministic; > and a few extra cycles at that point don't seem like a big problem to me. > > regards, tom lane -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011
I'm sorry for the late answer. On 4/27/16, Alvaro Herrera wrote: > Vitaly Burovoy wrote: > > Hi, > >> But before starting working on it I had a look at the SQL-2011 >> standard (ISO/IEC 9075-2)[3] and found that: >> >> 1. A name for a "NOT NULL" constraint can be given by a table >> definition (subcl. 11.4, "Format"->"column constraint definition"). >> 2. The standard splits NNC and CHECK constraints (subcl. 11.4, >> "Format"-> "column constraint") > > Point 2 is where things differ from what I remember; my (possibly > flawed) understanding was that there's no difference between those > things. Many (maybe all) of the things from this point on are probably > fallout from that one change. It is just mentioning that CHECK constraints have influence on nullability characteristic, but it differs from NNC. NNC creates CHECK constraint, but not vice versa. You can create several CHECK "col IS NOT NULL" constraints, but only one NNC (several ones by inheritance only?). And DROP NOT NULL should drop only those CHECK that is linked with NNC (and inherited), but no more (full explanation is in my initial letter). >> III. "pg_attribute" table should have an "attnotnullid oid" as an >> indicator of "NOT NULL" (p.4) and points to a CHECK constraint; It is >> in addition to a "Nullability characteristic" "attnotnull" (p.3). >> IV. "pg_constraint" should have a column "connotnullkey int2[]" as a >> "list of the nullable columns" which references to >> "pg_attribute.attnum" for fast checking whether a column is still >> nullable after deleting/updating constraints or not. Array is >> necessary for cases like "CHECK ((col1 IS NOT NULL) AND (col2 IS NOT >> NULL))" and for nondeferrable PKs. > > I think these points warrant some more consideration. I don't like the > idea that pg_attribute and pg_constraint are both getting considerably > bloated to support this. Ok, I'm ready for a discussion. 
Two additional columns are necessary: one for pointing to an underlying CHECK constraint (or boolean column whether current CHECK is NNC or not) and second for fast computation of "attnotnull" (which means "nullable characteristic") and ability to skip check if "attnotnull" is set but not triggered (I think it'll improve performance for inherited tables). I think placing the first column (attnotnullid) to pg_attribute is better because you can't have more than one value in it. The second is obviously should be placed in pg_constraint. >> P.S.: >> Since the SQL standard defines that "col NOT NULL" as an equivalent to >> "CHECK (col IS NOT NULL)" (p.8) what to do with that behavior: >> >> postgres=# create type t as (x int); >> CREATE TYPE >> postgres=# SELECT v, v IS NOT NULL AS should_be_in_table FROM >> (VALUES('(1)'::t),('()'),(NULL)) AS x(v); >> v | should_be_in_table >> -+ >> (1) | t >> () | f >> | f >> (3 rows) >> >> "attnotnull" in such case is stricter, like "CHECK (col IS DISTINCT FROM >> NULL)". >> >> Should such values (with NULL in each attribute of a composite type) >> violate NOT NULL constraints? > > I wonder if the standard has a concept of null composite values. If > not, then there is no difference between IS NOT NULL and IS DISTINCT > FROM NULL, which explains why they define NNC in terms of the former. Yes, it has. The PG's composite type is "Row types" (subcl.4.8) in the standard. The standard also differentiates IS [NOT] NULL and IS [NOT] DISTINCT FROM: >>> Subcl. 8.8 : >>> ... >>> 1) Let R be the and let V be the value of R. >>> 2) Case: >>> a) If V is the null value, then “R IS NULL” is True and >>> the value of “R IS NOT NULL” is False. >>> b) Otherwise: >>> i) The value of “R IS NULL” is >>>Case: >>>1) If the value of every field of V is the null value, then True. >>>2) Otherwise, False. >>> ... >>> >>> Subcl. 8.15 >>> ... >>> 1) Let V1 be the value of and let V2 be the value >>> of . >>> ... 
>>> b) If V1 is the null value and V2 is not the null value, or if V1 is not >>> the null value and V2 is the null >>> value, then the result is True. >>> ... In subcl.8.8 "each column" is mentioned, in 8.15 if one of value is the null value and the other is not then nothing more is checked and True is returned. > I think your email was too hard to read because of excessive density, > which would explain the complete lack of response. Hmm. I decided it was "silently approved". =) > I haven't had the chance to work on this topic again, but I encourage you to, > if you have the resources. Thank you, I think I'll find a time for it no earlier than the summer. > (TBH I haven't had the chance to study your proposed design in detail, > either). I hope somebody find a time to study it before someone sends a proposal. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
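The subcl. 8.8 vs 8.15 distinction quoted above can be modeled in a few lines of C. This is a hypothetical stand-alone model (the `RowValue` struct and predicate names are mine, not PostgreSQL code): per SQL:2011 8.8, "R IS NULL" is True when R is the null value or every field is null, while "R IS NOT NULL" is True only when R is not null and no field is null — so both predicates are False for a partially-null row.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a row value: field_is_null[i] means field i is null. */
typedef struct RowValue
{
    bool    row_is_null;        /* the row itself is the null value */
    size_t  nfields;
    bool    field_is_null[8];
} RowValue;

/* SQL:2011 8.8: "R IS NULL" is True iff R is null or every field is null. */
static bool
row_is_null_pred(const RowValue *r)
{
    if (r->row_is_null)
        return true;
    for (size_t i = 0; i < r->nfields; i++)
        if (!r->field_is_null[i])
            return false;
    return true;
}

/* SQL:2011 8.8: "R IS NOT NULL" is True iff R is not null and no field is null. */
static bool
row_is_not_null_pred(const RowValue *r)
{
    if (r->row_is_null)
        return false;
    for (size_t i = 0; i < r->nfields; i++)
        if (r->field_is_null[i])
            return false;
    return true;
}
```

Under this model, the PG example from the earlier message falls out directly: '(1)'::t is NOT NULL, '()'::t (every field null) is neither NOT NULL nor distinguishable from a fully-null row by IS NULL alone.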
[HACKERS] Re: [PATH] Jsonb, insert a new value into an array at arbitrary position
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: tested, passed

I have reviewed this patch. It applies cleanly on top of current master (3501f71), compiles silently, and implements the declared feature. The documentation describes the behavior of the function, pointing out the difference between inserting into a JsonbObject and a JsonbArray. The code is clean and commented. The linked comment is changed too. Regression tests cover the possible use cases and edge cases.

Notes for a committer:
1. pg_proc.h has changed, so CATALOG_VERSION_NO must also be changed.
2. Code comments and changes in the documentation need proof-reading by a native English speaker, which the author and I are not.

The new status of this patch is: Ready for Committer
Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position
On 3/30/16, Dmitry Dolgov <9erthali...@gmail.com> wrote: > On 31 March 2016 at 05:04, Vitaly Burovoy wrote: >> The documentation changes still has to be fixed. > Thanks for help. Looks like I'm not so good at text formulation. Fixed. Never mind. I'm also not so good at it. It is still should be rewritten by a native English speaker, but it is better to give him good source for it. === I'm almost ready to mark it as "Ready for committer". Few notes again. 1. > a current item was pushed to parse state after JB_PATH_INSERT_BEFORE I paid attention that the function's name 'pushJsonbValue' looks as working with stack (push/pop), but there is no function "pop*" neither in jsonfuncs.c nor jsonb_util.c. That's why I wrote "it seems the logic in the code is correct" - it is logical to insert new value if "before", then current value, then new value if "after". But yes, following by executing path to the "pushState" function anyone understands "JsonbParseState" is constructed as a stack, not a queue. So additional comment is needed why "JB_PATH_INSERT_AFTER" check is placed before inserting curent value and JB_PATH_INSERT_BEFORE check. The comment can be located just before "if (op_type & JB_PATH_INSERT_AFTER)" (line 3986) and look similar to: /* * JsonbParseState struct behaves as a stack -- see the "pushState" function, * so check for JB_PATH_INSERT_BEFORE/JB_PATH_INSERT_AFTER in a reverse order. */ 2. A comment for the function "setPath" is obsolete: "If newval is null" check and "If create is true" name do not longer exist. Since I answered so late last time I propose the next formulating to avoid one (possible) round: /* * Do most of the heavy work for jsonb_set/jsonb_insert * * If JB_PATH_DELETE bit is set in op_type, the element is to be removed. * * If any bit mentioned in JB_PATH_CREATE_OR_INSERT is set in op_type, * we create the new value if the key or array index does not exist. 
* * Bits JB_PATH_INSERT_BEFORE and JB_PATH_INSERT_AFTER in op_type * behave as JB_PATH_CREATE if new value is inserted in JsonbObject. * * All path elements before the last must already exist * whatever bits in op_type are set, or nothing is done. */ === I hope that notes are final (additional check in formulating is appreciated). P.S.: if it is not hard for you, please rebase the patch to the current master to avoid offset in func.sgml. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Supporting +-Infinity values by to_timestamp(float8)
On 3/29/16, Tom Lane wrote:
> Pushed with minor adjustments.
>
> regards, tom lane

Thank you very much!

-- 
Best regards,
Vitaly Burovoy
Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position
On 3/25/16, Dmitry Dolgov <9erthali...@gmail.com> wrote: > Here is a new version of path, I hope I didn't miss anything. Few notes: > >> 4. >> or even create a new constant (there can be better name for it): >> #define JB_PATH_CREATE_OR_INSERT (JB_PATH_INSERT_BEFORE | >> JB_PATH_INSERT_AFTER | JB_PATH_CREATE) > > Good idea, thanks. You're welcome. === I'm sorry for the late letter. Unfortunately, it seems one more round is necessary. The documentation changes still has to be fixed. The main description points to a "target section designated by path is a JSONB array". It is a copy-paste from jsonb_set, but here it has wrong context. The main idea in jsonb_set is replacing any object which is pointed (designated) by "path" no matter where (array or object) it is located. But in the phrase for jsonb_insert above you want to point to an upper object _where_ "section designated by path" is located. I'd write something like "If target section designated by path is _IN_ a JSONB array, ...". The same thing with "a JSONB object". Also I'd rewrite "will act like" as "behaves as". Last time I missed the argument's name "insert_before_after". It is better to replace it as just "insert_after". Also reflect that name in the documentation properly. To be consistent change the constant "JB_PATH_NOOP" as "0x" instead of just "0x0". Also the variable's name "bool before" is incorrect because it contradicts with the SQL function's argument "after" (from the documentation) or "insert_after" (proposed above) or tests (de-facto behavior). Moreover it seems the logic in the code is correct, so I have no idea why "before ? JB_PATH_INSERT_BEFORE : JB_PATH_INSERT_AFTER" works. I think either proper comment should be added or lack in the code must be found. Anyway the variable's name must reflect the SQL argument's name. 
-- 
Best regards,
Vitaly Burovoy
Re: [HACKERS] Bug in searching path in jsonb_set when walking through JSONB array
On 2016-03-23, Oleg Bartunov wrote: > On Wed, Mar 23, 2016 at 6:37 AM, Vitaly Burovoy > wrote: > >> Hello, Hackers! >> >> While I was reviewed a patch with "json_insert" function I found a bug >> which wasn't connected with the patch and reproduced at master. >> >> It claims about non-integer whereas input values are obvious integers >> and in an allowed range. >> More testing lead to understanding it appears when numbers length are >> multiplier of 4: >> >> postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', >> '"4"'); >> ERROR: path element at the position 2 is not an integer >> > > Hmm, I see in master > > select version(); > version > - > PostgreSQL 9.6devel on x86_64-apple-darwin15.4.0, compiled by Apple LLVM > version 7.3.0 (clang-703.0.29), 64-bit > (1 row) > > select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"'); > jsonb_set > > {"a": [[], 1, 2, 3, "4"], "b": []} > (1 row) Yes, I can't reproduce it with "CFLAGS=-O2", but it is still reproduced with "CFLAGS='-O0 -g3'". postgres=# select version(); version -- PostgreSQL 9.6devel on x86_64-pc-linux-gnu, compiled by gcc (Gentoo 4.8.4 p1.4, pie-0.6.1) 4.8.4, 64-bit (1 row) postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"'); ERROR: path element at the position 2 is not an integer It depends on memory after the string. In debug mode it always (most of the time?) has a garbage (in my case the char '~' following by '\x7f' multiple times) there. I think it is just a question of complexity of reproducing in release, not a question whether there is a bug or not. All the other occurrences of strtol in the file have TextDatumGetCString before, except the occurrence in the setPathArray function. It seems its type is TEXT (which is not null-terminated), not cstring. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Bug in searching path in jsonb_set when walking through JSONB array
Hello, Hackers!

While I was reviewing a patch with the "json_insert" function, I found a bug that is not connected with the patch and reproduces on master.

It complains about a non-integer path element whereas the input values are obviously integers and within the allowed range. More testing led to the understanding that it appears when the number's length is a multiple of 4:

postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", }', '"4"');
ERROR: path element at the position 2 is not an integer
postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"b", 1000}', '"4"');
ERROR: path element at the position 2 is not an integer
postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", -999}', '"4"');
ERROR: path element at the position 2 is not an integer
postgres=# select jsonb_set('{"a":[[],1,2,3],"b":[]}', '{"a", 1000}', '"4"');
ERROR: path element at the position 2 is not an integer

Close values are OK:

postgres=# select jsonb_set('{"a":[[],1,2,3]}', '{"a", 0, 999}', '"4"');
jsonb_set
-
{"a": [["4"], 1, 2, 3]}
(1 row)

postgres=# select jsonb_set('{"a":[[],1,2,3]}', '{"a", 0, 1}', '"4"');
jsonb_set
-
{"a": [["4"], 1, 2, 3]}
(1 row)

Research led to setPathArray, where a string obtained via VARDATA_ANY is passed to strtol, which expects a cstring. When the number's length is not a multiple of 4, the remaining bytes are padded with '\0'; when the length is a multiple of 4 there is no padding, just garbage after the last digit of the value.

The proposed patch (attached) fixes it. There is a magic number "20" as the length of an array for copying the key from the path before passing it to strtol; it is the maximal length of a value that the function can parse. I could not find a proper constant for it, and I found similar hard-coded values elsewhere in the code (e.g. in numeric.c). I've added a comment; I hope that is enough.
-- 
Best regards,
Vitaly Burovoy

fix_jsonb_set_path.0001.patch
Description: Binary data
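The pattern behind the fix described above can be illustrated outside PostgreSQL. The sketch below uses a hypothetical helper name (`parse_index`), not the actual patch: since strtol has no length argument and requires a NUL-terminated string, the length-counted bytes of a text datum must be copied into a local, terminated buffer first — otherwise strtol reads whatever garbage follows the digits.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for parsing a length-counted (non NUL-terminated)
 * text datum, like the path element setPathArray receives via VARDATA_ANY.
 * On success sets *ok = 1 and returns the parsed value.
 */
static long
parse_index(const char *data, size_t len, int *ok)
{
    char    buf[20];            /* room for any parsable value plus NUL */
    char   *endptr;
    long    val;

    if (len >= sizeof(buf))
    {
        *ok = 0;
        return 0;
    }
    memcpy(buf, data, len);
    buf[len] = '\0';            /* the crucial step the buggy code skipped */

    val = strtol(buf, &endptr, 10);
    *ok = (len > 0 && endptr == buf + len);     /* whole element must parse */
    return val;
}
```

With the terminator in place, trailing garbage bytes (like the '~' and '\x7f' observed in debug builds) can no longer leak into the parse.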
Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position
On 2016-03-16, Dmitry Dolgov <9erthali...@gmail.com> wrote: > Hi Vitaly, thanks for the review. I've attached a new version of path with > improvements. Few notes: > >> 7. Why did you remove "skip"? It is a comment what "true" means... > > Actually, I thought that this comment was about skipping an element from > jsonb in order to change/delete it, As I got it, it is "skipNested", skipping iterating over nested containers, not skipping an element. > not about the last argument. E.g. you can find several occurrences of > `JsonbIteratorNext` with `true` as the last > argument but without a "last argument is about skip" comment. I think it is not a reason for deleting this comment. ;-) > And there is a piece of code in the function `jsonb_delete` with a "skip > element" commentary: > > ``` > /* skip corresponding value as well */ > if (r == WJB_KEY) > JsonbIteratorNext(&it, &v, true); > ``` The comment in your example is for the extra "JsonbIteratorNext" call (the first one is in a "while" statement outside your example, but over it in the code), not for the "skip" parameter in the "JsonbIteratorNext" call here. === Notes for the jsonb_insert_v3.patch applied on the f6bd0da: 1a. Please, rebase your patch to the current master (it is easy to resolve conflict, but still necessary). 1b. Now OID=3318 is "pg_stat_get_progress_info" (Hint: 3324 is free now). 2. The documentation, func.sgml: > Iftarget Here must be a space after the "If" word. 3. There is a little odd formulating me in the documentation part (without formatting for better readability): > If target section designated by path is a JSONB array, new_value will be > inserted before it, or after if after is true (defailt is false). a. "new_value" is not inserted before "it" (a JSONB array), it is inserted before target; b. s/or after if/or after target if/; c. s/defailt/default/ > If ... is a JSONB object, new_value will be added just like a regular key. d. 
"new_value" is not a regular key: key is the final part of the "target"; e. "new_value" replaces target if it exists; f. "new_value" is added if target is not exists as if jsonb_set is called with create_missing=true. g. "will" is about possibility. "be" is about an action. 4. function setPathObject, block with "op_type = JB_PATH_CREATE" (jsonfuncs.c, lines 3833..3840). It seems it is not necessary now since you can easily check for all three options like: op_type & (JB_PATH_INSERT_BEFORE | JB_PATH_INSERT_AFTER | JB_PATH_CREATE) or even create a new constant (there can be better name for it): #define JB_PATH_CREATE_OR_INSERT (JB_PATH_INSERT_BEFORE | JB_PATH_INSERT_AFTER | JB_PATH_CREATE) and use it like: op_type & JB_PATH_CREATE_OR_INSERT all occurrences of JB_PATH_CREATE in the function are already with the "(level == path_len - 1)" condition, there is no additional check needed. also the new constant JB_PATH_CREATE_OR_INSERT can be used at lines 3969, 4025. 5. > if (op_type != JB_PATH_DELETE) It seems strange direct comparison among bitwise operators (lines 3987..3994) You can leave it as is, but I'd change it (and similar one at the line 3868) to a bitwise operator: if (!op_type & JB_PATH_DELETE) 6. I'd like to see a test with big negative index as a reverse for big positive index: select jsonb_insert('{"a": [0,1,2]}', '{a, -10}', '"new_value"'); I know now it works as expected, it is just as a defense against further changes that can be destructive for this special case. 7. Please, return the "skip" comment. 8. The documentation: add "jsonb_insert" to the note about importance of existing intermediate keys. Try to reword it since the function doesn't have a "create_missing" parameter support. > All the items of the path parameter of jsonb_set must be present in the > target, > ... in which case all but the last item must be present. Currently I can't break the code, so I think it is close to the final state. 
;-)

-- 
Best regards,
Vitaly Burovoy
Re: [HACKERS] [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
On 2016-03-16, Tom Lane wrote: > My feeling is we ought to preserve the old behavior here, which would > involve making JULIAN_MAXYEAR_FOR_TIMESTAMPS format-dependent and > adjusting the float values for the two derived constants; not much of a > problem code-wise. I think though that it would break quite a number of > the proposed new regression tests for the float case. I think I can be solved by adding new tests for new upper bound for float values and creating a new version of expected file like date_1.out, horology_1.out, timestamp_1.out and timestamptz.out (the original files are for integer values; with prefix "_1" are for float ones). > TBH, I thought > the number of added test cases was rather excessive anyway, so I wouldn't > have a problem with just leaving out whichever ones don't pass with both > build options. The tests checks edge cases. In case of saving bigger interval of allowed values for float values you'll remove checks when they should fail. What's the left cases for? On 2016-03-16, Tom Lane wrote: > Actually, it seems a lot of the provided test cases fail for float > timestamps anyway; there's an assumption that > 294276-12-31 23:59:59.99 > 294277-01-01 00:00:00.00 > are distinct timestamps, which they are not in float mode. I'm ready to change fractional part '.99' to '.5' (to avoid issues of different implementation of floating point on different architectures) to emphasize "almost reached the threshold". On 2016-03-16, Tom Lane wrote: > AFAICS the only way that we can avoid a dump/reload hazard is to tighten > up the allowed range of timestamps by at least one day, so that any > timestamp that passes IS_VALID_TIMESTAMP() is guaranteed to print, in > any timezone, with a contained date that the Julian-day routines can > handle. I'd be inclined to set the lower limit of timestamps as > '4713-01-01 00:00 GMT BC' just to keep things simple. (The upper limit > can stay where it is.) 
> > While dates don't have this timezone rotation problem, the documentation > says that they have the same lower-bound limit as timestamps, and there > are places in the code that assume that too. Is it okay to move their > lower bound as well? I think it is important (see initial letter) since it is out of supported interval (according to the documentation) and don't work as expected in some cases (see "to_char(date_trunc('week', '4714-12-28 BC'::date),'day')"). It seems it leads to change IS_VALID_JULIAN[_FOR_STAMPS] as well to something like: #define IS_VALID_JULIAN(y,m,d) \ ((JULIAN_MINYEAR < (y) && (y) < JULIAN_MAXYEAR) ||(JULIAN_MINYEAR == (y) && (m) == 12 && (d) == 31) ||(JULIAN_MAXYEAR == (y) && (m) == 01 && (d) == 01)) It can be correct if checks for IS_VALID_DATE is added in date.c to places marked as "IS_VALID_JULIAN is enough for checking..." All other places are (I'm sure) covered by IS_VALID_DATE/IS_VALID_TIMESTAMP. What about the question: On 2016-02-24, Vitaly Burovoy wrote: > Also I'm asking for a help because the query (in default TZ='GMT+1'): > postgres=# SELECT '4714-11-24 00:00:00.00+00 BC'::timestamptz; > > in psql gives a result "4714-11-23 23:00:00-01 BC", > but in a testing system gives "Sun Nov 23 23:00:00 4714 GMT BC" > without TZ offset. Why there is "GMT" instead of "GMT+01"? Is it bug? If so should it be fixed in this patch or can be fixed later? -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
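The IS_VALID_JULIAN rewrite proposed above came through the archive flattened. A cleaned-up, compilable rendering of the same idea follows; the year limits are illustrative stand-ins for PostgreSQL's actual JULIAN_MINYEAR/JULIAN_MAXYEAR, and the macro reflects the mail's proposal (interior years always valid, boundary years valid only on their single allowed day), not the committed code.

```c
/* Illustrative stand-ins; the real values live in PostgreSQL's headers. */
#define JULIAN_MINYEAR (-4713)
#define JULIAN_MAXYEAR 5874898

/*
 * A year strictly inside the range is always valid; the boundary years
 * are valid only on Dec 31 of the minimum year and Jan 1 of the maximum
 * year, as proposed in the mail above.
 */
#define IS_VALID_JULIAN(y, m, d) \
    ((JULIAN_MINYEAR < (y) && (y) < JULIAN_MAXYEAR) || \
     (JULIAN_MINYEAR == (y) && (m) == 12 && (d) == 31) || \
     (JULIAN_MAXYEAR == (y) && (m) == 1 && (d) == 1))
```

The point of the edge-day clauses is exactly the case discussed in the thread: '4714-12-28 BC' is formally inside the old bounds but sits past the Julian-day routines' safe range, so the proposal tightens the lower bound while still admitting the one-day overhang at each end.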
Re: [HACKERS] [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
On 2016-03-15, Mark Dilger wrote: > >> On Mar 14, 2016, at 5:12 PM, Vitaly Burovoy >> wrote: >> >> On 3/14/16, Mark Dilger wrote: >>> The first thing I notice about this patch is that >>> src/include/datatype/timestamp.h >>> has some #defines that are brittle. The #defines have comments >>> explaining >>> their logic, but I'd rather embed that in the #define directly: >>> >>> This: >>> >>> +#ifdef HAVE_INT64_TIMESTAMP >>> +#define MIN_TIMESTAMP INT64CONST(-2118134880) >>> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC */ >>> +#define MAX_TIMESTAMP INT64CONST(92233713312) >>> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 * >>> USECS_PER_SEC >>> */ >>> +#else >>> +#define MIN_TIMESTAMP -211813488000.0 >>> +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 */ >>> +#define MAX_TIMESTAMP 9223371331200.0 >>> +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 */ >>> +#endif >>> >>> Could be written as: >>> >>> #ifdef HAVE_INT64_TIMESTAMP >>> #define MIN_TIMESTAMP >>> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY) >>> #define MAX_TIMESTAMP >>> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY) >>> #else >>> #define MIN_TIMESTAMP >>> ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY) >>> #define MAX_TIMESTAMP >>> ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY) >>> #endif >>> >>> I assume modern compilers would convert these to the same constants at >>> compile-time, >> >> Firstly, Postgres is compiling not only by modern compilers. > > Do you have an example of a compiler that will not do this constant folding > at compile time? No, I'm not good at knowing features of all versions and all kings of compilers, but I'm sure constants are better than expressions for big values. =) >>> rather than impose a runtime penalty. >> >> Secondary, It is hard to write it correctly obeying Postgres' coding >> standard (e.g. 
80-columns border) and making it clear: it is not so >> visual difference between USECS_PER_DAY and SECS_PER_DAY in the >> expressions above (but it is vivid in comments in the file). > > Hmm. I think using USECS_PER_DAY is perfectly clear, but that is a > personal > opinion. I don't see any way to argue if you don't see it that way. I'm talking about perception of the constants when they a very similar but they are not justified by a single column (where difference _in_expressions_ are clear). There was a difference by a single char ("U") only which is not so obvious without deep looking on it (be honest I'd missed it until started to write an answer). >> or "Why is INT64CONST set for >> MIN_TIMESTAMP, but not for MAX_TIMESTAMP?" (JULIAN_MAX4STAMPS is _not_ >> int64). > > I was only casting the zero to int64. That doesn't seem necessary, so it > can > be removed. Both MIN_TIMESTAMP and MAX_TIMESTAMP were defined > in terms of USECS_PER_DAY, which itself is defined in terms of INT64CONST, > so I believe they both work out to be an int64 constant. I hope so. But in such cases I'd prefer to _begin_ calculations from int64, not to _finish_ by it. It is impossible to pass constants (like JULIAN_MAX4STAMPS) to INT64CONST macros. Inserting "(int64)" makes rows larger by 7 chars... ;-) >>> The #defines would be less brittle in >>> the event, for example, that the postgres epoch were ever changed. >> >> I don't think it is real, and even in such case all constants are >> collected together in the file and will be found and changed at once. > > I agree that they would be found at once. I disagree that the example > is not real, as I have changed the postgres epoch myself in some builds, > to be able to use int32 timestamps on small devices. Wow! I'm sorry, I didn't know about it. But in such case (tighten to int32) there are more than two places which should be changed. Two more (four with disabled HAVE_INT64_TIMESTAMP) constants are not big deal with it. 
-- 
Best regards,
Vitaly Burovoy
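The equivalence being debated above is easy to check mechanically. In the sketch below the literal constants are reconstructed from the comments in the quoted patch ((0 - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC, and likewise for the maximum) because the archive truncated the original literals; the value of JULIAN_MAX4STAMPS is an assumption derived back from those comments, not copied from the source tree.

```c
#include <stdint.h>

#define POSTGRES_EPOCH_JDATE INT64_C(2451545)   /* Julian day of 2000-01-01 */
#define JULIAN_MAX4STAMPS    INT64_C(109203528) /* assumed: Julian day of 294277-01-01 */
#define USECS_PER_SEC        INT64_C(1000000)
#define USECS_PER_DAY        (INT64_C(86400) * USECS_PER_SEC)

/* Literal form, as in the patch (full values reconstructed from its comments). */
#define MIN_TIMESTAMP_LIT   INT64_C(-211813488000000000)
#define MAX_TIMESTAMP_LIT   INT64_C(9223371331200000000)

/* Expression form, as proposed in the review; folded at compile time. */
#define MIN_TIMESTAMP_EXPR  ((INT64_C(0) - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
#define MAX_TIMESTAMP_EXPR  ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY)
```

Since every operand is an integer constant expression, any conforming C compiler folds the expression form to the same int64 constants at compile time; the disagreement in the thread is purely about readability, not about generated code.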
Re: [HACKERS] [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
On 3/16/16, Tom Lane wrote:
> So I fixed that up and committed it, with a very short set of new
> regression test cases. I seriously doubt that the other ones add
> enough value to be worth trying to make them work in both float- and
> int-timestamp cases; though if you want to submit a new patch to
> add more test cases we could consider it. My feeling about it is that
> that kind of testing does nothing for errors of omission (ie functions
> that fail to range-check their results), which is the most likely sort
> of bug here, and so I doubt it's worth adding them to regression tests
> that many people will run many times a day for a long time to come.
>
> regards, tom lane

Thank you very much! If you decide such tests are not necessary, that is enough for me.

Is there any reason to keep JULIAN_MINDAY and JULIAN_MAXDAY, which are not used now? Also, why is JULIAN_MAXMONTH set to "6" whereas {DATE|TIMESTAMP}_END_JULIAN uses "1" as the month?

-- 
Best regards,
Vitaly Burovoy
Re: [HACKERS] [PATCH] Supporting +-Infinity values by to_timestamp(float8)
On 2016-03-15, David Steele wrote: > On 3/4/16 2:56 PM, Vitaly Burovoy wrote: >> On 3/4/16, Anastasia Lubennikova wrote: >> >>> I think that you should update documentation. At least description of >>> epoch on this page: >>> http://www.postgresql.org/docs/devel/static/functions-datetime.html >> >> Thank you very much for pointing where it is located (I saw only >> "to_timestamp(TEXT, TEXT)"). >> I'll think how to update it. > > Vitaly, have you decided how to update this yet? Yes, there are three versions: * remove mentioning how to get timestamptz from UNIX stamp; * leave a note how to get timestamptz and add a note that such encapsulation existed prior to 9.6; * replace to the proposed current behavior (without interval). I decided to implement the third case (but a result there has a time zone which can be different, so the result can be not exactly same for a concrete user). If a committer decides to do somehow else, he is free to choose one from the list above or to do something else. >>> 3. (nitpicking) I don't sure about "4STAMPS" suffix. "4" is nice >>> abbreviation, but it seems slightly confusing to me. >> >> It doesn't matter for me what it is called, it is short enough and >> reflects a type on which it is applied. >> What would the best name be for it? > > Anastasia, any suggestions for a better name, or just leave it as is? > > I'm not in favor of the "4", either. I think I would prefer > JULIAN_MAXYEAR_STAMP. It turns out that Tom has changed almost one third of timestamp.h and now the constant has a name "TIMESTAMP_END_JULIAN". I've rebased the patch to the current master (5db5146) and changed it according to the new timestamp.h. Now it passes both versions (integer and float timestamps). I deleted test for the upper boundary for integer timestamps, changed a little comments. I decided to delete hints about minimum and maximum allowed values because no one type has such hint. Please find attached a new version of the patch. 
-- 
Best regards,
Vitaly Burovoy

to_timestamp_infs.v002.patch
Description: Binary data
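The behavior the attached patch adds — letting ±Infinity pass through to_timestamp(float8) instead of failing — can be sketched in stand-alone C. This is a simplified model, not the actual float8_timestamptz code: the infinity sentinels are assumptions mirroring PostgreSQL's "no begin"/"no end" representation, and the range checks against the timestamp bounds are omitted for brevity.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical sentinels, mirroring how an infinite timestamp is represented. */
#define TIMESTAMP_MINUS_INFINITY INT64_MIN
#define TIMESTAMP_INFINITY       INT64_MAX

#define POSTGRES_EPOCH_JDATE 2451545    /* Julian day of 2000-01-01 */
#define UNIX_EPOCH_JDATE     2440588    /* Julian day of 1970-01-01 */
#define SECS_PER_DAY         86400
#define USECS_PER_SEC        1000000

/*
 * Sketch of to_timestamp(float8): seconds since the Unix epoch in, timestamp
 * microseconds (PostgreSQL epoch) out.  Returns 0 on error (NaN input).
 */
static int
to_timestamp_sketch(double seconds, int64_t *out)
{
    if (isnan(seconds))
        return 0;                       /* NaN is rejected */
    if (isinf(seconds))
    {
        /* the new behavior: map +/-Infinity to the infinite timestamps */
        *out = (seconds > 0) ? TIMESTAMP_INFINITY : TIMESTAMP_MINUS_INFINITY;
        return 1;
    }
    /* shift from the Unix epoch to the PostgreSQL epoch (2000-01-01) */
    *out = (int64_t) (seconds * (double) USECS_PER_SEC)
         - (int64_t) (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE)
           * SECS_PER_DAY * USECS_PER_SEC;
    return 1;
}
```

A real implementation would additionally range-check finite results so that an out-of-range input raises an error rather than silently producing a bogus timestamp.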
[HACKERS] Re: [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
On 3/14/16, Anastasia Lubennikova wrote: > 14.03.2016 16:23, David Steele: >> On 2/25/16 4:44 PM, Vitaly Burovoy wrote: >> >>> Added to the commitfest 2016-03. >>> >>> [CF] https://commitfest.postgresql.org/9/540/ >> >> This looks like a fairly straight-forward bug fix (the size of the >> patch is deceptive because there a lot of new tests included). It >> applies cleanly. >> >> Anastasia, I see you have signed up to review. Do you have an idea >> when you will get the chance to do that? >> >> Thanks, > > I've read the patch thoroughly and haven't found any problems. I think > that the patch is in a very good shape. > It fixes a bug and has an excellent set of tests. > > There is an issue, mentioned in the thread above: > >>postgres=# select >>postgres-# to_char(date_trunc('week', '4713-01-01 BC'::date),'day') >>postgres-# ,to_char(date_trunc('week', '4714-12-29 BC'::date),'day') >>postgres-# ,to_char(date_trunc('week', '4714-12-28 BC'::date),'day'); >> to_char | to_char | to_char >>---+---+--- >> monday| monday| thursday >>(1 row) > >>since 4714-12-28 BC and to the past detection when a week is starting >>is broken (because it is boundary of isoyears -4713 and -4712). >>Is it worth to break undocumented range or leave it as is? > > But I suppose that behavior of undocumented dates is not essential. I'm sorry... What should I do with "Waiting on Author" state if you don't have complaints? > -- > Anastasia Lubennikova > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Integer overflow in timestamp[tz]_part() and date/time boundaries check
On 3/14/16, Mark Dilger wrote: > The first thing I notice about this patch is that > src/include/datatype/timestamp.h > has some #defines that are brittle. The #defines have comments explaining > their logic, but I'd rather embed that in the #define directly: > > This: > > +#ifdef HAVE_INT64_TIMESTAMP > +#define MIN_TIMESTAMP INT64CONST(-211813488000000000) > +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC */ > +#define MAX_TIMESTAMP INT64CONST(9223371331200000000) > +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 * USECS_PER_SEC > */ > +#else > +#define MIN_TIMESTAMP -211813488000.0 > +/* == (0 - POSTGRES_EPOCH_JDATE) * 86400 */ > +#define MAX_TIMESTAMP 9223371331200.0 > +/* == (JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * 86400 */ > +#endif > > Could be written as: > > #ifdef HAVE_INT64_TIMESTAMP > #define MIN_TIMESTAMP > ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY) > #define MAX_TIMESTAMP > ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * USECS_PER_DAY) > #else > #define MIN_TIMESTAMP > ((INT64CONST(0) - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY) > #define MAX_TIMESTAMP > ((JULIAN_MAX4STAMPS - POSTGRES_EPOCH_JDATE) * SECS_PER_DAY) > #endif > > I assume modern compilers would convert these to the same constants at > compile-time, Firstly, Postgres is not compiled only by modern compilers. > rather than impose a runtime penalty. Secondly, it is hard to write it correctly while obeying Postgres' coding standard (e.g. the 80-column border) and keeping it clear: the difference between USECS_PER_DAY and SECS_PER_DAY is not so visible in the expressions above (but it is vivid in the comments in the file). Also, questions arise such as "Why is INT64CONST(0) used in the `#else' block?" (whereas `float' is needed there) or "Why is INT64CONST used for MIN_TIMESTAMP, but not for MAX_TIMESTAMP?" (JULIAN_MAX4STAMPS is _not_ int64). The file is full of constants. For example, JULIAN_MAX and SECS_PER_DAY are just two of them. I would leave it as is. 
> The #defines would be less brittle in > the event, for example, that the postgres epoch were ever changed. I don't think that is realistic, and even in such a case all the constants are collected together in the file and would be found and changed at once. > Mark Dilger -- Best regards, Vitaly Burovoy
Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp
On 3/11/16, Tom Lane wrote: > [ catches up with thread... ] > > Yes. It would be more reasonable IMO for to_date to throw an error > because this is bad input. On the other hand, to_date mostly doesn't > throw an error no matter how bad the input is. I think that may have > been intentional, although its habit of producing garbage output > instead (and not, say, NULL) is certainly not very happy-making. > > It's a bit schizophrenic for this patch to be both adding ereport's > for year zero (thereby breaking the no-failure-on-bad-input policy) > *and* trying to produce sane output for arguably-insane input. > > I don't really see an argument why '0001-00-00' should be accepted > but '0000-01-01' should throw an error, but that would be the result > if we take this patch. Well. In the case of a zero year it could return the first year instead of an exception, in the same way as "MM" and "DD" do it... > And I quite agree with Robert that it's insane > to consider '-2-06-01' as satisfying the format 'YYYY-MM-DD'. The > fact that it even appears to do something related to a BC year is > an implementation artifact, and not a very nice one. > > I would be in favor of a ground-up rewrite of to_date and friends, with > some better-stated principles (in particular, a rationale why they even > exist when date_in and friends usually do it better) I think they exist because date_in can't convert something like "IYYY-IDDD" (I wonder if date_in can do so) or to parse dates/stamps independently of the "DateStyle" parameter. > and crisper error > detection. But I'm not seeing the argument that hacking at the margins > like this moves us forward on either point; what it does do is create > another backward-compatibility hazard for any such rewrite. > > In short, I vote with Robert to reject this patch. Accepted. Let's agree it is a case of "garbage in, garbage out" and "an implementation artifact". 
> BTW, the context for the original report wasn't clear, The context was to make "extract" and "to_date"/"to_timestamp" consistently reversible for "year"/"YYYY" for negative values (since both of them support negative values). > but I wonder how > much of the actual problem could be addressed by teaching make_date() > and friends to accept negative year values as meaning BC. > > regards, tom lane Thanks to Thomas, Robert and Tom for an interesting (but short) discussion. -- Best regards, Vitaly Burovoy
Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp
On 3/11/16, Robert Haas wrote: > On Sun, Feb 28, 2016 at 9:38 PM, Vitaly Burovoy > wrote: >>> However, I'm not sure we ought to tinker with the behavior in this >>> area. If YYYY-MM-DD is going to accept things that are not of the >>> format YYYY-MM-DD, and I'd argue that -1-06-01 is not in that format, >> >> It is not about format, it is about values. > > I disagree. In a format like "-1-06-01", you want the first minus to > indicate negation and the other two to be a separator. That's not > very far away from wanting the database to read your mind. It is not my wish. The database already does it now: postgres=# SELECT to_date('-1-06-01', 'YYYY'); to_date --- 0002-01-01 BC (1 row) >> Because it is inconvenient a little. If one value ("-2345") is passed, >> another one ("2346 BC") is got. In the other case a programmer must >> check for negative value, and if so change a sign and add "BC" to the >> format. Moreover the programmer must keep in mind that it is not >> enough to have usual date format "DD/MM/YYYY", because sometimes there >> can be "BC" part. > > Yeah, well, that's life. You can write an alternative function to > construct dates that works the way you like, and that may well be a > good idea. But I think *this* change is not a good idea, and > accordingly I vote we reject this patch. My wish is to make the behavior consistent. Since there are two reverse functions ("extract" and "to_date" ["to_timestamp" in fact is the same]), I expect that what is described as "year" ("year"-"YYYY") means the same thing in both of them, the same with the pairs "isoyear"-"IYYY", "dow"-"DDD", "isodow"-"IDDD", etc. 
Now "year" is _not_ the same as "" (but it cat be so according to the documentation: there is no mentioning of any ISO standard), whereas "isoyear" _is_ the same: postgres=# SELECT y, to_date(y, ''),to_date(y, 'IYYY')IYYY postgres-# FROM(VALUES('-1-06-01'))t(y); y | | iyyy --+---+--- -1-06-01 | 0002-01-01 BC | 0002-01-01 BC (1 row) and postgres=# SELECT y, date_part('year', y),date_part('isoyear', y)IYYY postgres-# FROM(VALUES('0002-06-01 BC'::date))t(y); y | | iyyy ---+--+-- 0002-06-01 BC | -2 | -1 (1 row) P.S.: proposed patch changes IYYY as well, but it is easy to fix it and I'm ready to do it after finding a consensus. -- Best regards, Vitaly Burovoy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATH] Jsonb, insert a new value into an array at arbitrary position
On 2016-02-29 17:19+00, Dmitry Dolgov <9erthali...@gmail.com> wrote: On 2016-02-24 19:37+00, Petr Jelinek wrote: >> Also this patch needs documentation. > I've added new version in attachments, thanks. Hello! The first pass of a review is below. 1. Rename the "flag" variable to something more meaningful (e.g. "op_type" - "operation_type"). 2. There are two extra spaces after the "_DELETE" word +#define JB_PATH_DELETE 0x0002 3. res = setPath(&it, path_elems, path_nulls, path_len, &st, - 0, newval, create); + 0, newval, create ? JB_PATH_CREATE : 0x0); What is the magic constant "0x0"? If not "create", then what? Introduce something like JB_PATH_NOOP = 0x0... 4. In the "setPathArray" the block: if (newval != NULL) "newval == NULL" means "to be deleted" by the previous convention and by the comments for the "setPath" function. Since you've introduced special constants, one of which is JB_PATH_DELETE (which is not used now), you should switch the convention from checking for a value to checking for a constant: if (flag != JB_PATH_DELETE) or even better: if (!(flag & JB_PATH_DELETE)) 5. Change checks for equality (to bit constants) to bit operations: (flag == JB_PATH_INSERT_BEFORE || flag == JB_PATH_INSERT_AFTER) which can be replaced with: (flag & (JB_PATH_INSERT_BEFORE | JB_PATH_INSERT_AFTER)) also here: (flag == JB_PATH_CREATE || flag == JB_PATH_INSERT_BEFORE || flag == JB_PATH_INSERT_AFTER)) can be: (flag & (JB_PATH_CREATE | JB_PATH_INSERT_BEFORE | JB_PATH_INSERT_AFTER)) 6. Pay attention to parentheses to make the code look like the similar code around it: + if ((npairs == 0) && flag == JB_PATH_CREATE && (level == path_len - 1)) should be: + if ((npairs == 0) && (flag == JB_PATH_CREATE) && (level == path_len - 1)) 7. Why did you remove "skip"? It is a comment explaining what "true" means... - r = JsonbIteratorNext(it, &v, true);/* skip */ + r = JsonbIteratorNext(it, &v, true); 8. After your changes some statements exceed the 80-column threshold... 
The same rules apply to the documentation. 9. And finally... it does not work as expected in the case of: postgres=# select jsonb_insert('{"a":[0,1,2,3]}', '{"a", 10}', '"4"'); jsonb_insert - {"a": [0, 1, 2, 3]} (1 row) whereas jsonb_set works: postgres=# select jsonb_set('{"a":[0,1,2,3]}', '{"a", 10}', '"4"'); jsonb_set -- {"a": [0, 1, 2, 3, "4"]} (1 row) -- Best regards, Vitaly Burovoy
Re: [HACKERS][PATCH] Supporting +-Infinity values by to_timestamp(float8)
On 3/4/16, Anastasia Lubennikova wrote: > 27.02.2016 09:57, Vitaly Burovoy: >> Hello, Hackers! >> >> I worked on a patch[1] allows "EXTRACT(epoch FROM >> +-Inf::timestamp[tz])" to return "+-Inf::float8". >> There is an opposite function "to_timestamp(float8)" which now defined >> as: >> SELECT ('epoch'::timestamptz + $1 * '1 second'::interval) > > Hi, > thank you for the patches. Thank you for the review. > Could you explain, whether they depend on each other? Only logically. They reverse each other: postgres=# SELECT v, to_timestamp(v), extract(epoch FROM to_timestamp(v)) FROM postgres-# unnest(ARRAY['+inf', '-inf', 0, 65536, 982384720.12]::float8[]) v; v | to_timestamp| date_part --+---+-- Infinity | infinity | Infinity -Infinity | -infinity |-Infinity 0 | 1970-01-01 00:00:00+00|0 65536 | 1970-01-01 18:12:16+00|65536 982384720.12 | 2001-02-17 04:38:40.12+00 | 982384720.12 (5 rows) >> Since intervals do not support infinity values, it is impossible to do >> something like: >> >> SELECT to_timestamp('infinity'::float8); >> >> ... which is not good. >> >> Supporting of such converting is in the TODO list[2] (by "converting >> between infinity timestamp and float8"). > > You mention intervals here, and TODO item definitely says about > 'infinity' interval, Yes, it is in the same block. But I wanted to point to the link "converting between infinity timestamp and float8". There are two-way conversion examples. > while patch and all the following discussion concerns to timestamps. > Is it a typo or I misunderstood something important? It is just a reason why I rewrote it as an internal function. I asked whether to just rewrite the function "pg_catalog.to_timestamp(float8)" as an internal one or to add support of infinite intervals. Tom Lane answered[5] "you should stay away from infinite intervals". So I left intervals as is. > I assumed that following query will work, but it isn't. Could you > clarify that? > select to_timestamp('infinity'::interval); It is not hard. 
There is no logical way to convert interval (e.g. "5minutes") to a timestamp (or date). There never was a function "to_timestamp(interval)" and never will be. postgres=# select to_timestamp('5min'::interval); ERROR: function to_timestamp(interval) does not exist LINE 1: select to_timestamp('1min'::interval); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. >> Proposed patch implements it. >> >> There is an other patch in the CF[3] 2016-03 implements checking of >> timestamp[tz] for being in allowed range. Since it is wise to set >> (fix) the upper boundary of timestamp[tz]s, I've included the file >> "src/include/datatype/timestamp.h" from there to check that an input >> value and a result are in the allowed range. >> >> There is no changes in a documentation because allowed range is the >> same as officially supported[4] (i.e. until 294277 AD). > > I think that you should update documentation. At least description of > epoch on this page: > http://www.postgresql.org/docs/devel/static/functions-datetime.html Thank you very much for pointing where it is located (I saw only "to_timestamp(TEXT, TEXT)"). I'll think how to update it. > More thoughts about the patch: > > 1. When I copy value from hints for min and max values (see examples > below), it works fine for min, while max still leads to error. > It comes from the check "if (seconds >= epoch_ubound)". I wonder, > whether you should change hint message? > > select to_timestamp(-210866803200.00); >to_timestamp > - > 4714-11-24 02:30:17+02:30:17 BC > (1 row) > > > select to_timestamp(9224318016000.00); > ERROR: UNIX epoch out of range: "9224318016000.00" > HINT: Maximal UNIX epoch value is "9224318016000.00" I agree, it is a little confusing. Do you (or anyone) know how to construct a condensed phrase that it is an exclusive upper bound of an allowed UNIX epoch range? > 2. 
There is a comment about JULIAN_MAXYEAR inaccuracy in timestamp.h: > > * IS_VALID_JULIAN checks the minimum date exactly, but is a bit sloppy > * about the maximum, since it's far enough out to not be especially > * interesting. It is just about the accuracy to the day for a lower
Re: [HACKERS][PATCH] Supporting +-Infinity values by to_timestamp(float8)
On 2/26/16, Vitaly Burovoy wrote: > Proposed patch implements it. I'm sorry, I forgot to leave a note for reviewers and committers: This patch requires CATALOG_VERSION_NO to be bumped. Since a pg_proc.h entry has changed, it is important to check and run the regression tests on a new cluster (as if CATALOG_VERSION_NO had been bumped). -- Best regards, Vitaly Burovoy
Re: [HACKERS] jsonb array-style subscription
On 1/19/16, Alvaro Herrera wrote: > Dmitry Dolgov wrote: > >> I've cleaned up the code, created a separate JsonbRef node (and there are >> a >> lot of small changes because of that), abandoned an idea of "deep >> nesting" >> of assignments (because it doesn't relate to jsonb subscription, is more >> about the >> "jsonb_set" function, and anyway it's not a good idea). It looks fine for >> me, and I need a little guidance - is it ok to propose this feature for >> commitfest 2016-03 for a review? > > Has this patch been proposed in some commitfest previously? One of the > less-commonly-invoked rules of commitfests is that you can't add patches > that are too invasive to the last one -- so your last chance for 9.6 was > 2016-01. This is harsh to patch submitters, but it helps keep the size > of the last commitfest down to a reasonable level; otherwise we are > never able to finish it. I'd like to be a reviewer for the patch. It does not look big or very invasive. Is that a final decision, or does it have a chance? If something in it bothers committers, it can end up as "Rejected with feedback" (since the patch is already in the CF[1])? [1]https://commitfest.postgresql.org/9/485/ -- Best regards, Vitaly Burovoy
Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp
On 2/27/16, Robert Haas wrote: > On Tue, Feb 23, 2016 at 6:23 AM, Thomas Munro > wrote: >> This seems to be a messy topic. The usage of "AD" and "BC" imply that >> TO_DATE is using the anno domini system which doesn't have a year 0, >> but in the DATE type perhaps we are using the ISO 8601 model[2] where >> 1 BC is represented as 0000, leading to the difference of one in all >> years before 1 AD? > > Well, the way to figure that out, I think, is to look at the > documentation. I had a look at... > > http://www.postgresql.org/docs/9.5/static/functions-formatting.html > > ...which says... > > YYYY year (4 or more digits) > IYYY ISO 8601 week-numbering year (4 or more digits) > > I don't really understand ISO 8601, but if IYYY is giving us an ISO > 8601 thing, then presumably YYYY is not supposed to be giving us that. > The same page elsewhere refers to Gregorian dates, and other parts > of the documentation seem to agree that's what we use. > > But having said that, this is kind of a weird situation. We're > talking about this: > > rhaas=# SELECT y || '-06-01', to_date (y || '-06-01', 'YYYY-MM-DD') > FROM (VALUES (2), (1), (0), (-1), (-2)) t(y); > ?column? |to_date > --+--- > 2-06-01 | 0002-06-01 > 1-06-01 | 0001-06-01 > 0-06-01 | 0001-06-01 BC > -1-06-01 | 0002-06-01 BC > -2-06-01 | 0003-06-01 BC > (5 rows) > > Now, I would be tempted to argue that passing to_date('-1-06-01', > 'YYYY-MM-DD') ought to do the same thing as to_date('pickle', > 'YYYY-MM-DD') i.e. throw an error. There's all kinds of what seems to > me to be shoddy error checking in this area: > > rhaas=# select to_date('-3', 'YYYY:MM&DD'); > to_date > --- > 0004-01-01 BC > (1 row) > > It's pretty hard for me to swallow the argument that the input matches > the provided format. > > However, I'm not sure we ought to tinker with the behavior in this > area. If YYYY-MM-DD is going to accept things that are not of the > format YYYY-MM-DD, and I'd argue that -1-06-01 is not in that format, It is not about format, it is about values. 
> then I think it should probably keep doing the same things it's always > done. If you want to supply a BC date, why not do Because it is a little inconvenient. If one value ("-2345") is passed, another one ("2346 BC") comes back. In the other case, the programmer must check for a negative value and, if there is one, change the sign and add "BC" to the format. Moreover, the programmer must keep in mind that it is not enough to have the usual date format "DD/MM/YYYY", because sometimes there can be a "BC" part. > this: > > rhaas=# select to_date('0001-06-01 BC', 'YYYY-MM-DD BC'); > to_date > --- > 0001-06-01 BC > (1 row) Also because of: postgres=# SELECT EXTRACT(year FROM to_date('-3', 'YYYY')); date_part --- -4 (1 row) Note that the documentation[1] says "Keep in mind there is no 0 AD". More examples are at the link[2]. > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company [1]http://www.postgresql.org/docs/devel/static/functions-datetime.html [2]http://www.postgresql.org/message-id/cakoswnn6kpfabmrshzyn8+2nhpyfvbdmphya5pqguny8lp2...@mail.gmail.com -- Best regards, Vitaly Burovoy
Re: [HACKERS][PATCH] Supporting +-Infinity values by to_timestamp(float8)
Added to the CF 2016-03: https://commitfest.postgresql.org/9/546/ -- Best regards, Vitaly Burovoy
[HACKERS][PATCH] Supporting +-Infinity values by to_timestamp(float8)
Hello, Hackers! I worked on a patch[1] that allows "EXTRACT(epoch FROM +-Inf::timestamp[tz])" to return "+-Inf::float8". There is an opposite function "to_timestamp(float8)" which is currently defined as: SELECT ('epoch'::timestamptz + $1 * '1 second'::interval) Since intervals do not support infinity values, it is impossible to do something like: SELECT to_timestamp('infinity'::float8); ... which is not good. Support for such a conversion is in the TODO list[2] (under "converting between infinity timestamp and float8"). The proposed patch implements it. There is another patch in the 2016-03 CF[3] that implements checking whether a timestamp[tz] is in the allowed range. Since it is wise to fix the upper boundary of timestamp[tz]s, I've included the file "src/include/datatype/timestamp.h" from there to check that the input value and the result are in the allowed range. There are no changes in the documentation because the allowed range is the same as the officially supported one[4] (i.e. until 294277 AD). [1]http://git.postgresql.org/pg/commitdiff/647d87c56ab6da70adb753c08d7cdf7ee905ea8a [2]https://wiki.postgresql.org/wiki/Todo#Dates_and_Times [3]https://commitfest.postgresql.org/9/540/ [4]http://www.postgresql.org/docs/devel/static/datatype-datetime.html -- Best regards, Vitaly Burovoy to_timestamp_infs.v001.patch Description: Binary data
Re: [HACKERS] [PATH] Correct negative/zero year in to_date/to_timestamp
On 2/26/16, Shulgin, Oleksandr wrote: > On Fri, Feb 26, 2016 at 3:24 PM, Ivan Kartyshov > > wrote: > >> The following review has been posted through the commitfest application: > >> make installcheck-world: tested, failed >> Implements feature: tested, failed >> Spec compliant: tested, failed >> Documentation:tested, failed >> >> Applied this patch, it works well, make what it expected correctly, code >> style is maintained. Regression tests passed OK. No documentation or >> comments. >> > > Why does it say "tested, failed" for all points above there? ;-) I hope Ivan misunderstood that "tested" and "passed" are two different options, not a single "tested, passed" ;-) Unfortunately, it seems there should be a discussion about the meaning of "YYYY": it seems the authors made it ISO8601-compliant (but there are still several bugs), but there is no proof of that in the documentation[1]. On the one hand, only "year" is mentioned for "YYYY", "YYY", etc., and the "ISO 8601 ... year" is "week-numbering", i.e. "IYYY", "IYY", etc. only for use with "IDDD", "ID" and "IW". On the other hand, "extract" has two different options: "year" and "isoyear", and only the second one is ISO8601-compliant (moreover, it is week-numbering at the same time): postgres=# SELECT y src, EXTRACT(year FROM y) AS year, EXTRACT(isoyear FROM y) AS isoyear postgres-# FROM unnest(ARRAY[ postgres(# '4713-01-01 BC','4714-12-31 BC','4714-12-29 BC','4714-12-28 BC']::date[]) y; src | year | isoyear ---+---+- 4713-01-01 BC | -4713 | -4712 4714-12-31 BC | -4714 | -4712 4714-12-29 BC | -4714 | -4712 4714-12-28 BC | -4714 | -4713 (4 rows) So there is a lack of consistency, and something should be changed: either "extract(year..." (and the documentation[2]; but "Keep in mind there is no 0 AD, ..." has been there for a long time, so that change breaks compatibility, and the patch would be completely different), or the patch (it has an influence on "IYYY", so it is obviously wrong). 
Now (after the letter[3] from Thomas Munro) I know the patch is not ready for committing even with minimal changes. But I'm waiting for a discussion: which part should be changed? I would change the behavior of "to_date" and "to_timestamp" to match the extract options "year"/"isoyear", but I think getting a decision from the community before modifying the patch is a better idea. [1]http://www.postgresql.org/docs/devel/static/functions-formatting.html [2]http://www.postgresql.org/docs/devel/static/functions-datetime.html [3]http://www.postgresql.org/message-id/CAEepm=0aznvy+frtyni04imdw4tlgzaelljqmhmcjbre+ln...@mail.gmail.com -- Best regards, Vitaly Burovoy