Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-03 Thread Tom Lane
Matthias van de Meent  writes:
> I don't quite get what the hard problem is that we haven't already
> solved for other systems:
> We already can add additional constraints to domains (e.g. VALUE::int
> <> 4), which (according to docs) scan existing data columns for
> violations.

That's "solved" only for rather small values of "solved".  While the
code does try to look for violations of the new constraint, there is
no interlock against race conditions (ie, concurrent insertions of
a conflicting value).  It doesn't check temp tables belonging to
other backends, because it can't.  And IIRC it doesn't look for
instances in metadata, such as stored views.

The reason we've considered that Good Enough(TM) for domain
constraints is that if something does sneak through those loopholes,
nothing terribly surprising happens.  You've got a value there that
shouldn't be there, but that's mostly your own fault, and the
system continues to behave sanely.  Also you don't have any problem
introspecting what you've got, because such values will still print
normally.  Plus, we can't really guarantee that users won't get
into such a state in other ways, for example if their constraint
isn't really immutable.

The problem with dropping an enum value is that surprising things
might very well happen, because removal of the pg_enum row risks
comparisons failing if they involve the dropped value.  Thus for
example you might find yourself with a broken index that fails
all insertion and search attempts, even if you'd carefully removed
every user-visible instance of the doomed value: an instance of
it high up in the index tree will break most index accesses.
Even for values in user-visible places like views, the fact that
enum_out will fail doesn't make it any easier to figure out what
is wrong.

We might be able to get to a place where the surprise factor is
low enough to tolerate, but the domain-constraint precedent isn't
good enough for that IMO.

Andrew's idea of DISABLE rather than full DROP is one way of
ameliorating these problems: comparisons would still work, and
we can still print a value that perhaps shouldn't have been there.
But it's not without other problems.

> We already drop columns without rewriting the table to
> remove the column's data, and reject new data insertions for those
> still-in-the-catalogs-but-inaccessible columns.

Those cases don't seem to have a lot of connection to the enum problem.

> The only real issue that I can think of is making sure that concurrent
> backends don't modify this data, but that shouldn't be very different
> from the other locks we already have to take in e.g. ALTER TYPE ...
> DROP ATTRIBUTE.

I'd bet a good deal of money that those cases aren't too bulletproof.
We do not lock types simply because somebody has a value of the type
in flight somewhere in a query, and the cost of doing so would be
quite discouraging I fear.  On the whole, I'd rather accept the
idea that the DROP might not be completely watertight; but then
we have to work out the details of coping with orphaned values
in an acceptable way.

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-03 Thread Matthias van de Meent
On Tue, 3 Oct 2023 at 22:49, Tom Lane  wrote:
>
> Andrew Dunstan  writes:
> > On 2023-09-28 Th 14:46, Tom Lane wrote:
> >> We went through all these points years ago when the enum feature
> >> was first developed, as I recall.  Nobody thought that the ability
> >> to remove an enum value was worth the amount of complexity it'd
> >> entail.
>
> > That's quite true, and I accept my part in this history. But I'm not
> > sure we were correct back then.
>
> I think it was the right decision at the time, given that the
> alternative was to not add the enum feature at all.  The question
> is whether we're now prepared to do additional work to support DROP
> VALUE.  But the tradeoff still looks pretty grim, because the
> problems haven't gotten any easier.
>
> I've been trying to convince myself that there'd be some value in
> your idea about a DISABLE flag, but I feel like there's something
> missing there.  The easiest implementation would be to have
> enum_in() reject disabled values, while still allowing enum_out()
> to print them.  But that doesn't seem to lead to nice results:
>
> [...]
>
> On the whole this is still a long way from a clean easy-to-use DROP
> facility, and it adds a lot of complexity of its own for pg_dump.
> So I'm not sure we want to build it.

I don't quite get what the hard problem is that we haven't already
solved for other systems:
We already can add additional constraints to domains (e.g. VALUE::int
<> 4), which (according to docs) scan existing data columns for
violations. We already drop columns without rewriting the table to
remove the column's data, and reject new data insertions for those
still-in-the-catalogs-but-inaccessible columns.

So, if a user wants to drop an enum value, why couldn't we "just" use
the DOMAIN facilities and 1.) add a constraint WHERE value NOT IN
(deleted_values), and after validation of that constraint 2.) mark the
enum value as deleted like we do with table column's pg_attribute
entries?

The only real issue that I can think of is making sure that concurrent
backends don't modify this data, but that shouldn't be very different
from the other locks we already have to take in e.g. ALTER TYPE ...
DROP ATTRIBUTE.

Kind regards,

Matthias van de Meent




Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-03 Thread Tom Lane
Andrew Dunstan  writes:
> On 2023-09-28 Th 14:46, Tom Lane wrote:
>> We went through all these points years ago when the enum feature
>> was first developed, as I recall.  Nobody thought that the ability
>> to remove an enum value was worth the amount of complexity it'd
>> entail.

> That's quite true, and I accept my part in this history. But I'm not 
> sure we were correct back then.

I think it was the right decision at the time, given that the
alternative was to not add the enum feature at all.  The question
is whether we're now prepared to do additional work to support DROP
VALUE.  But the tradeoff still looks pretty grim, because the
problems haven't gotten any easier.

I've been trying to convince myself that there'd be some value in
your idea about a DISABLE flag, but I feel like there's something
missing there.  The easiest implementation would be to have
enum_in() reject disabled values, while still allowing enum_out()
to print them.  But that doesn't seem to lead to nice results:

* You couldn't do, say,
SELECT * FROM my_table WHERE enum_col = 'disabled_value'
to look for rows that you need to clean up.  I guess this'd work:
SELECT * FROM my_table WHERE enum_col::text = 'disabled_value'
but it's un-obvious and could not use an index on enum_col.

* If any of the disabled values remain, dump/restore would fail.
Maybe you'd want that to be sure you got rid of them, but it sounds
like a foot-gun.  ("What do you mean, our only backup doesn't
restore?")  Probably people would wish for two different behaviors:
either don't list disabled values at all in the dumped CREATE TYPE,
or do list them but disable them only after loading data.  The latter
approach will still have problems in data-only restores, but there are
comparable hazards with things like foreign keys.  (pg_upgrade would
need still a third behavior, perhaps.)

On the whole this is still a long way from a clean easy-to-use DROP
facility, and it adds a lot of complexity of its own for pg_dump.
So I'm not sure we want to build it.

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-03 Thread Vik Fearing

On 10/3/23 17:44, Tom Lane wrote:

Vik Fearing  writes:

On 10/2/23 20:07, Dagfinn Ilmari Mannsåker wrote:

FWIW I'm +1 on this patch,



Thanks.



and with Tom on dropping the "yet".  To me it
makes it sound like we intend to implement it soon (fsvo).



I am not fundamentally opposed to it, nor to any other wordsmithing the
committer (probably Tom) wants to do.  The main point of the patch is to
list at least some of the problems that need to be solved in a correct
implementation.


Pushed with a bit more work on the text.

I left out the regression test, as it seems like it'd add test cycles
to little purpose.  It won't do anything to improve the odds that
someone finds this text.


Thanks!
--
Vik Fearing





Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-03 Thread Tom Lane
Vik Fearing  writes:
> On 10/2/23 20:07, Dagfinn Ilmari Mannsåker wrote:
>> FWIW I'm +1 on this patch,

> Thanks.

>> and with Tom on dropping the "yet".  To me it
>> makes it sound like we intend to implement it soon (fsvo).

> I am not fundamentally opposed to it, nor to any other wordsmithing the 
> committer (probably Tom) wants to do.  The main point of the patch is to 
> list at least some of the problems that need to be solved in a correct 
> implementation.

Pushed with a bit more work on the text.

I left out the regression test, as it seems like it'd add test cycles
to little purpose.  It won't do anything to improve the odds that
someone finds this text.

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-02 Thread Vik Fearing

On 10/2/23 20:07, Dagfinn Ilmari Mannsåker wrote:

Vik Fearing  writes:


No one except you has said anything about this patch.  I think it would
be good to commit it, wordsmithing aside.


FWIW I'm +1 on this patch,


Thanks.


and with Tom on dropping the "yet".  To me it
makes it sound like we intend to implement it soon (fsvo).
I am not fundamentally opposed to it, nor to any other wordsmithing the 
committer (probably Tom) wants to do.  The main point of the patch is to 
list at least some of the problems that need to be solved in a correct 
implementation.

--
Vik Fearing





Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-02 Thread Dagfinn Ilmari Mannsåker
Vik Fearing  writes:

> On 9/29/23 03:17, Tom Lane wrote:
>> Vik Fearing  writes:
>>> On 9/28/23 20:46, Tom Lane wrote:
 We went through all these points years ago when the enum feature
 was first developed, as I recall.  Nobody thought that the ability
 to remove an enum value was worth the amount of complexity it'd
 entail.
>> 
>>> This issue comes up regularly (although far from often).  Do we want to
>>> put some comments right where would-be implementors would be sure to see it?
>> Perhaps.  I'd be kind of inclined to leave the "yet" out of "not yet
>> implemented" in the error message, as that wording sounds like we just
>> haven't got round to it.
>
> I see your point, but should we be dissuading people who might want to
> work on solving those problems?  I intentionally did not document that 
> this syntax exists so the only people seeing the message are those who
> just try it, and those wanting to write a patch like Danil did.
>
> No one except you has said anything about this patch.  I think it would
> be good to commit it, wordsmithing aside.

FWIW I'm +1 on this patch, and with Tom on dropping the "yet".  To me it
makes it sound like we intend to implement it soon (fsvo).

- ilmari




Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-02 Thread Vik Fearing

On 9/29/23 03:17, Tom Lane wrote:

Vik Fearing  writes:

On 9/28/23 20:46, Tom Lane wrote:

We went through all these points years ago when the enum feature
was first developed, as I recall.  Nobody thought that the ability
to remove an enum value was worth the amount of complexity it'd
entail.



This issue comes up regularly (although far from often).  Do we want to
put some comments right where would-be implementors would be sure to see it?


Perhaps.  I'd be kind of inclined to leave the "yet" out of "not yet
implemented" in the error message, as that wording sounds like we just
haven't got round to it.


I see your point, but should we be dissuading people who might want to 
work on solving those problems?  I intentionally did not document that 
this syntax exists so the only people seeing the message are those who 
just try it, and those wanting to write a patch like Danil did.


No one except you has said anything about this patch.  I think it would 
be good to commit it, wordsmithing aside.

--
Vik Fearing





Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-29 Thread Andrew Dunstan



On 2023-09-28 Th 14:46, Tom Lane wrote:

Andrew Dunstan  writes:

I wonder if we could have a boolean flag in pg_enum, indicating that
setting an enum to that value was forbidden.

Yeah, but that still offers no coherent solution to the problem of
what happens if there's a table that already contains such a value.
It doesn't seem terribly useful to forbid new entries if you can't
get rid of old ones.

Admittedly, a DISABLE flag would at least offer a chance at a
race-condition-free scan to verify that no such values remain
in tables.  But as somebody already mentioned upthread, that
wouldn't guarantee that the value doesn't appear in non-leaf
index pages.  So basically you could never get rid of the pg_enum
row, short of a full dump and restore.



or a reindex, I think, although getting the timing right would be messy. 
I agree the non-leaf index pages are rather pesky in dealing with this.


I guess the alternative would be to create a new enum with the 
to-be-deleted value missing, and then alter the column type to the new 
enum type. For massive tables that would be painful.





We went through all these points years ago when the enum feature
was first developed, as I recall.  Nobody thought that the ability
to remove an enum value was worth the amount of complexity it'd
entail.





That's quite true, and I accept my part in this history. But I'm not 
sure we were correct back then.



cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Tom Lane
Vik Fearing  writes:
> On 9/28/23 20:46, Tom Lane wrote:
>> We went through all these points years ago when the enum feature
>> was first developed, as I recall.  Nobody thought that the ability
>> to remove an enum value was worth the amount of complexity it'd
>> entail.

> This issue comes up regularly (although far from often).  Do we want to 
> put some comments right where would-be implementors would be sure to see it?

Perhaps.  I'd be kind of inclined to leave the "yet" out of "not yet
implemented" in the error message, as that wording sounds like we just
haven't got round to it.

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Vik Fearing

On 9/28/23 20:46, Tom Lane wrote:

Andrew Dunstan  writes:

I wonder if we could have a boolean flag in pg_enum, indicating that
setting an enum to that value was forbidden.


Yeah, but that still offers no coherent solution to the problem of
what happens if there's a table that already contains such a value.
It doesn't seem terribly useful to forbid new entries if you can't
get rid of old ones.

Admittedly, a DISABLE flag would at least offer a chance at a
race-condition-free scan to verify that no such values remain
in tables.  But as somebody already mentioned upthread, that
wouldn't guarantee that the value doesn't appear in non-leaf
index pages.  So basically you could never get rid of the pg_enum
row, short of a full dump and restore.

We went through all these points years ago when the enum feature
was first developed, as I recall.  Nobody thought that the ability
to remove an enum value was worth the amount of complexity it'd
entail.


This issue comes up regularly (although far from often).  Do we want to 
put some comments right where would-be implementors would be sure to see it?


Attached is an example of what I mean.  Documentation is intentionally 
omitted.

--
Vik Fearing
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7d2032885e..f8d70cdaa0 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -6404,6 +6404,29 @@ AlterEnumStmt:
 n->skipIfNewValExists = false;
 $$ = (Node *) n;
 			}
+		 | ALTER TYPE_P any_name DROP VALUE_P Sconst
+			{
+/*
+ * The following problems must be solved before this can be
+ * implemented:
+ *
+ * - There can be no data of this type using this value in
+ *   any table
+ *
+ * - The value may not appear in any non-leaf page of a
+ *   btree (and similar issues with other index types)
+ *
+ * - Concurrent sessions must not be able to insert this
+ *   value while the preceding conditions are being checked
+ *
+ * - Possibly more...
+ */
+
+ereport(ERROR,
+		(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		 errmsg("dropping an enum value is not yet implemented"),
+		 parser_errposition(@4)));
+			}
 		 ;
 
 opt_if_not_exists: IF_P NOT EXISTS  { $$ = true; }
diff --git a/src/test/regress/expected/enum.out b/src/test/regress/expected/enum.out
index 01159688e5..105ae7a2f9 100644
--- a/src/test/regress/expected/enum.out
+++ b/src/test/regress/expected/enum.out
@@ -605,6 +605,11 @@ ERROR:  "red" is not an existing enum label
 -- check that renaming to an existent value fails
 ALTER TYPE rainbow RENAME VALUE 'blue' TO 'green';
 ERROR:  enum label "green" already exists
+-- We still can't drop an enum value
+ALTER TYPE rainbow DROP VALUE 'green';
+ERROR:  dropping an enum value is not yet implemented
+LINE 1: ALTER TYPE rainbow DROP VALUE 'green';
+   ^
 --
 -- check transactional behaviour of ALTER TYPE ... ADD VALUE
 --
diff --git a/src/test/regress/sql/enum.sql b/src/test/regress/sql/enum.sql
index 93171379f2..a98df40ed8 100644
--- a/src/test/regress/sql/enum.sql
+++ b/src/test/regress/sql/enum.sql
@@ -276,6 +276,9 @@ ALTER TYPE rainbow RENAME VALUE 'red' TO 'crimson';
 -- check that renaming to an existent value fails
 ALTER TYPE rainbow RENAME VALUE 'blue' TO 'green';
 
+-- We still can't drop an enum value
+ALTER TYPE rainbow DROP VALUE 'green';
+
 --
 -- check transactional behaviour of ALTER TYPE ... ADD VALUE
 --


Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Tom Lane
Andrew Dunstan  writes:
> I wonder if we could have a boolean flag in pg_enum, indicating that 
> setting an enum to that value was forbidden.

Yeah, but that still offers no coherent solution to the problem of
what happens if there's a table that already contains such a value.
It doesn't seem terribly useful to forbid new entries if you can't
get rid of old ones.

Admittedly, a DISABLE flag would at least offer a chance at a
race-condition-free scan to verify that no such values remain
in tables.  But as somebody already mentioned upthread, that
wouldn't guarantee that the value doesn't appear in non-leaf
index pages.  So basically you could never get rid of the pg_enum
row, short of a full dump and restore.

We went through all these points years ago when the enum feature
was first developed, as I recall.  Nobody thought that the ability
to remove an enum value was worth the amount of complexity it'd
entail.

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Andrew Dunstan



On 2023-09-28 Th 10:28, Tom Lane wrote:

=?UTF-8?B?0JTQsNC90LjQuyDQodGC0L7Qu9C/0L7QstGB0LrQuNGF?= 
 writes:

I would like to offer my patch on the problem of removing values from enums
It adds support for expression ALTER TYPE  DROP VALUE


This does not fix any of the hard problems that caused us not to
have such a feature to begin with.  Notably, what happens to
stored data of the enum type if it is a now-deleted value?





I wonder if we could have a boolean flag in pg_enum, indicating that 
setting an enum to that value was forbidden. That wouldn't delete the 
value but it wouldn't show up in enum_range and friends. We'd have to 
teach pg_dump and pg_upgrade to deal with it, but that shouldn't be too 
hard.



Perhaps the command could be something like


ALTER TYPE enum_name DISABLE value;


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Tom Lane
=?UTF-8?B?0JTQsNC90LjQuyDQodGC0L7Qu9C/0L7QstGB0LrQuNGF?= 
 writes:
> I would like to offer my patch on the problem of removing values from enums
> It adds support for expression ALTER TYPE  DROP VALUE
> 

This does not fix any of the hard problems that caused us not to
have such a feature to begin with.  Notably, what happens to
stored data of the enum type if it is a now-deleted value?

regards, tom lane




Re: Allow deleting enumerated values from an existing enumerated data type

2023-09-28 Thread Vik Fearing

On 9/28/23 14:13, Данил Столповских wrote:

Greetings, everyone!
I would like to offer my patch on the problem of removing values from enums

It adds support for expression ALTER TYPE  DROP VALUE


Added:
1. expression in grammar
2. function to drop enum values
3. regression tests
4. documentation


Thanks for this patch that a lot of people want.

However, it does not seem to address the issue of how to handle the 
dropped value being in the high key of an index.  Until we solve that 
problem (and maybe others), this kind of patch is insufficient to add 
the feature.

--
Vik Fearing