On Dec 29, 2010, at 7:56 AM, Noah Misch <n...@leadboat.com> wrote:
> ALTER TABLE ALTER TYPE always rewrites the table heap and its indexes.  In 
> some
> cases, we can determine that doing so is unhelpful, and that the conversion
> shall always succeed:
> 
> CREATE DOMAIN loosedom AS text;
> CREATE TABLE t (c varchar(2));
> ALTER TABLE t ALTER c TYPE varchar(4);
> ALTER TABLE t ALTER c TYPE text;
> ALTER TABLE t ALTER c TYPE loosedom;
> 
> In other cases, we can determine that the rewrite is unhelpful, but a cast 
> could
> still throw an error:
> 
> CREATE DOMAIN tightdom AS text CHECK (value LIKE '<%/>');
> CREATE TABLE t (c text);
> ALTER TABLE t ALTER c TYPE xml USING c::xml;
> ALTER TABLE t ALTER c TYPE varchar(64);
> ALTER TABLE t ALTER c TYPE tightdom;
> 
> I wish to replace table rewrites with table verification scans where possible,
> then skip those verification scans where possible.

Seems like a good idea.

> Having thought on it
> more, though, it actually seems best to attempt the verification scan *every*
> time.  In most ineligible conversions, an inequality will appear very early, 
> so
> the scan is effectively O(1) in the negative case.  A notable exception is
> something like char(6)->varchar(6) in a table with a billion tuples having
> length(col) = 6 and one with length(col) = 5.  The verification scan might 
> read
> most of the table before finding the one tuple that forces a rewrite. That
> isn't a particularly regular scenario in my experience, so the "just do the
> right thing" aspect of preceding every potential rewrite with a verification
> scan seems to win out.

I think this scenario will be more common than you might think.  Tables don't 
contain random data; they contain data that the DBA thinks is valid.  The 
situation where the data is mostly as you expect but with a few kooky rows is, 
in my experience, extremely common.  And it makes the worst case a LOT worse.  
I really doubt this is worth the complexity anyway - converting between two 
types that are mostly-but-not-quite binary compatible seems like an edge case 
in every sense.

> Certain very popular type changes (see introduction) can be _exempt_ from the
> verification scan: we can determine that they will always succeed.  To capture
> that, I propose extending CREATE CAST with the notion of an exemptor function:
> 
> CREATE CAST (source_type AS target_type)
>    { WITH FUNCTION function_name (argument_type [, ...])
>          [ WITH EXEMPTOR function_name ] |
>      WITHOUT FUNCTION |
>      WITH INOUT }
>    [ AS ASSIGNMENT | AS IMPLICIT ]
> 
> The exemptor shall have this signature:
> 
> exemptor_func(
>    integer, -- source_typmod
>    integer  -- dest_typmod
> ) RETURNS boolean
> 
> The exemptor shall return true iff datumIsEqual(x, 
> x::target_type(dest_typmod))
> for every x in source_type most recently coerced to 
> source_type(source_typmod).
> When the cast is WITHOUT FUNCTION (a binary coercion) and target_type lacks a
> length coercion cast, the cast has an implicit exemption, and an exemptor is
> superfluous: code can assume an exemptor that always returns true.  Use of 
> WITH
> EXEMPTOR mainly makes sense alongside multi-arg WITH FUNCTION (should probably
> raise an error upon other uses).

I am not sure whether it's worth trying to be general here. Maybe we should 
just hard-code the known cases involving core datatypes.

> 

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to