Re: Obtaining a more consistent view definition when a UNION subquery contains undecorated constants

2018-10-02 Thread Jacob Champion
On Thu, Sep 27, 2018 at 3:38 PM Tom Lane  wrote:
> Jimmy Yih  writes:
> > Looking at the internal code (mostly get_from_clause_item() function), we
> > saw that when a subquery is used, there is no tuple descriptor passed to
> > get_query_def() function. Because of this, get_target_list() uses the
> > resnames obtained from the pg_rewrite entry's ev_action field.  However, it
> > seems to be fairly simple to construct a dummy tuple descriptor from the
> > ev_action to pass down the call stack so that the column names will be
> > consistent when deparsing both T_A_Const and T_TypeCast parse tree nodes
> > involving a UNION.  Attached is a patch that demonstrates this.
>
> I'm afraid that this just moves the weird cases somewhere else, ie
> you might see an AS clause where you had none before, or a different
> AS clause from what you originally wrote.  The parser has emitted the
> same parse tree as if there were an explicit AS there; I doubt that
> we want ruleutils to second-guess that unless it really has to.

Can you give a quick example of something that "breaks" with this
approach? I think we're having trouble seeing it.

It might help to have some additional context on why we care: this
problem shows up in pg_upgrade testing, since we're basically
performing the following steps:
- create a view in the old database
- dump the old database schema
- load the schema into the new database
- dump the new database schema and compare to the old one

So from this perspective, we don't mind so much if the view definition
changes between creation and dump, but we do mind if it changes a
second time after the dump has been restored, since it shows up as a
false negative in the diff. In other words, we'd like the dumped view
definitions to be "stable" with respect to dumps and restores.

Thanks,
--Jacob



Re: Obtaining a more consistent view definition when a UNION subquery contains undecorated constants

2018-09-27 Thread Tom Lane
Jimmy Yih  writes:
> A colleague and I were playing around with dumping views and found an
> inconsistency for a view definition involving subqueries and undecorated
> constants in a UNION.  When we took that view definition and restored it,
> dumping the view gave different syntax again.  Although the slightly
> different view definitions were semantically the same, it was weird to see
> the syntax difference.

Yeah, this is a consequence of deciding to emit the explicit cast even
though there was none there to begin with.  In many cases we have to do
that to ensure that the view will re-parse with the same semantics, but
I wonder if it would be possible to suppress it for top-level SELECT
list items.  That would get rid of both stages of the apparent view
mutation not just one.  The question that requires thought is whether
the view could ever re-parse differently, but offhand it seems like it
could not: there's only one rule for what an implicitly typed constant
could become in that situation.

> Looking at the internal code (mostly get_from_clause_item() function), we
> saw that when a subquery is used, there is no tuple descriptor passed to
> get_query_def() function. Because of this, get_target_list() uses the
> resnames obtained from the pg_rewrite entry's ev_action field.  However, it
> seems to be fairly simple to construct a dummy tuple descriptor from the
> ev_action to pass down the call stack so that the column names will be
> consistent when deparsing both T_A_Const and T_TypeCast parse tree nodes
> involving a UNION.  Attached is a patch that demonstrates this.

I'm afraid that this just moves the weird cases somewhere else, ie
you might see an AS clause where you had none before, or a different
AS clause from what you originally wrote.  The parser has emitted the
same parse tree as if there were an explicit AS there; I doubt that
we want ruleutils to second-guess that unless it really has to.

regards, tom lane



Obtaining a more consistent view definition when a UNION subquery contains undecorated constants

2018-09-27 Thread Jimmy Yih
Hello,

A colleague and I were playing around with dumping views and found an
inconsistency for a view definition involving subqueries and undecorated
constants in a UNION.  When we took that view definition and restored it,
dumping the view gave different syntax again.  Although the slightly
different view definitions were semantically the same, it was weird to see
the syntax difference.

Our view SQL where 'bar' constant is not decorated:

CREATE VIEW fooview AS SELECT t.a FROM (SELECT 1 AS a, 'foo'::text AS baz
UNION SELECT 0 AS a, 'bar')t;
// view definition from pg_get_viewdef()
 SELECT t.a
   FROM ( SELECT 1 AS a,
'foo'::text AS baz
UNION
 SELECT 0 AS a,
'bar'::text) t;

Note that the type decorator is appended to 'bar' in the normal fashion.

Then taking the above view definition, and creating a view,

CREATE VIEW fooview AS SELECT t.a FROM (SELECT 1 AS a, 'foo'::text AS baz
UNION SELECT 0 AS a, 'bar'::text)t;
// view definition from pg_get_viewdef()
 SELECT t.a
   FROM ( SELECT 1 AS a,
'foo'::text AS baz
UNION
 SELECT 0 AS a,
'bar'::text AS text) t;

results in a view definition that has the alias 'AS text' appended to
'bar'::text.

Contrast this to creating a view without the subquery:

CREATE OR REPLACE VIEW fooview AS SELECT 1 AS a, 'foo'::text AS baz UNION
SELECT 0 AS a, 'bar';
// view definition from pg_get_viewdef()
 SELECT 1 AS a,
'foo'::text AS baz
UNION
 SELECT 0 AS a,
'bar'::text AS baz;

We see that this view will use the view's tuple descriptor to name the
columns.

Looking at the internal code (mostly get_from_clause_item() function), we
saw that when a subquery is used, there is no tuple descriptor passed to
get_query_def() function. Because of this, get_target_list() uses the
resnames obtained from the pg_rewrite entry's ev_action field.  However, it
seems to be fairly simple to construct a dummy tuple descriptor from the
ev_action to pass down the call stack so that the column names will be
consistent when deparsing both T_A_Const and T_TypeCast parse tree nodes
involving a UNION.  Attached is a patch that demonstrates this.

Running with the attached patch, it seems to work pretty well:

postgres=# CREATE VIEW fooview AS SELECT t.a FROM (SELECT 1 AS a,
'foo'::text AS baz UNION SELECT 0 AS a, 'bar')t;
postgres=# select pg_get_viewdef('fooview');
   pg_get_viewdef

  SELECT t.a
FROM ( SELECT 1 AS a,
 'foo'::text AS baz
 UNION
  SELECT 0 AS a,
 'bar'::text AS baz) t;
(1 row)

postgres=# CREATE VIEW fooview AS SELECT t.a FROM (SELECT 1 AS a,
'foo'::text AS baz UNION SELECT 0 AS a, 'bar'::text)t;
postgres=# select pg_get_viewdef('fooview');
   pg_get_viewdef

  SELECT t.a
FROM ( SELECT 1 AS a,
 'foo'::text AS baz
 UNION
  SELECT 0 AS a,
 'bar'::text AS baz) t;
(1 row)

Nested subqueries also work with the patch.  We're not sure how this could
break.

Is this an acceptable change that should be pursued?

Regards,
Jimmy Yih and Jim Doty


view_union_subquery_constant.patch
Description: Binary data