Re: Making the subquery alias optional in the FROM clause
Dean Rasheed writes: > On Mon, 2 Oct 2023 at 01:01, Tom Lane wrote: >> Yeah, that's exposing more of the implementation than we really want. > Note that this isn't a new issue, specific to unnamed subqueries. The > same thing happens for unnamed joins: True, and we've had few complaints about that. Still, if we can clean it up without too much effort, let's do so. > So I think perhaps it's better to just have searchRangeTableForRel() > exclude these kinds of RTE, if they haven't been given an alias. Would we need a new flag in the ParseNamespaceItem data structure, or will the existing data serve? I see how to do this if we add a "doesn't really have a name" flag, but it's not clear to me that we can reliably identify them otherwise. Maybe a test involving the rtekind and whether the "alias" field is set would do, but that way seems a bit ugly. regards, tom lane
Re: Making the subquery alias optional in the FROM clause
On Mon, 2 Oct 2023 at 01:01, Tom Lane wrote: > > Erwin Brandstetter writes: > > On Mon, 2 Oct 2023 at 00:33, Dean Rasheed wrote: > >> The only place that exposes the eref's made-up relation name is the > >> existing query deparsing code in ruleutils.c, which uniquifies it and > >> generates SQL spec-compliant output. For example: > > > I ran into one other place: error messages. > > SELECT unnamed_subquery.a > > FROM (SELECT 1 AS a) > > ERROR: There is an entry for table "unnamed_subquery", but it cannot be > > referenced from this part of the query.invalid reference to FROM-clause > > entry for table "unnamed_subquery" > > Yeah, that's exposing more of the implementation than we really want. > Note that this isn't a new issue, specific to unnamed subqueries. The same thing happens for unnamed joins: create table foo(a int); create table bar(a int); select unnamed_join.a from foo join bar using (a); ERROR: invalid reference to FROM-clause entry for table "unnamed_join" LINE 1: select unnamed_join.a from foo join bar using (a); ^ DETAIL: There is an entry for table "unnamed_join", but it cannot be referenced from this part of the query. And there's a similar problem with VALUES RTEs: insert into foo values (1),(2) returning "*VALUES*".a; ERROR: invalid reference to FROM-clause entry for table "*VALUES*" LINE 1: insert into foo values (1),(2) returning "*VALUES*".a; ^ DETAIL: There is an entry for table "*VALUES*", but it cannot be referenced from this part of the query. > I'm inclined to think we should avoid letting "unnamed_subquery" > appear in the parse tree, too. It might not be a good idea to > try to leave the eref field null, but could we set it to an > empty string instead, that is > > - eref = alias ? copyObject(alias) : makeAlias("unnamed_subquery", NIL); > + eref = alias ? copyObject(alias) : makeAlias("", NIL); > > and then let ruleutils replace that with "unnamed_subquery"? Hmm, I think that there would be other side-effects if we did that -- at least doing it for VALUES RTEs would also require additional changes to retain current EXPLAIN output. I think perhaps it would be better to try for a more targeted fix of the parser error reporting. In searchRangeTableForRel() we try to find any RTE that could possibly match the RangeVar, but certain kinds of RTE don't naturally have names, and if they also haven't been given aliases, then they can't possibly match anywhere in the query (and thus it's misleading to report that they can't be referred to from specific places). So I think perhaps it's better to just have searchRangeTableForRel() exclude these kinds of RTE, if they haven't been given an alias. Regards, Dean diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c new file mode 100644 index 864ea9b..0afe177 --- a/src/backend/parser/parse_relation.c +++ b/src/backend/parser/parse_relation.c @@ -395,6 +395,20 @@ searchRangeTableForRel(ParseState *pstat { RangeTblEntry *rte = (RangeTblEntry *) lfirst(l); + /* + * Ignore RTEs that do not automatically take their names from + * database objects, and have not been supplied with aliases + * (e.g., unnamed joins). Such RTEs cannot possibly match the + * RangeVar, and cannot be referenced from anywhere in the query. + * Thus, it would be misleading to complain that they can't be + * referenced from particular parts of the query. + */ + if (!rte->alias && +(rte->rtekind == RTE_SUBQUERY || + rte->rtekind == RTE_JOIN || + rte->rtekind == RTE_VALUES)) +continue; + if (rte->rtekind == RTE_RELATION && OidIsValid(relId) && rte->relid == relId) diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out new file mode 100644 index 9b8638f..2bbf3bf --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -2895,6 +2895,26 @@ select bar.*, unnamed_join.* from ERROR: FOR UPDATE cannot be applied to a join LINE 3: for update of bar; ^ +-- Test error reporting of references to internal aliases +select unnamed_join.* from + t1 join t2 on (t1.a = t2.a); +ERROR: missing FROM-clause entry for table "unnamed_join" +LINE 1: select unnamed_join.* from + ^ +select unnamed_join.* from + (select * from t1 join t2 using (a)) ss join t3 on (t3.x = ss.a); +ERROR: missing FROM-clause entry for table "unnamed_join" +LINE 1: select unnamed_join.* from + ^ +select unnamed_subquery.* from + (select * from t1); +ERROR: missing FROM-clause entry for table "unnamed_subquery" +LINE 1: select unnamed_subquery.* from + ^ +insert into t1 values (1, 10), (2, 20) returning "*VALUES*".*; +ERROR: missing FROM-clause entry for table "*VALUES*" +LINE 1: insert into t1 values (1, 10), (2, 20) returning "*VALUES*" + ^
Re: Making the subquery alias optional in the FROM clause
Erwin Brandstetter writes: > On Mon, 2 Oct 2023 at 00:33, Dean Rasheed wrote: >> The only place that exposes the eref's made-up relation name is the >> existing query deparsing code in ruleutils.c, which uniquifies it and >> generates SQL spec-compliant output. For example: > I ran into one other place: error messages. > SELECT unnamed_subquery.a > FROM (SELECT 1 AS a) > ERROR: There is an entry for table "unnamed_subquery", but it cannot be > referenced from this part of the query.invalid reference to FROM-clause > entry for table "unnamed_subquery" Yeah, that's exposing more of the implementation than we really want. > Notably, the same does not happen for "unnamed_subquery_1": > SELECT unnamed_subquery_1.a > FROM (SELECT 1 AS a), (SELECT 1 AS a) > ERROR: missing FROM-clause entry for table "unnamed_subquery_1" Actually, that happens because "unnamed_subquery_1" *isn't* in the parse tree. As implemented, both RTEs are labeled "unnamed_subquery" in the parser output, and it's ruleutils that de-duplicates them. I'm inclined to think we should avoid letting "unnamed_subquery" appear in the parse tree, too. It might not be a good idea to try to leave the eref field null, but could we set it to an empty string instead, that is - eref = alias ? copyObject(alias) : makeAlias("unnamed_subquery", NIL); + eref = alias ? copyObject(alias) : makeAlias("", NIL); and then let ruleutils replace that with "unnamed_subquery"? This would prevent accessing the subquery name in the way Erwin shows, because we don't let you write an empty identifier in SQL: regression=# select "".a from (select 1 as a); ERROR: zero-length delimited identifier at or near LINE 1: select "".a from (select 1 as a); ^ However, there might then be some parser error messages that refer to subquery "", so I'm not sure if this is totally without surprises either. regards, tom lane
Re: Making the subquery alias optional in the FROM clause
On Mon, 2 Oct 2023 at 00:33, Dean Rasheed wrote: > The only place that exposes the eref's made-up relation name is the > existing query deparsing code in ruleutils.c, which uniquifies it and > generates SQL spec-compliant output. For example: > I ran into one other place: error messages. SELECT unnamed_subquery.a FROM (SELECT 1 AS a) > ERROR: There is an entry for table "unnamed_subquery", but it cannot be referenced from this part of the query.invalid reference to FROM-clause entry for table "unnamed_subquery" Normally, we would find the cited name somewhere in the query. Confusing. Notably, the same does not happen for "unnamed_subquery_1": SELECT unnamed_subquery_1.a FROM (SELECT 1 AS a), (SELECT 1 AS a) > ERROR: missing FROM-clause entry for table "unnamed_subquery_1" That's the message anybody would expect. Also makes sense, as "uniquification" only happens in the above quoted case, and all invisible aliases seem to be "unnamed_subquery" at this point? But a bit confusing on a different level. Maybe error messages should not be aware of invisible aliases, and just complain about "missing FROM-clause entry"? Not sure whether a fix would be easy, nor whether it would be worth the effort. Just wanted to document the corner case issue in this thread. Regards Erwin
Re: Making the subquery alias optional in the FROM clause
Dean Rasheed writes: > Attached is an update with the following changes: > * Docs updated as suggested. > * transformLockingClause() updated to skip subquery and values rtes > without aliases. > * eref->aliasname changed to "unnamed_subquery" for subqueries without > aliases. This looks good to me. Marked RFC. regards, tom lane
Re: Making the subquery alias optional in the FROM clause
On Sat, Jul 9, 2022 at 5:18 AM Dean Rasheed wrote: > On Sat, 9 Jul 2022 at 12:24, Zhihong Yu wrote: > > > > It seems the code would be more readable if you keep the assignment in > else block below: > > > > + else if (rte->rtekind == RTE_SUBQUERY || > > +rte->rtekind == RTE_VALUES) > > continue; > > - rtename = rte->join_using_alias->aliasname; > > } > > - else > > - rtename = rte->eref->aliasname; > > > > because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case. > > > > But then it would need 2 else blocks, one inside the rte->alias == > NULL block, for when rtekind is not RTE_JOIN, RTE_SUBQUERY or > RTE_VALUES, and another after the block, for when rte->alias != NULL. > I find it more readable this way. > > Regards, > Dean > Hi, Dean: Thanks for the explanation. I should have looked closer :-)
Re: Making the subquery alias optional in the FROM clause
On Sat, 9 Jul 2022 at 12:24, Zhihong Yu wrote: > > It seems the code would be more readable if you keep the assignment in else > block below: > > + else if (rte->rtekind == RTE_SUBQUERY || > +rte->rtekind == RTE_VALUES) > continue; > - rtename = rte->join_using_alias->aliasname; > } > - else > - rtename = rte->eref->aliasname; > > because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case. > But then it would need 2 else blocks, one inside the rte->alias == NULL block, for when rtekind is not RTE_JOIN, RTE_SUBQUERY or RTE_VALUES, and another after the block, for when rte->alias != NULL. I find it more readable this way. Regards, Dean
Re: Making the subquery alias optional in the FROM clause
On Sat, Jul 9, 2022 at 3:28 AM Dean Rasheed wrote: > On Wed, 6 Jul 2022 at 15:09, Dean Rasheed > wrote: > > > > I'll post an update in a little while, but first, I found a bug, which > > revealed a pre-existing bug in transformLockingClause(). I'll start a > > new thread for that, since it'd be good to get that resolved > > independently of this patch. > > > > Attached is an update with the following changes: > > * Docs updated as suggested. > * transformLockingClause() updated to skip subquery and values rtes > without aliases. > * eref->aliasname changed to "unnamed_subquery" for subqueries without > aliases. > > Regards, > Dean > Hi, rtename is assigned at the beginning of the loop: + char *rtename = rte->eref->aliasname; It seems the code would be more readable if you keep the assignment in else block below: + else if (rte->rtekind == RTE_SUBQUERY || +rte->rtekind == RTE_VALUES) continue; - rtename = rte->join_using_alias->aliasname; } - else - rtename = rte->eref->aliasname; because rtename would be assigned in the `rte->rtekind == RTE_JOIN` case. Cheers
Re: Making the subquery alias optional in the FROM clause
On Wed, 6 Jul 2022 at 15:09, Dean Rasheed wrote: > > I'll post an update in a little while, but first, I found a bug, which > revealed a pre-existing bug in transformLockingClause(). I'll start a > new thread for that, since it'd be good to get that resolved > independently of this patch. > Attached is an update with the following changes: * Docs updated as suggested. * transformLockingClause() updated to skip subquery and values rtes without aliases. * eref->aliasname changed to "unnamed_subquery" for subqueries without aliases. Regards, Dean diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml new file mode 100644 index 80bb8ad..410c80e --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -51,7 +51,7 @@ SELECT [ ALL | DISTINCT [ ON ( table_name [ * ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ] [ TABLESAMPLE sampling_method ( argument [, ...] ) [ REPEATABLE ( seed ) ] ] -[ LATERAL ] ( select ) [ AS ] alias [ ( column_alias [, ...] ) ] +[ LATERAL ] ( select ) [ [ AS ] alias [ ( column_alias [, ...] ) ] ] with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ] [ LATERAL ] function_name ( [ argument [, ...] ] ) [ WITH ORDINALITY ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ] @@ -490,8 +490,8 @@ TABLE [ ONLY ] VALUES command can also be used here. @@ -2041,6 +2041,16 @@ SELECT 2+2; + + Omitting sub-SELECT aliases in FROM + + +According to the SQL standard, a sub-SELECT in the +FROM list must have an alias. In +PostgreSQL, this alias may be omitted. + + + ONLY and Inheritance diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c new file mode 100644 index 8ed2c4b..6688c2a --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -3291,7 +3291,7 @@ transformLockingClause(ParseState *pstat foreach(rt, qry->rtable) { RangeTblEntry *rte = (RangeTblEntry *) lfirst(rt); -char *rtename; +char *rtename = rte->eref->aliasname; ++i; if (!rte->inFromCl) @@ -3302,15 +3302,22 @@ transformLockingClause(ParseState *pstat * name and needs to be skipped (otherwise it might hide a * base relation with the same name), except if it has a USING * alias, which *is* visible. + * + * Subquery and values RTEs without aliases are never visible + * as relation names and must always be skipped. */ -if (rte->rtekind == RTE_JOIN && rte->alias == NULL) +if (rte->alias == NULL) { - if (rte->join_using_alias == NULL) + if (rte->rtekind == RTE_JOIN) + { + if (rte->join_using_alias == NULL) + continue; + rtename = rte->join_using_alias->aliasname; + } + else if (rte->rtekind == RTE_SUBQUERY || + rte->rtekind == RTE_VALUES) continue; - rtename = rte->join_using_alias->aliasname; } -else - rtename = rte->eref->aliasname; if (strcmp(rtename, thisrel->relname) == 0) { diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y new file mode 100644 index 0523013..0f8d056 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -13346,33 +13346,6 @@ table_ref: relation_expr opt_alias_claus n->lateral = false; n->subquery = $1; n->alias = $2; - /* - * The SQL spec does not permit a subselect - * () without an alias clause, - * so we don't either. This avoids the problem - * of needing to invent a unique refname for it. - * That could be surmounted if there's sufficient - * popular demand, but for now let's just implement - * the spec and see if anyone complains. - * However, it does seem like a good idea to emit - * an error message that's better than "syntax error". - */ - if ($2 == NULL) - { - if (IsA($1, SelectStmt) && - ((SelectStmt *) $1)->valuesLists) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("VALUES in FROM must have an alias"), - errhint("For example, FROM (VALUES ...) [AS] foo."), - parser_errposition(@1))); - else - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("subquery in FROM must have an alias"), - errhint("For example, FROM (SELECT ...) [AS] foo."), - parser_errposition(@1))); - } $$ = (Node *) n; } | LATERAL_P select_with_parens opt_alias_clause @@ -13382,23 +13355,6 @@ table_ref: relation_expr opt_alias_claus n->lateral = true; n->subquery = $2; n->alias = $3; - /* same comment as above */ - if ($3 == NULL) - { - if (IsA($2, SelectStmt) && - ((SelectStmt *) $2)->valuesLists) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("VALUES in FROM must have an alias"), - errhint("For example, FROM (VALUES ...) [AS] foo."), - parser_errposition(@2)))
Re: Making the subquery alias optional in the FROM clause
On Tue, 5 Jul 2022 at 19:00, Tom Lane wrote: > > Dean Rasheed writes: > > This was discussed previously in [1], and there seemed to be general > > consensus in favour of it, but no new patch emerged. > > As I said in that thread, I'm not super enthused about this, but I was > clearly in the minority so I think it should go forward. > Cool. Thanks for looking. > > Attached is a patch that takes the approach of not generating an alias > > at all, which seems to be neater and simpler, and less code than > > trying to generate a unique alias. > > Hm. Looking at the code surrounding what you touched, I'm reminded > that we allow JOIN nodes to not have an alias, and represent that > situation by rte->alias == NULL. I wonder if it'd be better in the > long run to make alias-less subqueries work similarly, That is what the patch does: transformRangeSubselect() passes a NULL alias to addRangeTableEntryForSubquery(), which has been modified to cope with that in a similar way to addRangeTableEntryForJoin() and other addRangeTableEntryFor...() functions. So for example, with the following query, this is what the output from the parser looks like: SELECT * FROM (SELECT 1); query->rtable: rte: rtekind = RTE_SUBQUERY alias = NULL eref = { aliasname = "subquery", colnames = ... } > rather than > generating something that after-the-fact will be indistinguishable > from a user-written alias. If that turns out to not work well, > I'd agree with "unnamed_subquery" as the inserted name. > The result is distinguishable from a user-written alias, because rte->alias is NULL. I think the confusion is that when I suggested using "unnamed_subquery", I was referring to rte->eref->aliasname, and I still think it's a good idea to change that, for consistency with unnamed joins. > Also, what about VALUES clauses? It seems inconsistent to remove > this restriction for sub-SELECT but not VALUES. Actually it looks > like your patch already does remove that restriction in gram.y, > but you didn't follow through elsewhere. > It does support unnamed VALUES clauses in the FROM list (there's a regression test exercising that). It wasn't necessary to make any additional code changes because addRangeTableEntryForValues() already supported having a NULL alias, and it all just flowed through. In fact, the grammar forces you to enclose a VALUES clause in the FROM list in parentheses, so this ends up being an unnamed subquery in the FROM list as well. For example: SELECT * FROM (VALUES(1),(2),(3)); produces query->rtable: rte: rtekind = RTE_SUBQUERY alias = NULL eref = { aliasname = "subquery", colnames = ... } subquery->rtable: rte: rtekind = RTE_VALUES alias = NULL eref = { aliasname = "*VALUES*", colnames = ... } So it's not really any different from a normal subquery. > As far as the docs go, I think it's sufficient to mention the > inconsistency with SQL down at the bottom; we don't need a > redundant in-line explanation. OK, fair enough. I'll post an update in a little while, but first, I found a bug, which revealed a pre-existing bug in transformLockingClause(). I'll start a new thread for that, since it'd be good to get that resolved independently of this patch. Regards, Dean
Re: Making the subquery alias optional in the FROM clause
Dean Rasheed writes: > This was discussed previously in [1], and there seemed to be general > consensus in favour of it, but no new patch emerged. As I said in that thread, I'm not super enthused about this, but I was clearly in the minority so I think it should go forward. > Attached is a patch that takes the approach of not generating an alias > at all, which seems to be neater and simpler, and less code than > trying to generate a unique alias. Hm. Looking at the code surrounding what you touched, I'm reminded that we allow JOIN nodes to not have an alias, and represent that situation by rte->alias == NULL. I wonder if it'd be better in the long run to make alias-less subqueries work similarly, rather than generating something that after-the-fact will be indistinguishable from a user-written alias. If that turns out to not work well, I'd agree with "unnamed_subquery" as the inserted name. Also, what about VALUES clauses? It seems inconsistent to remove this restriction for sub-SELECT but not VALUES. Actually it looks like your patch already does remove that restriction in gram.y, but you didn't follow through elsewhere. As far as the docs go, I think it's sufficient to mention the inconsistency with SQL down at the bottom; we don't need a redundant in-line explanation. regards, tom lane
Re: Making the subquery alias optional in the FROM clause
On Tue, 28 Jun 2022 at 00:32, Julien Rouhaud wrote: > As to forcing SQL-complaint queries, that ship sailed a long time ago: > > Postgres allows but does not enforce the use of SQL-compliant queries, > and > > many of its important features are extensions anyway, so forcing SQL > > compliant queries is out of the question (although I could see the > utility > > of a mode where it warns or errors on non-compliant queries, at least in > > principle). > > Sure, but it doesn't mean that we should support even more non-compliant > syntax > without any restraint. In this case, I don't see much benefit as it's not > solving performance problem or something like that. > It's improving developer performance by eliminating the need to make up utterly useless names. I don't care if behind the scenes names are assigned, although it would be even better if the names didn't exist at all. I just want the computer to do stuff for me that requires absolutely no human judgement whatsoever. > As to bad habits, I'm having trouble understanding. Why do you think > > leaving the alias off a subquery is a bad habit (assuming it were > allowed)? > > I think It's a bad habit because as far as I can see it's not supported on > mysql or sqlserver. > So it’s a bad habit to use features of Postgres that aren’t available on MySQL or SQL Server? For myself, I don’t care one bit about whether my code will run on those systems, or Oracle: as far as I’m concerned I write Postgres applications, not SQL applications. Of course, many people have a need to support other systems, so I appreciate the care we take to document the differences from the standard, and I hope we will continue to support standard queries. But if it’s a bad habit to use Postgres-specific features, why do we create any of those features? > If the name is never used, why are we required to supply it? > > But similarly, I many times relied on the fact that writable CTE are > executed > even if not explicitly referenced. So by the same argument shouldn't we > allow > something like this? > > WITH (INSERT INTO t SELECT * pending WHERE ts < now()) > SELECT now() AS last_processing_time; > I’m not necessarily opposed to allowing this too. But the part which causes me annoyance is normal subquery naming.
Re: Making the subquery alias optional in the FROM clause
Hi, On Mon, Jun 27, 2022 at 12:03:20PM -0400, Isaac Morland wrote: > On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud wrote: > > > More generally, I'm -0.5 on the feature. > > I prefer to force using SQL-compliant queries, and also not take bad > > habits. > > > > As to forcing SQL-complaint queries, that ship sailed a long time ago: > Postgres allows but does not enforce the use of SQL-compliant queries, and > many of its important features are extensions anyway, so forcing SQL > compliant queries is out of the question (although I could see the utility > of a mode where it warns or errors on non-compliant queries, at least in > principle). Sure, but it doesn't mean that we should support even more non-compliant syntax without any restraint. In this case, I don't see much benefit as it's not solving performance problem or something like that. > As to bad habits, I'm having trouble understanding. Why do you think > leaving the alias off a subquery is a bad habit (assuming it were allowed)? I think It's a bad habit because as far as I can see it's not supported on mysql or sqlserver. > If the name is never used, why are we required to supply it? I'm not saying that I'm thrilled having to do so, but it's also not a huge trouble. And since it's required I have the habit to automatically put some random alias if I'm writing some one shot query that indeed doesn't need to use the alias. But similarly, I many times relied on the fact that writable CTE are executed even if not explicitly referenced. So by the same argument shouldn't we allow something like this? WITH (INSERT INTO t SELECT * pending WHERE ts < now()) SELECT now() AS last_processing_time;
Re: Making the subquery alias optional in the FROM clause
On Mon, 27 Jun 2022 at 19:43, David G. Johnston wrote: > > On Mon, Jun 27, 2022 at 11:25 AM Dean Rasheed > wrote: >> >> On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud wrote: >> > >> > It doesn't play that well if you have something called subquery though: >> > >> > [example that changes a user-provided alias] >> > >> > While the output is a valid query, it's not nice that it's replacing a >> > user provided alias with another one (or force an alias if you have a >> > relation called subquery). >> >> It's already the case that user-provided aliases can get replaced by >> new ones in the query-deparsing code, e.g.: >> > > Regardless, is there any reason to not just prefix our made-up aliases with > "pg_" to make it perfectly clear they were generated by the system and are > basically implementation details as opposed to something that appeared in the > originally written query? > > I suppose, "because we've haven't until now, so why start" suffices...but > still doing a rename/suffixing because of query rewriting and inventing one > where we made it optional seem different enough to justify implementing > something different. > I think "pg_" would be a bad idea, since it's too easily confused with things like system catalogs. The obvious precedent we have for a made-up alias is "unnamed_join", so perhaps "unnamed_subquery" would be better. Regards, Dean
Re: Making the subquery alias optional in the FROM clause
On Mon, Jun 27, 2022 at 11:25 AM Dean Rasheed wrote: > On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud wrote: > > > > It doesn't play that well if you have something called subquery though: > > > > [example that changes a user-provided alias] > > > > While the output is a valid query, it's not nice that it's replacing a > > user provided alias with another one (or force an alias if you have a > > relation called subquery). > > It's already the case that user-provided aliases can get replaced by > new ones in the query-deparsing code, e.g.: > > Regardless, is there any reason to not just prefix our made-up aliases with "pg_" to make it perfectly clear they were generated by the system and are basically implementation details as opposed to something that appeared in the originally written query? I suppose, "because we've haven't until now, so why start" suffices...but still doing a rename/suffixing because of query rewriting and inventing one where we made it optional seem different enough to justify implementing something different. David J.
Re: Making the subquery alias optional in the FROM clause
On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud wrote: > > It doesn't play that well if you have something called subquery though: > > [example that changes a user-provided alias] > > While the output is a valid query, it's not nice that it's replacing a > user provided alias with another one (or force an alias if you have a > relation called subquery). It's already the case that user-provided aliases can get replaced by new ones in the query-deparsing code, e.g.: CREATE OR REPLACE VIEW test_view AS SELECT x.a, y.b FROM foo AS x, (SELECT b FROM foo AS x) AS y; \sv test_view CREATE OR REPLACE VIEW public.test_view AS SELECT x.a, y.b FROM foo x, ( SELECT x_1.b FROM foo x_1) y and similarly it may invent technically unnecessary aliases where there were none before. The query-deparsing code has never been alias-preserving, unless you take care to give everything a globally unique alias. Regards, Dean
Re: Making the subquery alias optional in the FROM clause
On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud wrote: > More generally, I'm -0.5 on the feature. > I prefer to force using SQL-compliant queries, and also not take bad > habits. > As to forcing SQL-complaint queries, that ship sailed a long time ago: Postgres allows but does not enforce the use of SQL-compliant queries, and many of its important features are extensions anyway, so forcing SQL compliant queries is out of the question (although I could see the utility of a mode where it warns or errors on non-compliant queries, at least in principle). As to bad habits, I'm having trouble understanding. Why do you think leaving the alias off a subquery is a bad habit (assuming it were allowed)? If the name is never used, why are we required to supply it?
Re: Making the subquery alias optional in the FROM clause
On Mon, Jun 27, 2022 at 9:49 PM Dean Rasheed wrote: > > This was discussed previously in [1], and there seemed to be general > consensus in favour of it, but no new patch emerged. > > Attached is a patch that takes the approach of not generating an alias > at all, which seems to be neater and simpler, and less code than > trying to generate a unique alias. > > It still generates an eref for the subquery RTE, which has a made-up > relation name, but that is marked as not visible on the > ParseNamespaceItem, so it doesn't conflict with anything else, need > not be unique, and cannot be used for qualified references to the > subquery's columns. > > The only place that exposes the eref's made-up relation name is the > existing query deparsing code in ruleutils.c, which uniquifies it and > generates SQL spec-compliant output. For example: > > CREATE OR REPLACE VIEW test_view AS > SELECT * > FROM (SELECT a, b FROM foo), > (SELECT c, d FROM bar) >WHERE a = c; > > \sv test_view > > CREATE OR REPLACE VIEW public.test_view AS > SELECT subquery.a, > subquery.b, > subquery_1.c, > subquery_1.d >FROM ( SELECT foo.a, > foo.b >FROM foo) subquery, > ( SELECT bar.c, > bar.d >FROM bar) subquery_1 > WHERE subquery.a = subquery_1.c It doesn't play that well if you have something called subquery though: CREATE OR REPLACE VIEW test_view AS SELECT * FROM (SELECT a, b FROM foo), (SELECT c, d FROM bar), (select relname from pg_class limit 1) as subquery WHERE a = c; \sv test_view CREATE OR REPLACE VIEW public.test_view AS SELECT subquery.a, subquery.b, subquery_1.c, subquery_1.d, subquery_2.relname FROM ( SELECT foo.a, foo.b FROM foo) subquery, ( SELECT bar.c, bar.d FROM bar) subquery_1, ( SELECT pg_class.relname FROM pg_class LIMIT 1) subquery_2 WHERE subquery.a = subquery_1.c While the output is a valid query, it's not nice that it's replacing a user provided alias with another one (or force an alias if you have a relation called subquery). More generally, I'm -0.5 on the feature. I prefer to force using SQL-compliant queries, and also not take bad habits.
Making the subquery alias optional in the FROM clause
This was discussed previously in [1], and there seemed to be general consensus in favour of it, but no new patch emerged. Attached is a patch that takes the approach of not generating an alias at all, which seems to be neater and simpler, and less code than trying to generate a unique alias. It still generates an eref for the subquery RTE, which has a made-up relation name, but that is marked as not visible on the ParseNamespaceItem, so it doesn't conflict with anything else, need not be unique, and cannot be used for qualified references to the subquery's columns. The only place that exposes the eref's made-up relation name is the existing query deparsing code in ruleutils.c, which uniquifies it and generates SQL spec-compliant output. For example: CREATE OR REPLACE VIEW test_view AS SELECT * FROM (SELECT a, b FROM foo), (SELECT c, d FROM bar) WHERE a = c; \sv test_view CREATE OR REPLACE VIEW public.test_view AS SELECT subquery.a, subquery.b, subquery_1.c, subquery_1.d FROM ( SELECT foo.a, foo.b FROM foo) subquery, ( SELECT bar.c, bar.d FROM bar) subquery_1 WHERE subquery.a = subquery_1.c Regards, Dean [1] https://www.postgresql.org/message-id/flat/1487773980.3143.15.camel%40oopsware.de diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml new file mode 100644 index 80bb8ad..8cae456 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -51,7 +51,7 @@ SELECT [ ALL | DISTINCT [ ON ( table_name [ * ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ] [ TABLESAMPLE sampling_method ( argument [, ...] ) [ REPEATABLE ( seed ) ] ] -[ LATERAL ] ( select ) [ AS ] alias [ ( column_alias [, ...] ) ] +[ LATERAL ] ( select ) [ [ AS ] alias [ ( column_alias [, ...] ) ] ] with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ] [ LATERAL ] function_name ( [ argument [, ...] ] ) [ WITH ORDINALITY ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ] @@ -490,11 +490,18 @@ TABLE [ ONLY ] VALUES command can also be used here. + + +An alias may be provided to name the output of the +sub-SELECT, and optionally rename its columns. +Note that according to the SQL standard, a +sub-SELECT must have an alias, +whereas in PostgreSQL it is optional. + @@ -2041,6 +2048,16 @@ SELECT 2+2; + + Omitting sub-SELECT aliases in FROM + + +According to the SQL standard, a sub-SELECT in the +FROM list must have an alias. In +PostgreSQL, this alias may be omitted. + + + ONLY and Inheritance diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y new file mode 100644 index 969c9c1..3b25f7c --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -13346,33 +13346,6 @@ table_ref: relation_expr opt_alias_claus n->lateral = false; n->subquery = $1; n->alias = $2; - /* - * The SQL spec does not permit a subselect - * () without an alias clause, - * so we don't either. This avoids the problem - * of needing to invent a unique refname for it. - * That could be surmounted if there's sufficient - * popular demand, but for now let's just implement - * the spec and see if anyone complains. - * However, it does seem like a good idea to emit - * an error message that's better than "syntax error". - */ - if ($2 == NULL) - { - if (IsA($1, SelectStmt) && - ((SelectStmt *) $1)->valuesLists) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("VALUES in FROM must have an alias"), - errhint("For example, FROM (VALUES ...) [AS] foo."), - parser_errposition(@1))); - else - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("subquery in FROM must have an alias"), - errhint("For example, FROM (SELECT ...) [AS] foo."), - parser_errposition(@1))); - } $$ = (Node *) n; } | LATERAL_P select_with_parens opt_alias_clause @@ -13382,23 +13355,6 @@ table_ref: relation_expr opt_alias_claus n->lateral = true; n->subquery = $2; n->alias = $3; - /* same comment as above */ - if ($3 == NULL) - { - if (IsA($2, SelectStmt) && - ((SelectStmt *) $2)->valuesLists) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("VALUES in FROM must have an alias"), - errhint("For example, FROM (VALUES ...) [AS] foo."), - parser_errposition(@2))); - else - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("subquery in FROM must have an alias"), - errhint("For example, FROM (SELECT ...) [AS] foo."), - parser_errposition(@2))); - } $$ = (Node *) n; } | joined_table diff --git a/src/backend/parser/parse_cla