Re: pg14 psql broke \d datname.nspname.relname

2022-04-20 Thread Thomas Munro
On Thu, Apr 21, 2022 at 8:38 AM Thomas Munro  wrote:
> On Thu, Apr 21, 2022 at 7:35 AM Robert Haas  wrote:
> > On Wed, Apr 20, 2022 at 3:08 PM Thomas Munro  wrote:
> > > Looks like this somehow broke on a Windows box:
> > >
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2022-04-20%2016%3A34%3A19
> >
> > So the issue here is that we are running this command:
> >
> > pg_dumpall --exclude-database .*
> >
> > And on that Windows machine, .* is being expanded to .gitignore, so
> > pg_dumpall prints:
> >
> > pg_dumpall: error: improper qualified name (too many dotted names): 
> > .gitignore
> >
> > Instead of:
> >
> > pg_dumpall: error: improper qualified name (too many dotted names): .*
> >
> > I don't know why that glob-expansion only happens on jacana, and I
> > don't know how to fix it, either.
>
> Perhaps bowerbird and jacana have different versions of IPC::Run?  I
> see some recent-ish changes to escaping logic in here from Noah:
>
> https://github.com/toddr/IPC-Run/commits/master/lib/IPC/Run/Win32Helper.pm
>
> Looks like the older version looks for meta characters not including
> '*', and the later one uses Win32::ShellQuote::quote_native.

This time with Andrew in CC.




Re: pg14 psql broke \d datname.nspname.relname

2022-04-20 Thread Thomas Munro
On Thu, Apr 21, 2022 at 7:35 AM Robert Haas  wrote:
> On Wed, Apr 20, 2022 at 3:08 PM Thomas Munro  wrote:
> > Looks like this somehow broke on a Windows box:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2022-04-20%2016%3A34%3A19
>
> So the issue here is that we are running this command:
>
> pg_dumpall --exclude-database .*
>
> And on that Windows machine, .* is being expanded to .gitignore, so
> pg_dumpall prints:
>
> pg_dumpall: error: improper qualified name (too many dotted names): .gitignore
>
> Instead of:
>
> pg_dumpall: error: improper qualified name (too many dotted names): .*
>
> I don't know why that glob-expansion only happens on jacana, and I
> don't know how to fix it, either.

Perhaps bowerbird and jacana have different versions of IPC::Run?  I
see some recent-ish changes to escaping logic in here from Noah:

https://github.com/toddr/IPC-Run/commits/master/lib/IPC/Run/Win32Helper.pm

Looks like the older version looks for meta characters not including
'*', and the later one uses Win32::ShellQuote::quote_native.




Re: pg14 psql broke \d datname.nspname.relname

2022-04-20 Thread Robert Haas
On Wed, Apr 20, 2022 at 3:08 PM Thomas Munro  wrote:
> Looks like this somehow broke on a Windows box:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2022-04-20%2016%3A34%3A19

So the issue here is that we are running this command:

pg_dumpall --exclude-database .*

And on that Windows machine, .* is being expanded to .gitignore, so
pg_dumpall prints:

pg_dumpall: error: improper qualified name (too many dotted names): .gitignore

Instead of:

pg_dumpall: error: improper qualified name (too many dotted names): .*

I don't know why that glob-expansion only happens on jacana, and I
don't know how to fix it, either.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2022-04-20 Thread Thomas Munro
On Thu, Apr 21, 2022 at 3:55 AM Robert Haas  wrote:
> On Tue, Apr 19, 2022 at 10:20 PM Mark Dilger
>  wrote:
> > Looks like most people voted for (B).  In support of that option, here are 
> > patches for master and REL_14_STABLE.  Note that I extended the tests 
> > compared to v9, which found a problem that is fixed for v10:
>
> OK, I committed these. I am not totally sure we've got all the
> problems sorted here, but I don't think that continuing to not commit
> anything is going to be better, so here we go.

Looks like this somehow broke on a Windows box:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2022-04-20%2016%3A34%3A19

[14:05:49.729](0.001s) not ok 16 - pg_dumpall: option
--exclude-database rejects multipart pattern ".*": matches
[14:05:49.730](0.000s)
[14:05:49.730](0.000s) #   Failed test 'pg_dumpall: option
--exclude-database rejects multipart pattern ".*": matches'
#   at t/002_pg_dump.pl line 3985.
[14:05:49.730](0.000s) #   'pg_dumpall: error:
improper qualified name (too many dotted names): .gitignore
# '
# doesn't match '(?^:pg_dumpall: error: improper qualified name
\\(too many dotted names\\): \\.\\*)'




Re: pg14 psql broke \d datname.nspname.relname

2022-04-20 Thread Robert Haas
On Tue, Apr 19, 2022 at 10:20 PM Mark Dilger
 wrote:
> Looks like most people voted for (B).  In support of that option, here are 
> patches for master and REL_14_STABLE.  Note that I extended the tests 
> compared to v9, which found a problem that is fixed for v10:

OK, I committed these. I am not totally sure we've got all the
problems sorted here, but I don't think that continuing to not commit
anything is going to be better, so here we go.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread Vik Fearing

On 4/19/22 16:00, Robert Haas wrote:

On Mon, Apr 18, 2022 at 3:39 PM Mark Dilger
 wrote:

Since there hasn't been any agreement on that point, I've just rebased the 
patch to apply cleanly against the current master:


This looks OK to me. There may be better ways to do some of it, but
there's no rule against further improving the code later. Also, since
the issue was introduced in v14, we probably shouldn't wait forever to
do something about it. However, there is a procedural issue here now
that we are past feature freeze. I think someone could defensibly take
any of the following positions:

(A) This is a new feature. Wait for v16.
(B) This is a bug fix. Commit it now and back-patch to v14.
(C) This is a cleanup that is OK to put into v15 even after feature
freeze but since it is a behavior change we shouldn't back-patch it.

I vote for (C). What do other people think?



I vote for (B).
--
Vik Fearing




Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread Tom Lane
Justin Pryzby  writes:
> On Tue, Apr 19, 2022 at 10:00:01AM -0400, Robert Haas wrote:
>> (A) This is a new feature. Wait for v16.
>> (B) This is a bug fix. Commit it now and back-patch to v14.
>> (C) This is a cleanup that is OK to put into v15 even after feature
>> freeze but since it is a behavior change we shouldn't back-patch it.
>> I vote for (C). What do other people think?

> I thought the plan was to backpatch to v14.

> v14 psql had an unintentional behavior change, rejecting \d
> datname.nspname.relname.

I agree that the v14 behavior is a bug, so ordinarily I'd vote
for back-patching.

A possible objection to doing that is that the patch changes the
APIs of processSQLNamePattern and patternToSQLRegex.  We would avoid
making such a change in core-backend APIs in a minor release, but
I'm not certain whether there are equivalent stability concerns
for src/fe_utils/.

On the whole I'd vote for (B), with (C) as second choice.

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread Justin Pryzby
On Tue, Apr 19, 2022 at 10:00:01AM -0400, Robert Haas wrote:
> (A) This is a new feature. Wait for v16.
> (B) This is a bug fix. Commit it now and back-patch to v14.
> (C) This is a cleanup that is OK to put into v15 even after feature
> freeze but since it is a behavior change we shouldn't back-patch it.
> 
> I vote for (C). What do other people think?

I thought the plan was to backpatch to v14.

v14 psql had an unintentional behavior change, rejecting \d
datname.nspname.relname.

This patch is meant to relax that change by allowing datname, but only if it
matches the name of the current database ... without returning to the v13
behavior, which allowed arbitrary leading junk.

-- 
Justin




Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread David G. Johnston
On Tue, Apr 19, 2022 at 7:00 AM Robert Haas  wrote:

> On Mon, Apr 18, 2022 at 3:39 PM Mark Dilger
>  wrote:
> > Since there hasn't been any agreement on that point, I've just rebased
> the patch to apply cleanly against the current master:
>
> This looks OK to me. There may be better ways to do some of it, but
> there's no rule against further improving the code later. Also, since
> the issue was introduced in v14, we probably shouldn't wait forever to
> do something about it. However, there is a procedural issue here now
> that we are past feature freeze. I think someone could defensibly take
> any of the following positions:
>
> (A) This is a new feature. Wait for v16.
> (B) This is a bug fix. Commit it now and back-patch to v14.
> (C) This is a cleanup that is OK to put into v15 even after feature
> freeze but since it is a behavior change we shouldn't back-patch it.
>
> I vote for (C). What do other people think?
>
>
I vote for (B).  The behavioral change for v14 turns working usage patterns
into errors where it should not have.  It is a design bug and POLA
violation that should be corrected.

"""
such that the above example was
interpreted as schema=production, relation=marketing.customers.
This turns out to be highly unintuitive to users.
"""

My concern here about a behavior affecting bug fix - which we allow - is
reduced by the fact this feature is almost exclusively an interactive one.
Which supports not having only v14, and maybe v15, behave differently than
v13 and v16 when it comes to using it for expected usage patterns:

"""
We've had reports that users sometimes copy-and-paste database- and
schema-qualified relation names from the logs.
"""

David J.


Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread Robert Haas
On Mon, Apr 18, 2022 at 3:39 PM Mark Dilger
 wrote:
> Since there hasn't been any agreement on that point, I've just rebased the 
> patch to apply cleanly against the current master:

This looks OK to me. There may be better ways to do some of it, but
there's no rule against further improving the code later. Also, since
the issue was introduced in v14, we probably shouldn't wait forever to
do something about it. However, there is a procedural issue here now
that we are past feature freeze. I think someone could defensibly take
any of the following positions:

(A) This is a new feature. Wait for v16.
(B) This is a bug fix. Commit it now and back-patch to v14.
(C) This is a cleanup that is OK to put into v15 even after feature
freeze but since it is a behavior change we shouldn't back-patch it.

I vote for (C). What do other people think?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2022-04-18 Thread Mark Dilger


> On Apr 8, 2022, at 4:11 AM, Robert Haas  wrote:
> 
> I don't personally see how we're going to come out ahead with that
> approach, but if you or Tom or someone else want to put something
> together, that's fine with me. I'm not stuck on this approach, I just
> don't see how we come out ahead with the type of thing you're talking
> about. I mean we could return the error text, but it's only to a
> handful of places, so it just doesn't really seem like a win over what
> the patch is already doing.

Since there hasn't been any agreement on that point, I've just rebased the 
patch to apply cleanly against the current master:



v9-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2022-04-08 Thread Robert Haas
On Thu, Apr 7, 2022 at 11:40 PM Greg Stark  wrote:
> That doesn't seem to be entirely inconsistent with what Tom describes.
> Instead of "throw an error" the function would return an error and
> possibly some extra info which the caller would use to handle the
> error appropriately.

I don't personally see how we're going to come out ahead with that
approach, but if you or Tom or someone else want to put something
together, that's fine with me. I'm not stuck on this approach, I just
don't see how we come out ahead with the type of thing you're talking
about. I mean we could return the error text, but it's only to a
handful of places, so it just doesn't really seem like a win over what
the patch is already doing.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2022-04-07 Thread Greg Stark
On Thu, 7 Apr 2022 at 22:32, Robert Haas  wrote:
>
> On Thu, Apr 7, 2022 at 7:41 PM Tom Lane  wrote:
>
> > Possibly a better idea is to add an enum argument telling the function
> > what to do (parse the whole thing as one name regardless of dots,
> > parse as two names if there's a dot, throw error if there's a dot,
> > etc etc as needed by existing call sites).  Perhaps some of the
> > existing arguments could be merged into such an enum, too.
>
> AFAICS there's not much to be done about the fact
> that one caller wants pg_log_error + exit(2), another wants fatal(), a
> third PQfinish(conn) and exit_nicely(), and the last termPQExpBuffer()
> and return false.

That doesn't seem to be entirely inconsistent with what Tom describes.
Instead of "throw an error" the function would return an error and
possibly some extra info which the caller would use to handle the
error appropriately.

It still has the nice property that the decision that it is in fact an
error would be made inside the parsing function based on the enum
declaring what's intended. And it wouldn't return a possibly bogus
parsing with information the caller might use to infer it isn't what
was desired (or fail to).

-- 
greg




Re: pg14 psql broke \d datname.nspname.relname

2022-04-07 Thread Justin Pryzby
On Thu, Apr 07, 2022 at 10:26:18PM -0400, Robert Haas wrote:
> +   pg_log_error("improper relation name (too many dotted names): 
> %s", pattern);
> 
> Come to think of it, maybe the error text there could stand some
> bikeshedding, but AFAICS

AFAICT the error text deliberately matches this, which I mentioned seems to me
the strongest argument for supporting \d datname.nspname.relname

ts=# SELECT 'a.a.a.a'::regclass;
ERROR:  42601: improper relation name (too many dotted names): a.a.a.a
LINE 1: SELECT 'a.a.a.a'::regclass;
   ^
LOCATION:  makeRangeVarFromNameList, namespace.c:3129




Re: pg14 psql broke \d datname.nspname.relname

2022-04-07 Thread Robert Haas
On Thu, Apr 7, 2022 at 7:41 PM Tom Lane  wrote:
> Mark Dilger  writes:
> > The patch submitted changes processSQLNamePattern() to return a dot count 
> > by reference.  It's up to the caller to decide whether to raise an error.  
> > If you pass in no schemavar, and you get back dotcnt=2, you know it parsed 
> > it as a two part pattern, and you can pg_fatal(...) or ereport(ERROR, ...) 
> > or whatever.
>
> Well, I'm not telling Robert what to do, but I wouldn't accept that
> API.  It requires duplicative error-handling code at every call site
> and is an open invitation to omitting necessary error checks.
>
> Possibly a better idea is to add an enum argument telling the function
> what to do (parse the whole thing as one name regardless of dots,
> parse as two names if there's a dot, throw error if there's a dot,
> etc etc as needed by existing call sites).  Perhaps some of the
> existing arguments could be merged into such an enum, too.

I hadn't considered that approach, but I don't think it works very
well, because front-end error handling is so inconsistent. From the
patch:

+   pg_log_error("improper relation name (too many dotted
names): %s", pattern);
+   exit(2);

+   fatal("improper qualified name (too many
dotted names): %s",
+ cell->val);

+   pg_log_error("improper qualified name (too
many dotted names): %s",
+cell->val);
+   PQfinish(conn);
+   exit_nicely(1);

+   pg_log_error("improper qualified name (too many dotted
names): %s",
+pattern);
+   termPQExpBuffer(&dbbuf);
+   return false;

Come to think of it, maybe the error text there could stand some
bikeshedding, but AFAICS there's not much to be done about the fact
that one caller wants pg_log_error + exit(2), another wants fatal(), a
third PQfinish(conn) and exit_nicely(), and the last termPQExpBuffer()
and return false.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2022-04-07 Thread Tom Lane
Mark Dilger  writes:
> The patch submitted changes processSQLNamePattern() to return a dot count by 
> reference.  It's up to the caller to decide whether to raise an error.  If 
> you pass in no schemavar, and you get back dotcnt=2, you know it parsed it as 
> a two part pattern, and you can pg_fatal(...) or ereport(ERROR, ...) or 
> whatever.

Well, I'm not telling Robert what to do, but I wouldn't accept that
API.  It requires duplicative error-handling code at every call site
and is an open invitation to omitting necessary error checks.

Possibly a better idea is to add an enum argument telling the function
what to do (parse the whole thing as one name regardless of dots,
parse as two names if there's a dot, throw error if there's a dot,
etc etc as needed by existing call sites).  Perhaps some of the
existing arguments could be merged into such an enum, too.

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2022-04-07 Thread Mark Dilger



> On Apr 7, 2022, at 3:37 PM, Tom Lane  wrote:
> 
>> 
>> I don't know whether that's a bug fix for the existing code or some
>> new bit of functionality that \dconfig requires and nothing else
>> needs.
> 
> Well, \dconfig needs it because it would like foo.bar to get processed
> as just a name.  But I think it's a bug fix because as things stood,
> if the caller doesn't provide a schemavar and the pattern contains a
> dot, the code just silently throws away the dot and all to the left.
> That doesn't seem very sane, even if it is a longstanding behavior.

The patch submitted changes processSQLNamePattern() to return a dot count by 
reference.  It's up to the caller to decide whether to raise an error.  If you 
pass in no schemavar, and you get back dotcnt=2, you know it parsed it as a two 
part pattern, and you can pg_fatal(...) or ereport(ERROR, ...) or whatever.

It looks like I'll need to post a new version of the patch with an argument 
telling the function to ignore dots, but I'm not prepared to say that for sure. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2022-04-07 Thread Tom Lane
Robert Haas  writes:
> I still have a vague feeling that there's probably some way of doing
> this better, but had more or less resolved to commit this patch as is
> anyway and had that all queued up. But then I had to go to a meeting
> and when I came out I discovered that Tom had done this:

Sorry, it didn't occur to me that that would impinge on what you
were doing over here ... though in retrospect I should have thought
of it.

> I don't know whether that's a bug fix for the existing code or some
> new bit of functionality that \dconfig requires and nothing else
> needs.

Well, \dconfig needs it because it would like foo.bar to get processed
as just a name.  But I think it's a bug fix because as things stood,
if the caller doesn't provide a schemavar and the pattern contains a
dot, the code just silently throws away the dot and all to the left.
That doesn't seem very sane, even if it is a longstanding behavior.

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2022-04-07 Thread Robert Haas
On Wed, Apr 6, 2022 at 12:07 PM Mark Dilger
 wrote:
> I was able to clean up the "if (left && want_literal_dbname)" stuff, though.

I still have a vague feeling that there's probably some way of doing
this better, but had more or less resolved to commit this patch as is
anyway and had that all queued up. But then I had to go to a meeting
and when I came out I discovered that Tom had done this:

--- a/src/fe_utils/string_utils.c
+++ b/src/fe_utils/string_utils.c
@@ -918,8 +918,12 @@ processSQLNamePattern(PGconn *conn, PQExpBuffer
buf, const char *pattern,
  * Convert shell-style 'pattern' into the regular expression(s) we want to
  * execute.  Quoting/escaping into SQL literal format will be done below
  * using appendStringLiteralConn().
+ *
+ * If the caller provided a schemavar, we want to split the pattern on
+ * ".", otherwise not.
  */
-patternToSQLRegex(PQclientEncoding(conn), NULL, &schemabuf, &namebuf,
+patternToSQLRegex(PQclientEncoding(conn), NULL,
+  (schemavar ? &schemabuf : NULL), &namebuf,
   pattern, force_escape);

 /*

I don't know whether that's a bug fix for the existing code or some
new bit of functionality that \dconfig requires and nothing else
needs. A related point that I had noticed during review is that these
existing tests look pretty bogus:

if (namebuf.len > 2)

if (schemabuf.len > 2)

In the v13 code, these tests occur at a point where we've definitely
added ^( to the buffer, but possibly nothing else. But starting in v14
that's no longer the case. So probably this test should be changed
somehow. The proposed patch changes these to something like this:

+   if (schemavar && schemabuf.len > 2)

But that doesn't really seem like it's fixing the problem I'm talking about.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2022-04-06 Thread Mark Dilger


> On Mar 29, 2022, at 8:20 AM, Robert Haas  wrote:
> 
> In describe.c, why are the various describeWhatever functions
> returning true when validateSQLNamePattern returns false? It seems to
> me that they should return false. That would cause exec_command_d() to
> set status = PSQL_CMD_ERROR, which seems appropriate. I wondered
> whether we should return PSQL_CMD_ERROR only for database errors, but
> that doesn't seem to be the case. For example, exec_command_a() sets
> PSQL_CMD_ERROR for a failure in do_pset().

Yes, I believe you are right.  For scripting, the following should echo, but 
doesn't under the version 7 patch.  Fixed in version 8.

% psql -c "\d a.b.c.d" || echo 'error'
improper qualified name (too many dotted names): a.b.c.d

> pg_dump's prohibit_crossdb_refs() has a special case for you are not
> connected to a database, but psql's validateSQLNamePattern() treats it
> as an invalid cross-database reference. Maybe that should be
> consistent, or just the other way around. After all, I would expect
> pg_dump to just bail out if we lose the database connection, but psql
> may continue, because we can reconnect. Putting more code into the
> tool where reconnecting doesn't really make sense seems odd.

Fixed psql in version 8 to issue the appropriate error message, either "You are 
currently not connected to a database." or "cross-database references are not 
implemented: %s".  That matches the output for pg_dump.

> processSQLNamePattern() documents that dotcnt can be NULL, and then
> asserts that it isn't.

That's ugly.  Fixed the documentation in version 8.

> processSQLNamePattern() introduces new local variables schema and
> name, which account for most of the notational churn in that function.
> I can't see a reason why those changes are needed. You do test whether
> the new variables are NULL in a couple of places, but you could
> equally well test schemavar/namevar/altnamevar directly. Actually, I
> don't really understand why this function needs any changes other than
> passing dbnamebuf and dotcnt through to patternToSQLRegex(). Is there
> a reason?

It looks like overeager optimization to me, to avoid passing buffers to 
patternToSQLRegex that aren't really wanted, consequently asking that function 
to parse things that the caller doesn't care about.  But I don't think the 
optimization is worth the git history churn.  Removed in version 8.

> patternToSQLRegex() restructures the system of buffers as well, and I
> don't understand the purpose of that either. It sort of looks like the
> idea might be to relax the rule against dbname.relname patterns, but
> why would we want to do that? If we don't want to do that, why remove
> the assertion?

This took a while to answer.

I don't remember exactly what I was trying to do here, but it looks like I 
wanted callers who only want a (possibly database-qualified) schema name to 
pass that in the (dbnamebuf and) schemabuf, rather than using the (schemabuf 
and ) namebuf.  I obviously didn't finish that conversion, because the clients 
never got the message.  What remained was some rearrangement in 
patternToSQLRegex which worked but served no purpose.

I've reverted the useless refactoring.

> It is not very nice that patternToSQLRegex() ends up repeating the
> locution "if (left && want_literal_dbname)
> appendPQExpBufferChar(&left_literal, '"')" a whole bunch of times.
> Suppose we remove all that. Then, in the if (!inquotes && ch == '.')
> case, if left = true, we copy "cp - pattern" bytes starting at
> "pattern" into the buffer. Wouldn't that accomplish the same thing
> with less code?

We don't *quite* want the literal left string.  If it is quoted, we still want 
the quotes removed.  For example:

  \d "robert.haas".accounts.acme

needs to return robert.haas (without the quotes) as the database name.  
Likewise, for embedded quotes:

  \d "robert""haas".accounts.acme

needs to return robert"haas, and so forth.

I was able to clean up the "if (left && want_literal_dbname)" stuff, though.



v8-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2022-03-29 Thread Robert Haas
On Fri, Mar 25, 2022 at 3:42 PM Mark Dilger
 wrote:
> I think your change is fine, so I've rolled it into this next patch.

OK, cool. Here are some more comments.

In describe.c, why are the various describeWhatever functions
returning true when validateSQLNamePattern returns false? It seems to
me that they should return false. That would cause exec_command_d() to
set status = PSQL_CMD_ERROR, which seems appropriate. I wondered
whether we should return PSQL_CMD_ERROR only for database errors, but
that doesn't seem to be the case. For example, exec_command_a() sets
PSQL_CMD_ERROR for a failure in do_pset().

pg_dump's prohibit_crossdb_refs() has a special case for you are not
connected to a database, but psql's validateSQLNamePattern() treats it
as an invalid cross-database reference. Maybe that should be
consistent, or just the other way around. After all, I would expect
pg_dump to just bail out if we lose the database connection, but psql
may continue, because we can reconnect. Putting more code into the
tool where reconnecting doesn't really make sense seems odd.

processSQLNamePattern() documents that dotcnt can be NULL, and then
asserts that it isn't.

processSQLNamePattern() introduces new local variables schema and
name, which account for most of the notational churn in that function.
I can't see a reason why those changes are needed. You do test whether
the new variables are NULL in a couple of places, but you could
equally well test schemavar/namevar/altnamevar directly. Actually, I
don't really understand why this function needs any changes other than
passing dbnamebuf and dotcnt through to patternToSQLRegex(). Is there
a reason?

patternToSQLRegex() restructures the system of buffers as well, and I
don't understand the purpose of that either. It sort of looks like the
idea might be to relax the rule against dbname.relname patterns, but
why would we want to do that? If we don't want to do that, why remove
the assertion?

It is not very nice that patternToSQLRegex() ends up repeating the
locution "if (left && want_literal_dbname)
appendPQExpBufferChar(&left_literal, '"')" a whole bunch of times.
Suppose we remove all that. Then, in the if (!inquotes && ch == '.')
case, if left = true, we copy "cp - pattern" bytes starting at
"pattern" into the buffer. Wouldn't that accomplish the same thing
with less code?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2022-03-25 Thread Mark Dilger


> On Mar 22, 2022, at 11:04 AM, Robert Haas  wrote:
> 
> This patch adds three new arguments to processSQLNamePattern() and
> documents one of them. It adds three new parameters to
> patternToSQLRegex() as well, and documents none of them.

This next patch adds the missing comments.

> I think that
> the text of the comment might need some updating too, in particular
> the sentence "Additional dots in the name portion are not treated as
> special."

Changed. 

> There are no comments explaining the left_is_literal stuff. It appears
> that your intention here is that if the pattern string supplied by the
> user contains any of *?|+()[]{}.^\ not surrounded by double-quotes, we
> signal the caller. Some callers then use this to issue a complaint
> that the database name must be a literal. To me, this behavior doesn't
> really make sense. If something is a literal, that means we're not
> going to interpret the special characters that it contains. Here, we
> are interpreting the special characters just so we can complain that
> they exist. It seems to me that a simpler solution would be to not
> interpret them at all. I attach a patch showing what I mean by that.
> It just rips out the dbname_is_literal stuff in favor of doing nothing
> at all. To put the whole thing another way, if the user types "\d
> }.public.ft", your code wants to complain about the fact that the user
> is trying to use regular expression characters in a place where they
> are not allowed to do that. I argue that we should instead just be
> comparing "}" against the database name and see whether it happens to
> match.

I think your change is fine, so I've rolled it into this next patch.



v7-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2022-03-22 Thread Robert Haas
On Mon, Mar 21, 2022 at 9:32 PM Mark Dilger
 wrote:
> [ new patch version ]

This patch adds three new arguments to processSQLNamePattern() and
documents one of them. It adds three new parameters to
patternToSQLRegex() as well, and documents none of them. I think that
the text of the comment might need some updating too, in particular
the sentence "Additional dots in the name portion are not treated as
special."

There are no comments explaining the left_is_literal stuff. It appears
that your intention here is that if the pattern string supplied by the
user contains any of *?|+()[]{}.^\ not surrounded by double-quotes, we
signal the caller. Some callers then use this to issue a complaint
that the database name must be a literal. To me, this behavior doesn't
really make sense. If something is a literal, that means we're not
going to interpret the special characters that it contains. Here, we
are interpreting the special characters just so we can complain that
they exist. It seems to me that a simpler solution would be to not
interpret them at all. I attach a patch showing what I mean by that.
It just rips out the dbname_is_literal stuff in favor of doing nothing
at all. To put the whole thing another way, if the user types "\d
}.public.ft", your code wants to complain about the fact that the user
is trying to use regular expression characters in a place where they
are not allowed to do that. I argue that we should instead just be
comparing "}" against the database name and see whether it happens to
match.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


0002-Remove-dbname_is_literal-stuff.patch
Description: Binary data


0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data


Re: pg14 psql broke \d datname.nspname.relname

2022-03-21 Thread Mark Dilger


> On Mar 21, 2022, at 6:12 PM, Andres Freund  wrote:
> 
> Needs another one: http://cfbot.cputube.org/patch_37_3367.log
> 
> Marked as waiting-on-author.



v6-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2022-03-21 Thread Andres Freund
On 2022-01-26 09:04:15 -0800, Mark Dilger wrote:
> Also, rebased as necessary:

Needs another one: http://cfbot.cputube.org/patch_37_3367.log

Marked as waiting-on-author.

Greetings,

Andres Freund




Re: pg14 psql broke \d datname.nspname.relname

2022-03-15 Thread David G. Johnston
On Tue, Mar 15, 2022 at 12:31 PM Mark Dilger 
wrote:

>
> > On Mar 15, 2022, at 12:27 PM, Robert Haas  wrote:
> >
> > - Justin Pryzby, who originally discovered the problem, prefers the
> > same behavior that I prefer long-term, but thinks Tom's behavior is
> > better than doing nothing.
> > - Mark Dilger, Isaac Moreland, Garick Hamlin, Alvaro Herrera, and
> > Julien Rouhaud have commented on the thread but have not endorsed
> > either of these dueling proposals.
>
> I vote in favor of committing the patch, though I'd also say it's not
> super important to me.
>
>
I'm on board with leaving the v14 change in place - fixing the bug so that
a matching database name is accepted (the whole copy-from-logs argument is
quite compelling).  I'm not too concerned about psql, since \d is mainly
used interactively, and since the change will result in errors in
pg_dump/pg_restore the usual due diligence for upgrading should handle the
necessary tweaks should the case arise where bogus/ignore stuff is present.

David J.


Re: pg14 psql broke \d datname.nspname.relname

2022-03-15 Thread Justin Pryzby
On Wed, Oct 13, 2021 at 11:54:26AM -0500, Justin Pryzby wrote:
> It seems unfortunate if names from log messages qualified with datname were 
> now
> rejected.  Like this one:
> 
> | automatic analyze of table "ts.child.cdrs_2021_10_12"...

Mark mentioned this "log message" use case in his proposed commit message, but
I wanted to mention what seems like a more important parallel:

postgres=# SELECT 'postgres.public.postgres_log'::regclass;
regclass | postgres_log

postgres=# SELECT 'not.postgres.public.postgres_log'::regclass;
ERROR:  improper relation name (too many dotted names): 
not.postgres.public.postgres_log
^
postgres=# SELECT 'not.public.postgres_log'::regclass;
ERROR:  cross-database references are not implemented: "not.public.postgres_log"

I think Mark used this as the model behavior for \d for this patch, which
sounds right.  Since the "two dot" case wasn't fixed in 14.1 nor 2, it seems
better to implement the ultimate, intended behavior now, rather than trying to
exactly match what old versions did.  I'm of the understanding that's what
Mark's patch does, so +1 from me.

I don't know how someone upgrading from an old version would know about the
change, though (rejecting junk prefixes rather than ignoring them).  *If* it
were important, it seems like it'd need to be added to the 14.0 release notes.




Re: pg14 psql broke \d datname.nspname.relname

2022-03-15 Thread Mark Dilger



> On Mar 15, 2022, at 12:27 PM, Robert Haas  wrote:
> 
> - Justin Pryzby, who originally discovered the problem, prefers the
> same behavior that I prefer long-term, but thinks Tom's behavior is
> better than doing nothing.
> - Mark Dilger, Isaac Moreland, Garick Hamlin, Alvaro Herrera, and
> Julien Rouhaud have commented on the thread but have not endorsed
> either of these dueling proposals.

I vote in favor of committing the patch, though I'd also say it's not super 
important to me.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2022-03-15 Thread Robert Haas
Continuing my pass through the "bug fixes" section of the CommitFest,
I came upon this patch, which is contested. Here is my attempt to
summarize where things stand. As I understand it:

- Tom wants to revert to the previous behavior of accepting arbitrary
garbage, so that \d slkgjskld.jgdsjhgjklsdhg.saklasgh.foo.bar means \d
foo.bar.
- I want \d mydb.foo.bar to mean \d foo.bar if the dbname is mydb and
report an error otherwise; anything with dots>2 is also an error in my
view.
- Peter Geoghegan agrees with Tom.
- Stephen Frost agrees with me.
- Vik Fearing also agrees with me.
- Justin Pryzby, who originally discovered the problem, prefers the
same behavior that I prefer long-term, but thinks Tom's behavior is
better than doing nothing.
- Mark Dilger, Isaac Moreland, Garick Hamlin, Alvaro Herrera, and
Julien Rouhaud have commented on the thread but have not endorsed
either of these dueling proposals.

By my count, that's probably a vote of 4-2 in view of the preferred
solution, but it depends on whether you could Justin's vote as +1 for
my preferred solution or maybe +0.75 or +0.50 or something. At any
rate, it's close.

If anyone else would like to take a position, please do so in the next
few days. If there are no more votes, I'm going to proceed with trying
to fix up Mark's patch implementing my preferred solution and getting
it committed.

Thanks,

...Robert




Re: pg14 psql broke \d datname.nspname.relname

2022-01-26 Thread Mark Dilger


> On Jan 17, 2022, at 1:54 PM, Robert Haas  wrote:
> 
> + * dotcnt: how many separators were parsed from the pattern, by reference.
> + * Can be NULL.
> 
> But then:
> 
> +Assert(dotcnt != NULL);

Removed the "Can be NULL" part, as that use case doesn't make sense.  The 
caller should always care whether the number of dots was greater than they are 
prepared to handle.

> On a related note, it's unclear why you've added three new arguments
> to processSQLNamePattern() but only one of them gets a mention in the
> function header comment.

Updated the header comments to include all parameters.

> It's also pretty clear that the behavior of patternToSQLRegex() is
> changing, but the function header comments are not.

Updated the header comments for this, too.

Also, rebased as necessary:



v5-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2022-01-17 Thread Robert Haas
On Mon, Jan 17, 2022 at 1:06 PM Mark Dilger
 wrote:
> > On Jan 15, 2022, at 12:28 AM, Julien Rouhaud  wrote:
> > Could you send a rebased version?
> Yes.  Here it is:

This is not a full review, but I just noticed that:

+ * dotcnt: how many separators were parsed from the pattern, by reference.
+ * Can be NULL.

But then:

+Assert(dotcnt != NULL);

On a related note, it's unclear why you've added three new arguments
to processSQLNamePattern() but only one of them gets a mention in the
function header comment.

It's also pretty clear that the behavior of patternToSQLRegex() is
changing, but the function header comments are not.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2022-01-17 Thread Mark Dilger


> On Jan 15, 2022, at 12:28 AM, Julien Rouhaud  wrote:
> 
> Could you send a rebased version?

Yes.  Here it is:



v4-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2022-01-15 Thread Julien Rouhaud
Hi,

On Tue, Dec 21, 2021 at 10:58:39AM -0800, Mark Dilger wrote:
> 
> Rebased patch attached:

This version doesn't apply anymore:
http://cfbot.cputube.org/patch_36_3367.log
=== Applying patches on top of PostgreSQL commit ID 
5513dc6a304d8bda114004a3b906cc6fde5d6274 ===
=== applying patch 
./v3-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
[...]
1 out of 52 hunks FAILED -- saving rejects to file src/bin/psql/describe.c.rej

Could you send a rebased version?  In the meantime I will switch the cf entry
to Waiting on Author.




Re: pg14 psql broke \d datname.nspname.relname

2021-12-21 Thread Mark Dilger


Rebased patch attached:



v3-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2021-11-05 Thread Mark Dilger



> On Nov 5, 2021, at 6:59 AM, Tom Lane  wrote:
> 
> Alvaro Herrera  writes:
>> I think it would appropriate to normalize identifiers that are going to
>> be stored in catalogs.  As presented, this is a bit ridiculous and I see
>> no reason to continue to support it.
> 
> If we had any sort of convention about the encoding of identifiers stored
> in shared catalogs, maybe we could do something about that.  But we don't,
> so any change is inevitably going to break someone's use-case.

I only started the discussion about normalization to demonstrate that existing 
behavior does not require it.

> In any case, that seems quite orthogonal to the question of how to treat
> names with too many dots in them.

Agreed.

>  Considering we are three days out from
> freezing 14.1, I think it is time to stop the meandering discussion and
> fix it.

Agreed.

>  And by "fix", I mean revert to the pre-14 behavior.

That's one solution.  The patch I posted on October 20, and rebased two days 
ago, has not received any negative feedback.  If you want to revert to pre-14 
behavior for 14.1, do you oppose the patch going in for v15?  (I'm not taking a 
position here, just asking what you'd prefer.)

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-11-05 Thread Robert Haas
On Fri, Nov 5, 2021 at 9:59 AM Tom Lane  wrote:
> In any case, that seems quite orthogonal to the question of how to treat
> names with too many dots in them.  Considering we are three days out from
> freezing 14.1, I think it is time to stop the meandering discussion and
> fix it.  And by "fix", I mean revert to the pre-14 behavior.

I do not think that there is consensus on that proposal.

And FWIW, I still oppose it. It's debatable whether this even
qualifies as a bug in the first place, and even more debatable whether
accepting and ignoring arbitrary garbage is the right solution.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-11-05 Thread Tom Lane
Alvaro Herrera  writes:
> I think it would appropriate to normalize identifiers that are going to
> be stored in catalogs.  As presented, this is a bit ridiculous and I see
> no reason to continue to support it.

If we had any sort of convention about the encoding of identifiers stored
in shared catalogs, maybe we could do something about that.  But we don't,
so any change is inevitably going to break someone's use-case.

In any case, that seems quite orthogonal to the question of how to treat
names with too many dots in them.  Considering we are three days out from
freezing 14.1, I think it is time to stop the meandering discussion and
fix it.  And by "fix", I mean revert to the pre-14 behavior.

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2021-11-05 Thread Alvaro Herrera
On 2021-Oct-20, Mark Dilger wrote:

> I tried testing how this plays out by handing `createdb` the name é
> (U+00E9 "LATIN SMALL LETTER E WITH ACCUTE") and then again the name é
> (U+0065 "LATIN SMALL LETTER E" followed by U+0301 "COMBINING ACCUTE
> ACCENT".)  That results in two distinct databases, not an error about
> a duplicate database name:
> 
> # select oid, datname, datdba, encoding, datcollate, datctype from 
> pg_catalog.pg_database where datname IN ('é', 'é');
>   oid  | datname | datdba | encoding | datcollate  |  datctype   
> ---+-++--+-+-
>  37852 | é   | 10 |6 | en_US.UTF-8 | en_US.UTF-8
>  37855 | é   | 10 |6 | en_US.UTF-8 | en_US.UTF-8
> (2 rows)
> 
> But that doesn't seem to prove much, as other tools in my locale don't
> treat those as equal either.  (Testing with perl's "eq" operator, they
> compare as distinct.)  I expected to find regression tests providing
> better coverage for this somewhere, but did not.  Anybody know more
> about it?

I think it would appropriate to normalize identifiers that are going to
be stored in catalogs.  As presented, this is a bit ridiculous and I see
no reason to continue to support it.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Ed is the standard text editor."
  http://groups.google.com/group/alt.religion.emacs/msg/8d94ddab6a9b0ad3




Re: pg14 psql broke \d datname.nspname.relname

2021-11-04 Thread Mark Dilger



> On Nov 4, 2021, at 6:37 AM, Hamlin, Garick L  wrote:
> 
>> If we pass the dots through to the POSIX regular expression, we can
>> only do that either for the table name or the schema name, not both -
>> either the first or last dot must mark the boundary between the two.
>> That means that you can't use all the same regexy things for one as
>> you can for the other, which is a strange system. I knew that your
>> patch made it do that, and I committed it that way because I didn't
>> think it really mattered, and also because the whole system is already
>> pretty strange, so what's one more bit of strangeness?
> 
> Rather than trying to guess at the meaning of each '.' based on the total
> string.  I wonder, if we could for v15 require '.' to be spelled in longer way
> if it needs to be treated as part of the regex.

We're trying to fix an edge case, not change how the basic case works.  Most 
users are accustomed to using patterns from within psql like:

  \dt myschema.mytable

Whatever patch we accept must not break these totally normal and currently 
working cases.

> Perhaps requiring something like '(.)' be used rather than a bare '.' 
> might be good enough and documenting otherwise it's really a separator?  
> I suppose we could also invent a non-standard class as a stand in like
> '[::any::]', but that seems kinda weird.

If I understand you, that would require the above example to be written as:

  \dt myschema(.)mytable

which nobody expects to have to do, and which would be a very significant 
breaking change in v15.  I can't see anything like that being accepted.

> I think it might be possible to give better error messages long term
> if we knew what '.' should mean without looking at the whole thing.

You quote a portion of an email from Robert.  After that email, there were 
several more, and a new patch.  The commit message of the new patch explains 
what it does.  I wonder if you'd review that message, quoted here, or even 
better, review the entire patch.  Does this seem like an ok fix to you?

Subject: [PATCH v2] Reject patterns with too many parts or wrong db

Object name patterns used by pg_dump and psql potentially contain
multiple parts (dotted names), and nothing prevents users from
specifying a name with too many parts, nor specifying a
database-qualified name for a database other than the currently
connected database.  Prior to PostgreSQL version 14, pg_dump,
pg_dumpall and psql quietly discarded extra parts of the name on the
left.  For example, `pg_dump -t` only expected a possibly schema
qualified table name, not a database name, and the following command

  pg_dump -t production.marketing.customers

quietly ignored the "production" database name with neither warning
nor error.  Commit 2c8726c4b0a496608919d1f78a5abc8c9b6e0868 changed
the behavior of name parsing.  Where names contain more than the
maximum expected number of dots, the extra dots on the right were
interpreted as part of the name, such that the above example was
interpreted as schema=production, relation=marketing.customers.
This turns out to be highly unintuitive to users.

We've had reports that users sometimes copy-and-paste database- and
schema-qualified relation names from the logs.
https://www.postgresql.org/message-id/20211013165426.GD27491%40telsasoft.com

There is no support for cross database references, but allowing a
database qualified pattern when the database portion matches the
current database, as in the above report, seems more friendly than
rejecting it, so do that.  We don't allow the database portion
itself to be a pattern, because if it matched more than one database
(including the current one), there would be confusion about which
database(s) were processed.

Consistent with how we allow db.schemapat.relpat in pg_dump and psql,
also allow db.schemapat for specifying schemas, as:

\dn mydb.myschema

in psql and

pg_dump --schema=mydb.myschema

Fix the pre-v14 behavior of ignoring leading portions of patterns
containing too many dotted names, and the v14.0 misfeature of
combining trailing portions of such patterns, and instead reject
such patterns in all cases by raising an error.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-11-04 Thread Hamlin, Garick L
On Wed, Oct 13, 2021 at 09:24:53AM -0400, Robert Haas wrote:
> > Splitting the pattern on all the dots and throwing away any additional
> > leftmost fields is a bug, ...
> 
> I also agree with you right up to here.
> 
> > and when you stop doing that, passing additional dots through to the POSIX
> > regular expression for processing is the most natural thing to do.  This
> > is, in fact, how v14 works.  It is a bit debatable whether treating the
> > first dot as a separator and the additional dots as stuff to be passed
> > through is the right thing, so we could call the v14 behavior a
> > mis-feature, but it's not as clearcut as the discussion upthread suggested.
> > Reverting to v13 behavior seems wrong, but I'm now uncertain how to
> > proceed.
> 
> But not this part, or at least not entirely.
> 
> If we pass the dots through to the POSIX regular expression, we can
> only do that either for the table name or the schema name, not both -
> either the first or last dot must mark the boundary between the two.
> That means that you can't use all the same regexy things for one as
> you can for the other, which is a strange system. I knew that your
> patch made it do that, and I committed it that way because I didn't
> think it really mattered, and also because the whole system is already
> pretty strange, so what's one more bit of strangeness?

Rather than trying to guess at the meaning of each '.' based on the total
string.  I wonder, if we could for v15 require '.' to be spelled in longer way
if it needs to be treated as part of the regex.

Perhaps requiring something like '(.)' be used rather than a bare '.' 
might be good enough and documenting otherwise it's really a separator?  
I suppose we could also invent a non-standard class as a stand in like
'[::any::]', but that seems kinda weird.

I think it might be possible to give better error messages long term
if we knew what '.' should mean without looking at the whole thing.

Garick



Re: pg14 psql broke \d datname.nspname.relname

2021-11-03 Thread Mark Dilger


> On Nov 3, 2021, at 12:07 PM, Tom Lane  wrote:
> 
> Mark Dilger  writes:
>> [ v1-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch ]
> 
> This needs a rebase over the recent renaming of our Perl test modules.
> (Per the cfbot, so do several of your other pending patches.)
> 
>   regards, tom lane

Thanks for calling my attention to it.



v2-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2021-11-03 Thread Tom Lane
Mark Dilger  writes:
> [ v1-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch ]

This needs a rebase over the recent renaming of our Perl test modules.
(Per the cfbot, so do several of your other pending patches.)

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2021-10-20 Thread Mark Dilger

> On Oct 13, 2021, at 1:43 PM, Mark Dilger  wrote:
> 
> The issue of name parsing impacts pg_dump and pg_dumpall, also.  Consider 
> what happens with:
> 
> pg_dump -t production.critical.secrets > secrets.dump
> dropdb production
> 
> In v13, if your default database is "testing", and database "testing" has the 
> same schemas and tables (but not data) as production, you are unhappy.  You 
> just dumped a copy of your test data and blew away the production data.
> 
> You could end up unhappy in v14, if database "testing" has a schema named 
> "production" and a table that matches the pattern /^critical.secrets$/, but 
> otherwise, you'll get an error from pg_dump, "pg_dump: error: no matching 
> tables were found".  Neither behavior seems correct.

With the attached patch, this scenario results in a "cross-database references 
are not implemented" error.

> The function where the processing occurs is processSQLNamePattern, which is 
> called by pg_dump, pg_dumpall, and psql.  All three callers expect 
> processSQLNamePattern to append where-clauses to a buffer, not to execute any 
> sql of its own.  I propose that processSQLNamePattern return an error code if 
> the pattern contains more than three parts, but otherwise insert the database 
> portion into the buffer as a "pg_catalog.current_database() 
> OPERATOR(pg_catalog.=) ", where  is a properly escaped 
> representation of the database portion.  Maybe someday we can change that to 
> OPERATOR(pg_catalog.~), but for now we lack the sufficient logic for handling 
> multiple matching database names.  (The situation is different for 
> pg_dumpall, as it's using the normal logic for matching a relation name, not 
> for matching a database, and we'd still be fine matching that against a 
> pattern.)

I ultimately went with your strcmp idea rather than OPERATOR(pg_catalog.=), as 
rejecting the database name as part of the query complicates the calling 
convention for no apparent benefit.  I had been concerned about database names 
that were collation-wise equal but byte-wise unequal, but it seems we already 
treat those as distinct database names, so my concern was unnecessary.  We 
already use strcmp on database names from frontend clients 
(fe_utils/parallel_slots.c, psql/prompt.c, pg_amcheck.c, pg_dump.c, 
pg_upgrade/relfilenode.c), from libpq (libpq/hba.c) and from the backend 
(commands/dbcommands.c, init/postinit.c).  

I tried testing how this plays out by handing `createdb` the name é (U+00E9 
"LATIN SMALL LETTER E WITH ACCUTE") and then again the name é (U+0065 "LATIN 
SMALL LETTER E" followed by U+0301 "COMBINING ACCUTE ACCENT".)  That results in 
two distinct databases, not an error about a duplicate database name:

# select oid, datname, datdba, encoding, datcollate, datctype from 
pg_catalog.pg_database where datname IN ('é', 'é');
  oid  | datname | datdba | encoding | datcollate  |  datctype   
---+-++--+-+-
 37852 | é   | 10 |6 | en_US.UTF-8 | en_US.UTF-8
 37855 | é   | 10 |6 | en_US.UTF-8 | en_US.UTF-8
(2 rows)

But that doesn't seem to prove much, as other tools in my locale don't treat 
those as equal either.  (Testing with perl's "eq" operator, they compare as 
distinct.)  I expected to find regression tests providing better coverage for 
this somewhere, but did not.  Anybody know more about it?

> For psql and pg_dump, I'm tempted to restrict the database portion (if not 
> quoted) to neither contain shell glob characters nor POSIX regex characters, 
> and return an error code if any are found, so that the clients can raise an 
> appropriate error to the user.

With the patch, using pattern characters in an unquoted database portion 
results in a "database name must be literal" error.  Using them in a quoted 
database name is allowed, but unless you are connected to a database that 
literally equals that name, you will get a "cross-database references are not 
implemented" error.

> In psql, this proposal would result in no tables matching \d 
> wrongdb.schema.table, which would differ from v13's behavior.  You wouldn't 
> get an error about having specified the wrong database.  You'd just get no 
> matching relations.  \d ??db??.schema.table would complain about the db 
> portion being a pattern.  \d "??db??".schema.table would work, assuming 
> you're connected to a database literally named ??db??

With the patch, psql will treat \d wrongdb.schema.table as a "cross-database 
references are not implemented" error.

> In pg_dumpall, --exclude-database=more.than.one.part would give an error 
> about too many dotted parts rather than simply trying to exclude the last 
> "part" and silently ignoring the prefix, which I think is what v13's 
> pg_dumpall would do.  --exclude-database=db?? would work to exclude four 
> character database names beginning in "db".

The patch implements this.

> In pg_dump, the -t wrongdb.schema.table would match nothing and giv

Re: pg14 psql broke \d datname.nspname.relname

2021-10-14 Thread Robert Haas
On Wed, Oct 13, 2021 at 4:43 PM Mark Dilger
 wrote:
> The function where the processing occurs is processSQLNamePattern, which is 
> called by pg_dump, pg_dumpall, and psql.  All three callers expect 
> processSQLNamePattern to append where-clauses to a buffer, not to execute any 
> sql of its own.  I propose that processSQLNamePattern return an error code if 
> the pattern contains more than three parts, but otherwise insert the database 
> portion into the buffer as a "pg_catalog.current_database() 
> OPERATOR(pg_catalog.=) ", where  is a properly escaped 
> representation of the database portion.  Maybe someday we can change that to 
> OPERATOR(pg_catalog.~), but for now we lack the sufficient logic for handling 
> multiple matching database names.  (The situation is different for 
> pg_dumpall, as it's using the normal logic for matching a relation name, not 
> for matching a database, and we'd still be fine matching that against a 
> pattern.)

I agree with matching using OPERATOR(pg_catalog.=) but I think it
should be an error, not a silently-return-nothing case.

> In pg_dumpall, --exclude-database=more.than.one.part would give an error 
> about too many dotted parts rather than simply trying to exclude the last 
> "part" and silently ignoring the prefix, which I think is what v13's 
> pg_dumpall would do.  --exclude-database=db?? would work to exclude four 
> character database names beginning in "db".

Those things sound good.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-13 Thread Mark Dilger



> On Oct 13, 2021, at 8:43 AM, Robert Haas  wrote:
> 
> On Wed, Oct 13, 2021 at 10:40 AM Mark Dilger
>  wrote:
>> 3a is a bit strange, when considered in the context of patterns.  If db1, 
>> db2, and db3 all exist and each have a table foo.bar, and psql is connected 
>> to db1, how should the command \d db?.foo.bar behave?  We have no problem 
>> with db1.foo.bar, but we do have problems with the other two.  If the answer 
>> is to complain about the databases that are unconnected, consider what 
>> happens if the user writes this in a script when only db1 exists, and later 
>> the script stops working because somebody created database db2.  Maybe 
>> that's not completely horrible, but surely it is less than ideal.
>> 
>> 3b is what pg_amcheck does.  It accepts database.schema.relname, and it will 
>> complain if no matching database/schema/relation can be found (unless 
>> --no-strict-names was given.)
> 
> Well, like I said, we can't treat a part that's purportedly a DB name
> as a pattern, so when connected to db1, I presume the command \d
> db?.foo.bar would have to behave just like \d
> dskjlglsghdksgdjkshg.foo.bar. I suppose technically I'm wrong: db?
> could be matched against the list of database names as a pattern, and
> then we could complain only if it doesn't match exactly and only the
> current DB. But I don't like adding a bunch of extra code to
> accomplish nothing useful, so if we're going to match it all I think
> it should just strcmp().
> 
> But I'm still not sure what the best thing to do overall is here.

The issue of name parsing impacts pg_dump and pg_dumpall, also.  Consider what 
happens with:

pg_dump -t production.critical.secrets > secrets.dump
dropdb production

In v13, if your default database is "testing", and database "testing" has the 
same schemas and tables (but not data) as production, you are unhappy.  You 
just dumped a copy of your test data and blew away the production data.

You could end up unhappy in v14, if database "testing" has a schema named 
"production" and a table that matches the pattern /^critical.secrets$/, but 
otherwise, you'll get an error from pg_dump, "pg_dump: error: no matching 
tables were found".  Neither behavior seems correct.

The function where the processing occurs is processSQLNamePattern, which is 
called by pg_dump, pg_dumpall, and psql.  All three callers expect 
processSQLNamePattern to append where-clauses to a buffer, not to execute any 
sql of its own.  I propose that processSQLNamePattern return an error code if 
the pattern contains more than three parts, but otherwise insert the database 
portion into the buffer as a "pg_catalog.current_database() 
OPERATOR(pg_catalog.=) ", where  is a properly escaped 
representation of the database portion.  Maybe someday we can change that to 
OPERATOR(pg_catalog.~), but for now we lack the sufficient logic for handling 
multiple matching database names.  (The situation is different for pg_dumpall, 
as it's using the normal logic for matching a relation name, not for matching a 
database, and we'd still be fine matching that against a pattern.)

For psql and pg_dump, I'm tempted to restrict the database portion (if not 
quoted) to neither contain shell glob characters nor POSIX regex characters, 
and return an error code if any are found, so that the clients can raise an 
appropriate error to the user.

In psql, this proposal would result in no tables matching \d 
wrongdb.schema.table, which would differ from v13's behavior.  You wouldn't get 
an error about having specified the wrong database.  You'd just get no matching 
relations.  \d ??db??.schema.table would complain about the db portion being a 
pattern.  \d "??db??".schema.table would work, assuming you're connected to a 
database literally named ??db??

In pg_dumpall, --exclude-database=more.than.one.part would give an error about 
too many dotted parts rather than simply trying to exclude the last "part" and 
silently ignoring the prefix, which I think is what v13's pg_dumpall would do.  
--exclude-database=db?? would work to exclude four character database names 
beginning in "db".

In pg_dump, the -t wrongdb.schema.table would match nothing and give the 
familiar error "pg_dump: error: no matching tables were found".  pg_dump -t 
too.many.dotted.names would give a different error about too many parts.  
pg_dump -t db??.foo.bar would give an error about the database needing to be a 
literal name rather than a pattern.

I don't like your proposal to use a strcmp() rather than a pg_catalog.= match, 
because it diverges from how the rest of the pattern is treated, including in 
how encoding settings might interact with the name, needing to be executed on 
the client side rather than in the server where the rest of the name resolution 
is happening. 

Does this sound like a workable proposal?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-13 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Wed, Oct 13, 2021 at 12:54 PM Justin Pryzby  wrote:
> > It seems unfortunate if names from log messages qualified with datname were 
> > now
> > rejected.  Like this one:
> >
> > | automatic analyze of table "ts.child.cdrs_2021_10_12"...
> 
> That's a good argument, IMHO.

Agreed.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg14 psql broke \d datname.nspname.relname

2021-10-13 Thread Robert Haas
On Wed, Oct 13, 2021 at 12:54 PM Justin Pryzby  wrote:
> It seems unfortunate if names from log messages qualified with datname were 
> now
> rejected.  Like this one:
>
> | automatic analyze of table "ts.child.cdrs_2021_10_12"...

That's a good argument, IMHO.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-13 Thread Justin Pryzby
On Wed, Oct 13, 2021 at 12:46:27PM -0400, Robert Haas wrote:
> On Tue, Oct 12, 2021 at 12:57 PM Justin Pryzby  wrote:
> > I would prefer if it errored if the datname didn't match the current 
> > database.
> > After all, it would've helped me to avoid making a confusing problem report.
> 
> How would you have felt if it had said something like:
> 
> error: argument to \d should be of the form
> [schema-name-pattern.]relation-name-pattern
> 
> Would that have been better or worse for you than accepting a third
> part of the pattern as a database name if and only if it matched the
> current database name exactly?

I don't normally type \d a.b.c.  I think I copied it out of a log message and
pasted it, and didn't even really know or expect it to work without removing
the datname prefix.  After it worked, I noticed a short while later when using
the pg14 client that it had stopped working.

It seems unfortunate if names from log messages qualified with datname were now
rejected.  Like this one:

| automatic analyze of table "ts.child.cdrs_2021_10_12"...

-- 
Justin




Re: pg14 psql broke \d datname.nspname.relname

2021-10-13 Thread Robert Haas
On Tue, Oct 12, 2021 at 12:57 PM Justin Pryzby  wrote:
> I would prefer if it errored if the datname didn't match the current database.
> After all, it would've helped me to avoid making a confusing problem report.

How would you have felt if it had said something like:

error: argument to \d should be of the form
[schema-name-pattern.]relation-name-pattern

Would that have been better or worse for you than accepting a third
part of the pattern as a database name if and only if it matched the
current database name exactly?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-13 Thread Robert Haas
On Wed, Oct 13, 2021 at 10:40 AM Mark Dilger
 wrote:
> 3a is a bit strange, when considered in the context of patterns.  If db1, 
> db2, and db3 all exist and each have a table foo.bar, and psql is connected 
> to db1, how should the command \d db?.foo.bar behave?  We have no problem 
> with db1.foo.bar, but we do have problems with the other two.  If the answer 
> is to complain about the databases that are unconnected, consider what 
> happens if the user writes this in a script when only db1 exists, and later 
> the script stops working because somebody created database db2.  Maybe that's 
> not completely horrible, but surely it is less than ideal.
>
> 3b is what pg_amcheck does.  It accepts database.schema.relname, and it will 
> complain if no matching database/schema/relation can be found (unless 
> --no-strict-names was given.)

Well, like I said, we can't treat a part that's purportedly a DB name
as a pattern, so when connected to db1, I presume the command \d
db?.foo.bar would have to behave just like \d
dskjlglsghdksgdjkshg.foo.bar. I suppose technically I'm wrong: db?
could be matched against the list of database names as a pattern, and
then we could complain only if it doesn't match exactly and only the
current DB. But I don't like adding a bunch of extra code to
accomplish nothing useful, so if we're going to match it all I think
it should just strcmp().

But I'm still not sure what the best thing to do overall is here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-13 Thread Mark Dilger



> On Oct 13, 2021, at 6:24 AM, Robert Haas  wrote:
> 
>> and when you stop doing that, passing additional dots through to the POSIX 
>> regular expression for processing is the most natural thing to do.  This is, 
>> in fact, how v14 works.  It is a bit debatable whether treating the first 
>> dot as a separator and the additional dots as stuff to be passed through is 
>> the right thing, so we could call the v14 behavior a mis-feature, but it's 
>> not as clearcut as the discussion upthread suggested.  Reverting to v13 
>> behavior seems wrong, but I'm now uncertain how to proceed.
> 
> But not this part, or at least not entirely.
> 
> If we pass the dots through to the POSIX regular expression, we can
> only do that either for the table name or the schema name, not both -

Agreed.

> either the first or last dot must mark the boundary between the two.
> That means that you can't use all the same regexy things for one as
> you can for the other, which is a strange system.

The closest analogy is how regular expressions consider \1 \2 .. \9 as 
backreferences, but \10 \11 ... are dependent on context:  "A multi-digit 
sequence not starting with a zero is taken as a back reference if it comes 
after a suitable subexpression (i.e., the number is in the legal range for a 
back reference), and otherwise is taken as octal."  Taking a dot as a separator 
if it can be taken that way, and as a regex character otherwise, is not totally 
out of line with existing precedent.  On the other hand, the backreference vs. 
octal precedent is not one I particularly like.

> I knew that your
> patch made it do that, and I committed it that way because I didn't
> think it really mattered, and also because the whole system is already
> pretty strange, so what's one more bit of strangeness?
> 
> I think there are at least 3 defensible behaviors here:
> 
> 1. Leave it like it is. If there is more than one dot, the extra ones
> are part of one of the regex-glob thingies.
> 
> 2. If there is more than one dot, error! Tell the user they messed up.

I don't like the backward compatibility issues with this one.  Justin's use of 
database.schema.relname will work up until v14 (by throwing away the database 
part), then draw an error in v14, then (assuming we support the database 
portion in v15 onward) start working again.

> 3. If there are exactly two dots, treat it as db-schema-user. Accept
> it if the dbname matches the current db, and otherwise say we can't
> access the named db. If there are more than two dots, then (a) it's an
> error as in (2) or (b) the extra ones become part of the regex-glob
> thingies as in (2).

3a is a bit strange, when considered in the context of patterns.  If db1, db2, 
and db3 all exist and each have a table foo.bar, and psql is connected to db1, 
how should the command \d db?.foo.bar behave?  We have no problem with 
db1.foo.bar, but we do have problems with the other two.  If the answer is to 
complain about the databases that are unconnected, consider what happens if the 
user writes this in a script when only db1 exists, and later the script stops 
working because somebody created database db2.  Maybe that's not completely 
horrible, but surely it is less than ideal.

3b is what pg_amcheck does.  It accepts database.schema.relname, and it will 
complain if no matching database/schema/relation can be found (unless 
--no-strict-names was given.)

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-13 Thread Robert Haas
On Tue, Oct 12, 2021 at 5:21 PM Mark Dilger
 wrote:
> I wasn't thinking critically enough about how psql handles \d when I accepted 
> Justin's initial characterization of the bug.  The psql client has never 
> thought about the stuff to the left of the schema name as a database name, 
> even if some users thought about it that way.  It also doesn't think about 
> the pattern as a literal string.

I agree.

> The psql client's interpretation of the pattern is a bit of a chimera, 
> following shell glob patterns for some things and POSIX regex rules for 
> others.

Yes. And that's pretty weird, but it's long-established precedent so
we have to deal with it.

> Splitting the pattern on all the dots and throwing away any additional 
> leftmost fields is a bug, ...

I also agree with you right up to here.

> and when you stop doing that, passing additional dots through to the POSIX 
> regular expression for processing is the most natural thing to do.  This is, 
> in fact, how v14 works.  It is a bit debatable whether treating the first dot 
> as a separator and the additional dots as stuff to be passed through is the 
> right thing, so we could call the v14 behavior a mis-feature, but it's not as 
> clearcut as the discussion upthread suggested.  Reverting to v13 behavior 
> seems wrong, but I'm now uncertain how to proceed.

But not this part, or at least not entirely.

If we pass the dots through to the POSIX regular expression, we can
only do that either for the table name or the schema name, not both -
either the first or last dot must mark the boundary between the two.
That means that you can't use all the same regexy things for one as
you can for the other, which is a strange system. I knew that your
patch made it do that, and I committed it that way because I didn't
think it really mattered, and also because the whole system is already
pretty strange, so what's one more bit of strangeness?

I think there are at least 3 defensible behaviors here:

1. Leave it like it is. If there is more than one dot, the extra ones
are part of one of the regex-glob thingies.

2. If there is more than one dot, error! Tell the user they messed up.

3. If there are exactly two dots, treat it as db-schema-user. Accept
it if the dbname matches the current db, and otherwise say we can't
access the named db. If there are more than two dots, then (a) it's an
error as in (2) or (b) the extra ones become part of the regex-glob
thingies as in (2).

The thing that's unprincipled about (3) is that we can't support a
regexp-glob thingy there -- we can only test for a literal string
match. And I already said what I thought was wrong with (1). But none
of these are horrible, and I don't think it really matters which one
we adopt. I don't even know if I can really rank the choices I just
listed against each other. Before I was arguing for (3a) but I'm not
sure I actually like that one particularly better.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Mark Dilger



> On Oct 12, 2021, at 10:18 AM, Mark Dilger  
> wrote:
> 
> Here is a WIP patch that restores the old behavior, just so you can eyeball 
> how large it is.  (It passes check-world and I've read it over once, but I'm 
> not ready to stand by this as correct quite yet.)  I need to add a regression 
> test to make sure this behavior is not accidentally changed in the future, 
> and will repost after doing so.

I wasn't thinking critically enough about how psql handles \d when I accepted 
Justin's initial characterization of the bug.  The psql client has never 
thought about the stuff to the left of the schema name as a database name, even 
if some users thought about it that way.  It also doesn't think about the 
pattern as a literal string.

The psql client's interpretation of the pattern is a bit of a chimera, 
following shell glob patterns for some things and POSIX regex rules for others. 
 The reason for that is shell glob stuff gets transliterated into the 
corresponding POSIX syntax, but non-shell-glob stuff is left in tact, with the 
one outlier being dots, which have a very special interpretation.  The 
interpretation of a dot as meaning "match one character" is not a shell glob 
rule but a regex one, and one that psql never supported because it split the 
pattern on all dots and threw away stuff to the left.  There was therefore 
never an opportunity for an unquoted dot to make it through to the POSIX 
regular expression for processing.  For other regex type stuff, it happily 
passed it through to the POSIX regex, so that the following examples work even 
though they contain non-shell-glob regex stuff:

v13=# create table ababab (i integer);
CREATE TABLE

v13=# \dt (ab){3}
   List of relations
 Schema |  Name  | Type  |Owner
++---+-
 public | ababab | table | mark.dilger
(1 row)

v13=# \dt pg_catalog.pg_clas{1,2}
  List of relations
   Schema   |   Name   | Type  |Owner
+--+---+-
 pg_catalog | pg_class | table | mark.dilger

v13=# \dt pg_catalog.pg_[am]{1,3}
List of relations
   Schema   | Name  | Type  |Owner
+---+---+-
 pg_catalog | pg_am | table | mark.dilger
(1 row)

Splitting the pattern on all the dots and throwing away any additional leftmost 
fields is a bug, and when you stop doing that, passing additional dots through 
to the POSIX regular expression for processing is the most natural thing to do. 
 This is, in fact, how v14 works.  It is a bit debatable whether treating the 
first dot as a separator and the additional dots as stuff to be passed through 
is the right thing, so we could call the v14 behavior a mis-feature, but it's 
not as clearcut as the discussion upthread suggested.  Reverting to v13 
behavior seems wrong, but I'm now uncertain how to proceed.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Mark Dilger



> On Oct 12, 2021, at 10:54 AM, Robert Haas  wrote:
> 
> On Tue, Oct 12, 2021 at 1:18 PM Mark Dilger
>  wrote:
>> Here is a WIP patch that restores the old behavior, just so you can eyeball 
>> how large it is.
> 
> I guess that's not that bad. Why did we end up with the behavior that
> the current comment describes this way?
> 
> "(Additional dots in the name portion are not treated as special.)"
> 
> I thought there was some reason why it needed to work that way.

We're not talking about the parsing of string literals, but rather about the 
parsing of shell-style patterns.  The primary caller of this logic is 
processSQLNamePattern(), which expects only a relname or a (schema,relname) 
pair, not database names nor anything else.

The pattern myschema.my.*table is not a three-part pattern, but a two part 
pattern, with a literal schema name and a relation name pattern.  In v14 it can 
be seen to work as follows:

\d pg_toast.pg_.oast_2619
TOAST table "pg_toast.pg_toast_2619"
   Column   |  Type
+-
 chunk_id   | oid
 chunk_seq  | integer
 chunk_data | bytea
Owning table: "pg_catalog.pg_statistic"
Indexes:
"pg_toast_2619_index" PRIMARY KEY, btree (chunk_id, chunk_seq)

\d pg_toast.pg_.*_2619
TOAST table "pg_toast.pg_toast_2619"
   Column   |  Type
+-
 chunk_id   | oid
 chunk_seq  | integer
 chunk_data | bytea
Owning table: "pg_catalog.pg_statistic"
Indexes:
"pg_toast_2619_index" PRIMARY KEY, btree (chunk_id, chunk_seq)

In v13, neither of those matched anything (which is defensible, I guess) but 
the following did match, which is really nuts:

+CREATE SCHEMA g_;
+CREATE TABLE g_.oast_2619 (i integer);
+\d pg_toast..g_.oast_2619
+   Table "g_.oast_2619"
+ Column |  Type   | Collation | Nullable | Default 
++-+---+--+-
+ i  | integer |   |  | 


The behavior Justin reported in the original complaint was \d 
regresion.public.bit_defaults, which gets handled as schema =~ /^(regresion)$/ 
and relname =~ /^(public.bit_defaults)$/.  That gives no results for him, but I 
tend to think no results is defensible.

Apparently, this behavior breaks an old bug, and we need to restore the old bug 
and then debate this behavioral change for v15.  I'd rather people had engaged 
in the discussion about this feature during the v14 cycle, since this patch was 
posted and reviewed on -hackers, and I don't recall anybody complaining about 
it.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Robert Haas
On Tue, Oct 12, 2021 at 1:18 PM Mark Dilger
 wrote:
> Here is a WIP patch that restores the old behavior, just so you can eyeball 
> how large it is.

I guess that's not that bad. Why did we end up with the behavior that
the current comment describes this way?

"(Additional dots in the name portion are not treated as special.)"

I thought there was some reason why it needed to work that way.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Mark Dilger



> On Oct 12, 2021, at 10:01 AM, Robert Haas  wrote:
> 
> On Tue, Oct 12, 2021 at 12:44 PM Peter Geoghegan  wrote:
>> You're asking us to imagine a counterfactual. But this counterfactual
>> bug report would have to describe a real practical problem.
> 
> Yes. And I think this one should be held to the same standard: \d
> mydb.myschema.mytable not working is potentially a real, practical
> problem. \d sdlgkjdss.dsgkjsk.sdgskldjgds.myschema.mytable not working
> isn't.

I favor restoring the v13 behavior, but I don't think \d mydb.myschema.mytable 
was ever legitimate.  You got exactly the same results with \d 
nosuchdb.myschema.mytable, meaning the user was given a false sense of security 
that the database name was being used to fetch the definition from the database 
they specified.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Mark Dilger


> On Oct 12, 2021, at 10:03 AM, Robert Haas  wrote:
> 
> On Tue, Oct 12, 2021 at 12:57 PM Justin Pryzby  wrote:
>> I think there's an easy answer here that would satisfy everyone; two patches:
>> 0001 to fix the unintentional behavior change;
>> 0002 to reject garbage input: anything with more than 3 dot-separated
>> components, or with 3 components where the first doesn't match
>> current_database.
>> 
>> 0001 would be backpatched to v14.
>> 
>> If it turns out there's no consensus on 0002, or if it were really hard for
>> some reason, or (more likely) nobody went to the bother to implement it this
>> year, then that's okay.
> 
> This might work, but I fear that 0001 would end up being substantially
> more complicated than a combined patch that solves both problems
> together.

Here is a WIP patch that restores the old behavior, just so you can eyeball how 
large it is.  (It passes check-world and I've read it over once, but I'm not 
ready to stand by this as correct quite yet.)  I need to add a regression test 
to make sure this behavior is not accidentally changed in the future, and will 
repost after doing so.



string_utils.patch.WIP
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Robert Haas
On Tue, Oct 12, 2021 at 12:57 PM Justin Pryzby  wrote:
> I think there's an easy answer here that would satisfy everyone; two patches:
> 0001 to fix the unintentional behavior change;
> 0002 to reject garbage input: anything with more than 3 dot-separated
>  components, or with 3 components where the first doesn't match
>  current_database.
>
> 0001 would be backpatched to v14.
>
> If it turns out there's no consensus on 0002, or if it were really hard for
> some reason, or (more likely) nobody went to the bother to implement it this
> year, then that's okay.

This might work, but I fear that 0001 would end up being substantially
more complicated than a combined patch that solves both problems
together.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Robert Haas
On Tue, Oct 12, 2021 at 12:44 PM Peter Geoghegan  wrote:
> You're asking us to imagine a counterfactual. But this counterfactual
> bug report would have to describe a real practical problem.

Yes. And I think this one should be held to the same standard: \d
mydb.myschema.mytable not working is potentially a real, practical
problem. \d sdlgkjdss.dsgkjsk.sdgskldjgds.myschema.mytable not working
isn't.

> Let's assume that it is bug compatibility. Is that intrinsically a bad thing?

Well my view is that having the same bugs is better than having
different ones, but fixing the bugs is superior to either.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Justin Pryzby
I understand Tom's position to be that the behavior should be changed back,
since it was 1) unintentional; and 2) breaks legitimate use (when the datname
matches current_database).

I think there's an easy answer here that would satisfy everyone; two patches:
0001 to fix the unintentional behavior change;
0002 to reject garbage input: anything with more than 3 dot-separated
 components, or with 3 components where the first doesn't match
 current_database.

0001 would be backpatched to v14.

If it turns out there's no consensus on 0002, or if it were really hard for
some reason, or (more likely) nobody went to the bother to implement it this
year, then that's okay.

I would prefer if it errored if the datname didn't match the current database.
After all, it would've helped me to avoid making a confusing problem report.

-- 
Justin




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Vik Fearing
On 10/12/21 5:19 PM, Stephen Frost wrote:
> Greetings,
> 
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Tue, Oct 12, 2021 at 10:31 AM Tom Lane  wrote:
>>> If the behavior v14 had implemented were "throw an error if the
>>> first word doesn't match the current database name", perhaps nobody
>>> would have questioned it.  But that's not what we have.  It's fairly
>>> clear that neither you nor Mark thought very much about this case,
>>> let alone tested it.  Given that, I am not very pleased that you
>>> are retroactively trying to justify breaking it by claiming that
>>> it was already broken.  It's been that way since 7.3 implemented
>>> schemas, more or less, and nobody's complained about it.  Therefore
>>> I see little argument for changing that behavior.  Changing it in
>>> an already-released branch is especially suspect.
>>
>> Oh, give me a break. The previous behavior obviously hasn't been
>> tested either, and is broken on its face. If someone *had* complained
>> about it, I imagine you would have promptly fixed it and likely
>> back-patched the fix, probably in under 24 hours from the time of the
>> report. I find it difficult to take seriously the contention that
>> anyone is expecting \d dlsgjdsghj.sdhg.l.dsg.jkhsdg.foo.bar to work
>> like \d foo.bar, or that they would even prefer that behavior over an
>> error message. You're carefully avoiding addressing that question in
>> favor of having a discussion of backward compatibility, but a better
>> term for what we're talking about here would be bug-compatibility.
> 
> I tend to agree with Robert on this particular case.  Accepting random
> nonsense there isn't a feature or something which really needs to be
> preserved.  For my 2c, I would hope that one day we will be able to
> accept other database names there and if that happens, what then?  We'd
> "break" these cases anyway.  Better to be clear that such nonsense isn't
> intended to be accepted and clean that up.  I do think it'd be good to
> accept the current database name there as that's reasonable.

I am going to throw my hat in with Robert and Stephen, too.  At least
for 15 if we don't want to change this behavior in back branches.
-- 
Vik Fearing




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Peter Geoghegan
On Tue, Oct 12, 2021 at 7:41 AM Robert Haas  wrote:
> Oh, give me a break. The previous behavior obviously hasn't been
> tested either, and is broken on its face. If someone *had* complained
> about it, I imagine you would have promptly fixed it and likely
> back-patched the fix, probably in under 24 hours from the time of the
> report.

You're asking us to imagine a counterfactual. But this counterfactual
bug report would have to describe a real practical problem. The
details would matter. It's reasonable to suppose that we haven't seen
such a bug report for a reason.

I can't speak for Tom. My position on this is that it's better to
leave it alone at this time, given the history, and the lack of
complaints from users.

> I find it difficult to take seriously the contention that
> anyone is expecting \d dlsgjdsghj.sdhg.l.dsg.jkhsdg.foo.bar to work
> like \d foo.bar, or that they would even prefer that behavior over an
> error message. You're carefully avoiding addressing that question in
> favor of having a discussion of backward compatibility, but a better
> term for what we're talking about here would be bug-compatibility.

Let's assume that it is bug compatibility. Is that intrinsically a bad thing?

-- 
Peter Geoghegan




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Oct 12, 2021 at 10:31 AM Tom Lane  wrote:
> > If the behavior v14 had implemented were "throw an error if the
> > first word doesn't match the current database name", perhaps nobody
> > would have questioned it.  But that's not what we have.  It's fairly
> > clear that neither you nor Mark thought very much about this case,
> > let alone tested it.  Given that, I am not very pleased that you
> > are retroactively trying to justify breaking it by claiming that
> > it was already broken.  It's been that way since 7.3 implemented
> > schemas, more or less, and nobody's complained about it.  Therefore
> > I see little argument for changing that behavior.  Changing it in
> > an already-released branch is especially suspect.
> 
> Oh, give me a break. The previous behavior obviously hasn't been
> tested either, and is broken on its face. If someone *had* complained
> about it, I imagine you would have promptly fixed it and likely
> back-patched the fix, probably in under 24 hours from the time of the
> report. I find it difficult to take seriously the contention that
> anyone is expecting \d dlsgjdsghj.sdhg.l.dsg.jkhsdg.foo.bar to work
> like \d foo.bar, or that they would even prefer that behavior over an
> error message. You're carefully avoiding addressing that question in
> favor of having a discussion of backward compatibility, but a better
> term for what we're talking about here would be bug-compatibility.

I tend to agree with Robert on this particular case.  Accepting random
nonsense there isn't a feature or something which really needs to be
preserved.  For my 2c, I would hope that one day we will be able to
accept other database names there and if that happens, what then?  We'd
"break" these cases anyway.  Better to be clear that such nonsense isn't
intended to be accepted and clean that up.  I do think it'd be good to
accept the current database name there as that's reasonable.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Robert Haas
On Tue, Oct 12, 2021 at 10:31 AM Tom Lane  wrote:
> If the behavior v14 had implemented were "throw an error if the
> first word doesn't match the current database name", perhaps nobody
> would have questioned it.  But that's not what we have.  It's fairly
> clear that neither you nor Mark thought very much about this case,
> let alone tested it.  Given that, I am not very pleased that you
> are retroactively trying to justify breaking it by claiming that
> it was already broken.  It's been that way since 7.3 implemented
> schemas, more or less, and nobody's complained about it.  Therefore
> I see little argument for changing that behavior.  Changing it in
> an already-released branch is especially suspect.

Oh, give me a break. The previous behavior obviously hasn't been
tested either, and is broken on its face. If someone *had* complained
about it, I imagine you would have promptly fixed it and likely
back-patched the fix, probably in under 24 hours from the time of the
report. I find it difficult to take seriously the contention that
anyone is expecting \d dlsgjdsghj.sdhg.l.dsg.jkhsdg.foo.bar to work
like \d foo.bar, or that they would even prefer that behavior over an
error message. You're carefully avoiding addressing that question in
favor of having a discussion of backward compatibility, but a better
term for what we're talking about here would be bug-compatibility.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Mark Dilger



> On Oct 12, 2021, at 7:30 AM, Tom Lane  wrote:
> 
> If the behavior v14 had implemented were "throw an error if the
> first word doesn't match the current database name", perhaps nobody
> would have questioned it.  But that's not what we have.  It's fairly
> clear that neither you nor Mark thought very much about this case,
> let alone tested it.  Given that, I am not very pleased that you
> are retroactively trying to justify breaking it by claiming that
> it was already broken.  It's been that way since 7.3 implemented
> schemas, more or less, and nobody's complained about it.  Therefore
> I see little argument for changing that behavior.  Changing it in
> an already-released branch is especially suspect.

I completely agree that we need to fix this.  My question was only whether 
"fix" means to make it accept database.schema.table or whether it means to 
accept any.prefix.at.all.schema.table.  It sounds like more people like the 
latter, so I'll go with that unless this debate rages on and a different 
conclusion is reached.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Tom Lane
Robert Haas  writes:
> On Mon, Oct 11, 2021 at 10:33 PM Peter Geoghegan  wrote:
>> Being lenient here just doesn't have much downside in practice, as
>> evidenced by the total lack of complaints about that lenience.

> I find it kind of surprising to find everyone agreeing with this
> argument.

If the behavior v14 had implemented were "throw an error if the
first word doesn't match the current database name", perhaps nobody
would have questioned it.  But that's not what we have.  It's fairly
clear that neither you nor Mark thought very much about this case,
let alone tested it.  Given that, I am not very pleased that you
are retroactively trying to justify breaking it by claiming that
it was already broken.  It's been that way since 7.3 implemented
schemas, more or less, and nobody's complained about it.  Therefore
I see little argument for changing that behavior.  Changing it in
an already-released branch is especially suspect.

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2021-10-12 Thread Robert Haas
On Mon, Oct 11, 2021 at 10:33 PM Peter Geoghegan  wrote:
> On Mon, Oct 11, 2021 at 7:09 PM Mark Dilger
>  wrote:
> > I was just wondering when it might be time to stop being lenient in psql 
> > and instead reject malformed identifiers.
>
> I suppose that I probably wouldn't have chosen this behavior in a
> green field situation. But Hyrum's law tells us that there are bound
> to be some number of users relying on it. I don't think that it's
> worth inconveniencing those people without getting a clear benefit in
> return.
>
> Being lenient here just doesn't have much downside in practice, as
> evidenced by the total lack of complaints about that lenience.

I find it kind of surprising to find everyone agreeing with this
argument. I mean, PostgreSQL users are often quick to criticize MySQL
for accepting -00-00 as a date, because it isn't, and you
shouldn't accept garbage and do stuff with it as if it were valid
data. But by the same argument, accepting a database name that we know
is not correct as a request to show data in the current database seems
wrong to me.

I completely agree that somebody might be relying on the fact that \d
thisdb.someschema.sometable does something sensible when logged into
thisdb, but surely no user is relying on \d
jgldslghksdghjsgkhsdgjhskg.someschema.sometable is going to just
ignore the leading gibberish. Nor do I understand why we'd want to
ignore the leading gibberish. Saying, as Tom did, that nobody has
complained about that behavior is just another way of saying that
nobody tested it. Surely if someone had, it wouldn't be like this.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Peter Geoghegan
On Mon, Oct 11, 2021 at 7:09 PM Mark Dilger
 wrote:
> I was just wondering when it might be time to stop being lenient in psql and 
> instead reject malformed identifiers.

I suppose that I probably wouldn't have chosen this behavior in a
green field situation. But Hyrum's law tells us that there are bound
to be some number of users relying on it. I don't think that it's
worth inconveniencing those people without getting a clear benefit in
return.

Being lenient here just doesn't have much downside in practice, as
evidenced by the total lack of complaints about that lenience.

-- 
Peter Geoghegan




Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Mark Dilger



> On Oct 11, 2021, at 4:49 PM, Tom Lane  wrote:
> 
> You are attacking a straw man here.  To use a period in an identifier,
> you have to double-quote it; that's the same in SQL or \d.

That's a strange argument.  If somebody gives an invalid identifier, we 
shouldn't assume they know the proper use of quotations.  Somebody asking for 
a.b.c.d.e is clearly in the dark about something.  Maybe it's the need to quote 
the "a.b" part separately from the "c.d.e" part, or maybe it's something else.  
There are lots of reasonable guesses about what they meant, and for backward 
compatibility reasons we define using the suffix d.e and ignoring the prefix 
a.b.c as the correct answer.  That's a pretty arbitrary thing to do, but it has 
the advantage of being backwards compatible.

>> I expect I'll have to submit a patch restoring the old behavior, but I 
>> wonder if that's the best direction to go.
> 
> I do not understand why you're even questioning that.  The old
> behavior had stood for a decade or two without complaints.

I find the backward compatibility argument appealing, but since we have clients 
that understand the full database.schema.relation format without ignoring the 
database portion, our client behavior is getting inconsistent.  I'd like to 
leave the door open for someday supporting server.database.schema.relation 
format, too.  I was just wondering when it might be time to stop being lenient 
in psql and instead reject malformed identifiers.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Tom Lane
Mark Dilger  writes:
> But since we allow tables and schemas with dotted names in them, I'm 
> uncertain what  \d foo.bar.baz is really asking.  That could be 
> "foo.bar"."baz", or "foo"."bar"."baz", or "foo"."bar.baz", or even 
> "public"."foo.bar.baz".  The old behavior seems a bit dangerous.  There may 
> be tables with all those names, and the user may not have meant the one that 
> we gave them.

You are attacking a straw man here.  To use a period in an identifier,
you have to double-quote it; that's the same in SQL or \d.

regression=# create table "foo.bar" (f1 int);
CREATE TABLE
regression=# \d foo.bar
Did not find any relation named "foo.bar".
regression=# \d "foo.bar"
  Table "public.foo.bar"
 Column |  Type   | Collation | Nullable | Default 
+-+---+--+-
 f1 | integer |   |  | 

According to a quick test, you did not manage to break that in v14.

> I expect I'll have to submit a patch restoring the old behavior, but I wonder 
> if that's the best direction to go.

I do not understand why you're even questioning that.  The old
behavior had stood for a decade or two without complaints.

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Isaac Morland
On Mon, 11 Oct 2021 at 19:35, Mark Dilger 
wrote:


> But since we allow tables and schemas with dotted names in them, I'm
> uncertain what  \d foo.bar.baz is really asking.
>

FWIW, it’s absolutely clear to me that "." is a special character which has
to be quoted in order to be in an identifier. In other words, a.b.c is
three identifiers separated by two period punctuation marks; what exactly
those periods mean is another question. If somebody uses periods in their
names, they have to quote those names just as if they used capital letters
etc.

But that's just my impression. I comment at all because I remember looking
at something to do with the grammar (I think I wanted to implement ALTER …
RENAME TO newschema.newname) and noticed that a database name could be
given in the syntax.


Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Mark Dilger



> On Oct 11, 2021, at 3:37 PM, Tom Lane  wrote:
> 
>> REL_13_STABLE appears to accept any amount of nonsense you like:
> 
> Yeah, I'm pretty sure that the old rule was to just ignore whatever
> appeared in the database-name position.  While we could tighten that
> up to insist that it match the current DB's name, I'm not sure that
> I see the point.  There's no near-term prospect of doing anything
> useful with some other DB's name there, so being more restrictive
> seems like it'll probably break peoples' scripts to little purpose.

You appear correct about the old behavior.  It's unclear how intentional it 
was.  There was a schema buffer and a name buffer, and while parsing the name, 
if a dot was encountered, the contents just parsed were copied into the schema 
buffer.  If multiple dots were encountered, that had the consequence of blowing 
away the earlier ones.

But since we allow tables and schemas with dotted names in them, I'm uncertain 
what  \d foo.bar.baz is really asking.  That could be "foo.bar"."baz", or 
"foo"."bar"."baz", or "foo"."bar.baz", or even "public"."foo.bar.baz".  The old 
behavior seems a bit dangerous.  There may be tables with all those names, and 
the user may not have meant the one that we gave them.

The v14 code is no better.  It just assumes that is "foo"."bar.baz".  So (with 
debugging statements included):

foo=# create table "foo.bar.baz" (i integer);
CREATE TABLE
foo=# \d public.foo.bar.baz
Converting "public.foo.bar.baz"
GOT "^(public)$" . "^(foo.bar.baz)$"
Table "public.foo.bar.baz"
 Column |  Type   | Collation | Nullable | Default 
+-+---+--+-
 i  | integer |   |  | 

I expect I'll have to submit a patch restoring the old behavior, but I wonder 
if that's the best direction to go.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Justin Pryzby
On Mon, Oct 11, 2021 at 02:47:59PM -0700, Mark Dilger wrote:
> > |$ LD_LIBRARY_PATH=tmp_install/usr/local/pgsql/lib/ src/bin/psql/psql -h 
> > /tmp regression
> > |psql (15devel)
> > |Type "help" for help.
> > |regression=# \d regresion.public.bit_defaults
> > |Did not find any relation named "regresion.public.bit_defaults".
> > |regression=# \d public.bit_defaults
> > | Table "public.bit_defaults"
> > |...
> > 
> > This worked before v14 (even though the commit message says otherwise).
> > 
> > |$ /usr/lib/postgresql/13/bin/psql -h /tmp regression
> > |psql (13.2 (Debian 13.2-1.pgdg100+1), server 15devel)
> > |...
> > |regression=# \d regresion.public.bit_defaults
> > | Table "public.bit_defaults"
> > |...
> 
> I can only assume that you are intentionally misspelling "regression" as 
> "regresion" (with only one "s") as part of the test.  I have not checked if 
> that worked before v14, but if it ignored the misspelled database name before 
> v14, and it rejects it now, I'm not sure that counts as a bug. 
> 
> Am I misunderstanding your bug report?

It's not intentional but certainly confusing to put a typo there.
Sorry for that (and good eyes, BTW).

In v15/master:
regression=# \d regression.public.bit_defaults
Did not find any relation named "regression.public.bit_defaults".

After reverting that commit and recompiling psql:
regression=# \d regression.public.bit_defaults
 Table "public.bit_defaults"
...

In v13 psql:
regression=# \d regression.public.bit_defaults
 Table "public.bit_defaults"
...

It looks like before v13 any "datname" prefix was ignored.

But now it fails to show the table because it does:

WHERE c.relname OPERATOR(pg_catalog.~) '^(public.bit_defaults)$' COLLATE 
pg_catalog.default
  AND n.nspname OPERATOR(pg_catalog.~) '^(regression)$' COLLATE 
pg_catalog.default

-- 
Justin




Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Tom Lane
Mark Dilger  writes:
>> On Oct 11, 2021, at 3:04 PM, Tom Lane  wrote:
>> Doesn't work with the correct DB name, either:
>> regression=# \d regression.public.bit_defaults
>> Did not find any relation named "regression.public.bit_defaults".

> REL_13_STABLE appears to accept any amount of nonsense you like:

Yeah, I'm pretty sure that the old rule was to just ignore whatever
appeared in the database-name position.  While we could tighten that
up to insist that it match the current DB's name, I'm not sure that
I see the point.  There's no near-term prospect of doing anything
useful with some other DB's name there, so being more restrictive
seems like it'll probably break peoples' scripts to little purpose.

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Mark Dilger



> On Oct 11, 2021, at 3:26 PM, Justin Pryzby  wrote:
> 
> It looks like before v13 any "datname" prefix was ignored.

The evidence so far suggests that something is broken in v14, but it is less 
clear to me what the appropriate behavior is.  The v14 psql is rejecting even a 
correctly named database.schema.table, but v13 psql accepted 
lots.of.nonsense.schema.table, and neither of those seems at first glance to be 
correct.  But perhaps there are good reasons for ignoring the nonsense prefixes?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Mark Dilger



> On Oct 11, 2021, at 3:04 PM, Tom Lane  wrote:
> 
> Doesn't work with the correct DB name, either:
> 
> regression=# \d public.bit_defaults
> Table "public.bit_defaults"
> Column |  Type  | Collation | Nullable |   Default   
> ++---+--+-
> b1 | bit(4) |   |  | '1001'::"bit"
> b2 | bit(4) |   |  | '0101'::"bit"
> b3 | bit varying(5) |   |  | '1001'::bit varying
> b4 | bit varying(5) |   |  | '0101'::"bit"
> 
> regression=# \d regression.public.bit_defaults
> Did not find any relation named "regression.public.bit_defaults".

REL_13_STABLE appears to accept any amount of nonsense you like:

foo=# \d nonesuch.foo.a.b.c.d.bar.baz
  Table "bar.baz"
 Column |  Type   | Collation | Nullable | Default 
+-+---+--+-
 i  | integer |   |  | 


Is this something we're intentionally supporting?  There is no regression test 
covering this, else we'd have seen breakage in the build-farm.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Tom Lane
Mark Dilger  writes:
> I can only assume that you are intentionally misspelling "regression" as 
> "regresion" (with only one "s") as part of the test.  I have not checked if 
> that worked before v14, but if it ignored the misspelled database name before 
> v14, and it rejects it now, I'm not sure that counts as a bug. 

Doesn't work with the correct DB name, either:

regression=# \d public.bit_defaults
 Table "public.bit_defaults"
 Column |  Type  | Collation | Nullable |   Default   
++---+--+-
 b1 | bit(4) |   |  | '1001'::"bit"
 b2 | bit(4) |   |  | '0101'::"bit"
 b3 | bit varying(5) |   |  | '1001'::bit varying
 b4 | bit varying(5) |   |  | '0101'::"bit"

regression=# \d regression.public.bit_defaults
Did not find any relation named "regression.public.bit_defaults".

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Mark Dilger



> On Oct 11, 2021, at 2:24 PM, Justin Pryzby  wrote:
> 
> This commit broke psql \d datname.nspname.relname
> 
> commit 2c8726c4b0a496608919d1f78a5abc8c9b6e0868
> Author: Robert Haas 
> Date:   Wed Feb 3 13:19:41 2021 -0500
> 
>Factor pattern-construction logic out of processSQLNamePattern.
> ...
>patternToSQLRegex is a little more general than what is required
>by processSQLNamePattern. That function is only interested in
>patterns that can have up to 2 parts, a schema and a relation;
>but patternToSQLRegex can limit the maximum number of parts to
>between 1 and 3, so that patterns can look like either
>"database.schema.relation", "schema.relation", or "relation"
>depending on how it's invoked and what the user specifies.
> 
>processSQLNamePattern only passes two buffers, so works exactly
>the same as before, always interpreting the pattern as either
>a "schema.relation" pattern or a "relation" pattern. But,
>future callers can use this function in other ways.
> 
> |$ LD_LIBRARY_PATH=tmp_install/usr/local/pgsql/lib/ src/bin/psql/psql -h /tmp 
> regression
> |psql (15devel)
> |Type "help" for help.
> |regression=# \d regresion.public.bit_defaults
> |Did not find any relation named "regresion.public.bit_defaults".
> |regression=# \d public.bit_defaults
> | Table "public.bit_defaults"
> |...
> 
> This worked before v14 (even though the commit message says otherwise).
> 
> |$ /usr/lib/postgresql/13/bin/psql -h /tmp regression
> |psql (13.2 (Debian 13.2-1.pgdg100+1), server 15devel)
> |...
> |regression=# \d regresion.public.bit_defaults
> | Table "public.bit_defaults"
> |...

I can only assume that you are intentionally misspelling "regression" as 
"regresion" (with only one "s") as part of the test.  I have not checked if 
that worked before v14, but if it ignored the misspelled database name before 
v14, and it rejects it now, I'm not sure that counts as a bug. 

Am I misunderstanding your bug report?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







pg14 psql broke \d datname.nspname.relname

2021-10-11 Thread Justin Pryzby
This commit broke psql \d datname.nspname.relname

commit 2c8726c4b0a496608919d1f78a5abc8c9b6e0868
Author: Robert Haas 
Date:   Wed Feb 3 13:19:41 2021 -0500

Factor pattern-construction logic out of processSQLNamePattern.
...
patternToSQLRegex is a little more general than what is required
by processSQLNamePattern. That function is only interested in
patterns that can have up to 2 parts, a schema and a relation;
but patternToSQLRegex can limit the maximum number of parts to
between 1 and 3, so that patterns can look like either
"database.schema.relation", "schema.relation", or "relation"
depending on how it's invoked and what the user specifies.

processSQLNamePattern only passes two buffers, so works exactly
the same as before, always interpreting the pattern as either
a "schema.relation" pattern or a "relation" pattern. But,
future callers can use this function in other ways.

|$ LD_LIBRARY_PATH=tmp_install/usr/local/pgsql/lib/ src/bin/psql/psql -h /tmp 
regression
|psql (15devel)
|Type "help" for help.
|regression=# \d regresion.public.bit_defaults
|Did not find any relation named "regresion.public.bit_defaults".
|regression=# \d public.bit_defaults
| Table "public.bit_defaults"
|...

This worked before v14 (even though the commit message says otherwise).

|$ /usr/lib/postgresql/13/bin/psql -h /tmp regression
|psql (13.2 (Debian 13.2-1.pgdg100+1), server 15devel)
|...
|regression=# \d regresion.public.bit_defaults
| Table "public.bit_defaults"
|...

-- 
Justin