Re: [PATCH] Pull general SASL framework out of SCRAM

2021-07-10 Thread Michael Paquier
On Fri, Jul 09, 2021 at 11:31:48PM +0000, Jacob Champion wrote:
> On Thu, 2021-07-08 at 16:27 +0900, Michael Paquier wrote:
>> + *  outputlen: The length (0 or higher) of the client response buffer,
>> + * invalid if output is NULL.
> 
> nitpick: maybe "ignored" instead of "invalid"?

Thanks, applied as 44bd012 after using your suggestion.

Another thing I noticed after more review is that the check in
fe-auth.c, which makes sure that a message is generated if the
exchange is not completed yet, has no need to depend on "success", only
on "done".
--
Michael




Re: unnesting multirange data types

2021-07-10 Thread Alexander Korotkov
 On Sun, Jul 11, 2021 at 1:28 AM Tom Lane  wrote:
> Justin Pryzby  writes:
> > On Sun, Jul 11, 2021 at 01:00:27AM +0300, Alexander Korotkov wrote:
> >> True, but I'm a bit uncomfortable about user instances with different
> >> catalogs but the same catversions.  On the other hand, initdb's with
> >> beta3 or later will be the vast majority among pg14 instances.
> >>
> >> Did we have similar precedents in the past?
>
> > It seems so.
>
> If it's *only* the description strings you want to change, then yeah,
> we've done that before.

My patch also changes 'oprjoin' from 'scalargtjoinsel' to
'scalarltjoinsel'.  The implementation is the same, but 'scalarltjoinsel'
looks more logical here.

--
Regards,
Alexander Korotkov




Re: Unused function parameter in get_qual_from_partbound()

2021-07-10 Thread Soumyadeep Chakraborty
> Marking this as ready for committer. It can be committed when the branch
> is cut for 15.

I see that REL_14_STABLE is already cut. So this can go in now.




Re: Unused function parameter in get_qual_from_partbound()

2021-07-10 Thread Soumyadeep Chakraborty
Hello,

Googling around, I didn't find any extensions that would break from this
change. Even if there are any, this change will simplify the relevant
callsites. It also aligns the interface nicely with get_qual_for_hash,
get_qual_for_list and get_qual_for_range.

Marking this as ready for committer. It can be committed when the branch
is cut for 15.

Regards,
Soumyadeep (VMware)




Re: unnesting multirange data types

2021-07-10 Thread Tom Lane
Justin Pryzby  writes:
> On Sun, Jul 11, 2021 at 01:00:27AM +0300, Alexander Korotkov wrote:
>> True, but I'm a bit uncomfortable about user instances with different
>> catalogs but the same catversions.  On the other hand, initdb's with
>> beta3 or later will be the vast majority among pg14 instances.
>> 
>> Did we have similar precedents in the past?

> It seems so.

If it's *only* the description strings you want to change, then yeah,
we've done that before.

regards, tom lane




Re: unnesting multirange data types

2021-07-10 Thread Justin Pryzby
On Sun, Jul 11, 2021 at 01:00:27AM +0300, Alexander Korotkov wrote:
> On Sat, Jul 10, 2021 at 7:34 PM Alvaro Herrera wrote:
> > On 2021-Jun-27, Alexander Korotkov wrote:
> >
> > > BTW, I found some small inconsistencies in the declaration of
> > > multirange operators in the system catalog.  Nothing critical, but if
> > > we decide to bump catversion in beta3, this patch is also nice to
> > > push.
> >
> > Hmm, I think you should push this and not bump catversion.  That way,
> > nobody is forced to initdb if we end up not having a catversion bump for
> > some other reason; but also anybody who initdb's with beta3 or later
> > will get the correct descriptions.
> >
> > If you don't push it, everybody will have the wrong descriptions.
> 
> True, but I'm a bit uncomfortable about user instances with different
> catalogs but the same catversions.  On the other hand, initdb's with
> beta3 or later will be the vast majority among pg14 instances.
> 
> Did we have similar precedents in the past?

It seems so.

Note in particular 74ab96a45, which adds a new function with no bump.
Although that one may not be a good precedent to follow, or one that's been
followed recently.

commit 0aac73e6a2602696d23aa7a9686204965f9093dc
Author: Noah Misch 
Date:   Mon Jun 14 17:29:37 2021 -0700

Copy-edit text for the pg_terminate_backend() "timeout" parameter.

commit b09a64d602a19c9a3cc69e4bb0f8986a6f5facf4
Author: Tom Lane 
Date:   Thu Sep 20 16:06:18 2018 -0400

Add missing pg_description strings for pg_type entries.

commit a4627e8fd479ff74fffdd49ad07636b79751be45
Author: Tom Lane 
Date:   Tue Feb 2 11:39:50 2016 -0500

Fix pg_description entries for jsonb_to_record() and jsonb_to_recordset().

commit b852dc4cbd09156e2c74786d5b265f03d45bc404
Author: Bruce Momjian 
Date:   Wed Oct 7 09:06:49 2015 -0400

docs:  clarify JSONB operator descriptions

commit a80889a7359e720107b881bcd3e8fd47f3874e36
Author: Tom Lane 
Date:   Wed Oct 10 12:19:25 2012 -0400

Set procost to 10 for each of the pg_foo_is_visible() functions.

commit c246eb5aafe66d5537b468d6da2116c462775faf
Author: Tom Lane 
Date:   Sat Aug 18 16:14:57 2012 -0400

Make use of LATERAL in information_schema.sequences view.

commit 74ab96a45ef6259aa6a86a781580edea8488511a
Author: Alvaro Herrera 
Date:   Wed Jan 25 13:15:29 2012 -0300

Add pg_trigger_depth() function

commit ddd6ff289f2512f881493b7793853a96955459ff
Author: Bruce Momjian 
Date:   Tue Mar 15 11:26:20 2011 -0400

Add database comments to template0 and postgres databases, and improve




Re: unnesting multirange data types

2021-07-10 Thread Alexander Korotkov
On Sat, Jul 10, 2021 at 7:34 PM Alvaro Herrera  wrote:
> On 2021-Jun-27, Alexander Korotkov wrote:
>
> > BTW, I found some small inconsistencies in the declaration of
> > multirange operators in the system catalog.  Nothing critical, but if
> > we decide to bump catversion in beta3, this patch is also nice to
> > push.
>
> Hmm, I think you should push this and not bump catversion.  That way,
> nobody is forced to initdb if we end up not having a catversion bump for
> some other reason; but also anybody who initdb's with beta3 or later
> will get the correct descriptions.
>
> If you don't push it, everybody will have the wrong descriptions.

True, but I'm a bit uncomfortable about user instances with different
catalogs but the same catversions.  On the other hand, initdb's with
beta3 or later will be the vast majority among pg14 instances.

Did we have similar precedents in the past?

--
Regards,
Alexander Korotkov




Re: Support kerberos authentication for postgres_fdw

2021-07-10 Thread Magnus Hagander
On Fri, Jul 9, 2021 at 3:49 PM Tom Lane  wrote:
>
> Peifeng Qiu  writes:
> > I'd like to add kerberos authentication support for postgres_fdw by adding
> > two options to user mapping: krb_client_keyfile and gssencmode.
>
> As you note, this'd have to be restricted to superusers, which makes it
> seem like a pretty bad idea.  We really don't want to be in a situation
> of pushing people to run day-to-day stuff as superuser.  Yeah, having
> access to kerberos auth sounds good on the surface, but it seems like
> it would be a net loss in security because of that.
>
> Is there some other way?

ISTM the right way to do this would be using Kerberos delegation. That
is, the system would be set up so that the postgres service principal
is trusted for kerberos delegation and it would then pass through the
actual Kerberos authentication from the client.

At least at a quick glance this does not look like what this patch is
doing, sadly.

What does kerberos auth with a fixed key on the client (being the
postgres server in this auth step) actually help with?
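
For context, the proposal as posted would hang the two new options off the
user mapping, roughly like this (the server, role and file names here are
made up for illustration):

  CREATE USER MAPPING FOR app_user SERVER remote_pg
    OPTIONS (krb_client_keyfile '/etc/postgresql/app_user.keytab',
             gssencmode 'require');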

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Add ZSON extension to /contrib/

2021-07-10 Thread Tomas Vondra

On 7/3/21 12:34 PM, Peter Eisentraut wrote:

On 04.06.21 17:09, Aleksander Alekseev wrote:

I decided to add the patch to the nearest commitfest.


With respect to the commit fest submission, I don't think there is 
consensus right now to add this.  I think people would prefer that this 
dictionary facility be somehow made available in the existing JSON 
types.  Also, I sense that there is still some volatility about some of 
the details of how this extension should work and its scope.  I think 
this is served best as an external extension for now.


I agree there's a lot of open questions to figure out, but I think this 
"column-level compression" capability has a lot of potential. Not just 
for structured documents like JSON, but maybe even for scalar types.


I don't think the question whether this should be built into jsonb, a 
separate built-in type, contrib type or something external is the one we 
need to answer first.


The first thing I'd like to see is some "proof" that it's actually 
useful in practice - there were some claims about people/customers using 
it and being happy with the benefits, but there were no actual examples 
of data sets that are expected to benefit, compression ratios etc. And 
considering that [1] went unnoticed for 5 years, I have my doubts about 
it being used very widely. (I may be wrong and maybe people are just not 
casting jsonb to zson.)
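
For reference, opting in looks roughly like this -- a sketch from memory of
the zson README, so the exact function signature may differ:

  CREATE EXTENSION zson;
  -- scan existing documents and build the shared compression dictionary
  SELECT zson_learn('{{"mytable", "jsonb_col"}}');
  -- switch the column over to the dictionary-compressed type
  ALTER TABLE mytable ALTER COLUMN jsonb_col TYPE zson;

The benefit only materializes for columns that were explicitly converted
like that.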


I've tried to use this on the one large non-synthetic JSONB dataset I 
had at hand at the moment, which is the bitcoin blockchain. That's ~1TB 
with JSONB, and when I tried using ZSON instead there was no measurable 
benefit, in fact the database was a bit larger. But I admit btc data is 
rather strange, because it contains a lot of randomness (all the tx and 
block IDs are random-looking hashes, etc.), and there's a lot of them in 
each document. So maybe that's simply a data set that can't benefit from 
zson on principle.


I also suspect the zson_extract_strings() is pretty inefficient and I 
ran into various issues with the btc blocks which have very many keys, 
often far more than the 10k limit.


In any case, I think having a clear example(s) of practical data sets 
that benefit from using zson would be very useful, both to guide the 
development and to show what the potential gains are.


The other thing is that a lot of the stuff seems to be manual (e.g. the 
learning), and not really well integrated with the core. IMO improving 
this by implementing the necessary infrastructure would help all the 
possible cases (built-in type, contrib, external extension).



regards

[1] 
https://github.com/postgrespro/zson/commit/02db084ea3b94d9e68fd912dea97094634fcdea5


--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal

2021-07-10 Thread Dean Rasheed
On Sat, 10 Jul 2021 at 18:30, Tom Lane  wrote:
>
> [ moving to pghackers for wider visibility ]
>
> Dean Rasheed  writes:
> > On Sat, 10 Jul 2021 at 16:01, Tom Lane  wrote:
> >> In general, I'm disturbed that we just threw away the previous
> >> promise that numeric multiplication results were exact.  That
> >> seems like a pretty fundamental property --- which is stated
> >> in so many words in the manual, btw --- and I'm not sure I want
> >> to give it up.
>
> > Perhaps we should amend the statement about numeric multiplication to
> > say that it's exact within the limits of the numeric type's supported
> > scale, which we also document in the manual as 16383.
> > That seems a lot better than throwing an overflow error for a result
> > that isn't very big, which limits what's possible with numeric
> > multiplication to much less than 16383 digits.
>
> TBH, I don't agree.  I think this is strictly worse than what we
> did before, and we should just revert it.  It's no longer possible
> to reason about what numeric multiplication will do.  I think
> throwing an error if we can't represent the result exactly is a
> preferable behavior.  If you don't want exact results, use float8.

Actually, I think it makes it a lot easier to reason about what it
will do -- it'll return the exact result, or the exact result
correctly rounded to 16383 digits.

That seems perfectly reasonable to me, since almost every numeric
operation ends up having to round at some point, and almost all of
them don't produce exact results.

The previous behaviour might seem like a principled stance to take,
from a particular perspective, but it's pretty hard to work with in
practice.

For example, say you had 2 numbers, each with 10000 digits after the
decimal point, that you wanted to multiply. If it throws an overflow
error for results with more than 16383 digits after the decimal point,
then you'd have to round one or both input numbers before multiplying.
To get the most accurate result possible, you'd actually have to round
each number to have the same number of *significant digits*, rather
than the same number of digits after the decimal point. In general,
that's quite complicated -- suppose, after rounding, you had

  x = [x1 digits] . [x2 digits]
  y = [y1 digits] . [y2 digits]

where x1+x2 = y1+y2 to minimise the error in the final result, and
x2+y2 = 16383 to maximise the result scale. x1 and y1 are known
(though we don't have a convenient way to obtain them), so that's a
pair of simultaneous equations to solve to decide the optimal amount
of rounding to apply before multiplying. And after all that, the
result will only be accurate to around x1+x2 (or y1+y2) significant
digits, and around x1+y1 of those will be before the decimal point, so
even though it will return a result with 16383 digits after the
decimal point, lots of those digits will be completely wrong.

With the new code, you just multiply the numbers and the result is
correctly rounded to 16383 digits.
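
As a rough illustration (made-up values, not an example from the thread),
consider two inputs whose exact product would need more than 16383
fractional digits:

  -- Each input has 10000 digits after the decimal point, so the exact
  -- product would need 20000.  Previously this raised a numeric overflow
  -- error; with the change it returns the product correctly rounded to
  -- the maximum supported scale of 16383 digits.
  SELECT round(x * y, 30)
  FROM (SELECT (1 + 3e-10000)::numeric AS x,
               (1 + 7e-10000)::numeric AS y) AS t;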

In general, I'd argue that using numeric isn't about getting exact
results, since nearly all real computations don't have an exact
result. Really, it's about getting high-precision results that would
be impossible with float8. (And if you don't care about numbers with
16383 digits after the decimal point, this change won't affect you.)

Regards,
Dean




Re: [EXTERNAL] Re: Crash in record_type_typmod_compare

2021-07-10 Thread Jeff Davis
On Mon, 2021-04-05 at 12:07 +, Sait Talha Nisanci wrote:
> Hi Andres,
> 
> Please see the updated patch, do you mean something like this? (there
> might be a simpler way for doing this)
> 

Committed with minor revisions.

My patch also avoids incrementing NextRecordTypmod until we've already
called CreateTupleDescCopy(). Otherwise, an allocation failure could
leave NextRecordTypmod unnecessarily incremented, which is a tiny leak.
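
In other words, the ordering is roughly this (illustrative fragment only,
not the exact typcache.c code):

  TupleDesc	entDesc;

  /* The copy can fail on out-of-memory, so do it first ... */
  entDesc = CreateTupleDescCopy(tupDesc);
  /* ... and only consume a typmod number once nothing else can fail. */
  entDesc->tdtypmod = NextRecordTypmod++;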

Regards,
Jeff Davis






Re: pgsql: Fix numeric_mul() overflow due to too many digits after decimal

2021-07-10 Thread Tom Lane
[ moving to pghackers for wider visibility ]

Dean Rasheed  writes:
> On Sat, 10 Jul 2021 at 16:01, Tom Lane  wrote:
>> In general, I'm disturbed that we just threw away the previous
>> promise that numeric multiplication results were exact.  That
>> seems like a pretty fundamental property --- which is stated
>> in so many words in the manual, btw --- and I'm not sure I want
>> to give it up.

> Perhaps we should amend the statement about numeric multiplication to
> say that it's exact within the limits of the numeric type's supported
> scale, which we also document in the manual as 16383.
> That seems a lot better than throwing an overflow error for a result
> that isn't very big, which limits what's possible with numeric
> multiplication to much less than 16383 digits.

TBH, I don't agree.  I think this is strictly worse than what we
did before, and we should just revert it.  It's no longer possible
to reason about what numeric multiplication will do.  I think
throwing an error if we can't represent the result exactly is a
preferable behavior.  If you don't want exact results, use float8.

regards, tom lane




Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2021-07-10 Thread Fabien COELHO



Hello,


Of course, users themselves should be careful of problematic script, but it
would be better that pgbench itself avoids problems if pgbench can beforehand.


Or, we should terminate the last cycle of the benchmark, regardless of
whether it is retrying or not, if -T expires. This will make pgbench
behave much more consistently.


I would tend to agree with this behavior, that is not to start any new 
transaction or transaction attempt once -T has expired.


I'm a little hesitant about how to count and report transactions left
unfinished because of the benchmark timeout, though. Not counting them
seems to be the best option.



Hmmm, indeed this might make the behaviour a bit more consistent, but I am
not sure such a behavioural change benefits users.


The user benefit would be that if they asked for a 100s benchmark, pgbench
makes a reasonable effort not to overshoot that?


--
Fabien.




Re: Enhanced error message to include hint messages for redundant options error

2021-07-10 Thread vignesh C
On Sat, Jul 10, 2021 at 4:31 PM Dean Rasheed  wrote:
>
> On Sat, 10 Jul 2021 at 11:44, Dean Rasheed  wrote:
> >
> > I'm inclined to think that it isn't worth the effort trying to
> > distinguish between conflicting options, options specified more than
> > once and faked-up options that weren't really specified. If we just
> > make errorConflictingDefElem() report "conflicting or redundant
> > options", then it's much easier to update calling code without making
> > mistakes. The benefit then comes from the reduced code size and the
> > fact that the patch includes pstate in more places, so the
> > parser_errposition() indicator helps the user identify the problem.
> >
> > In file_fdw_validator(), where there is no pstate, it's already using
> > "specified more than once" as a hint to clarify the "conflicting or
> > redundant options" error, so I think we should leave that alone.
> >
>
> Another possibility would be to pass the option list to
> errorConflictingDefElem() and it could work out whether or not to
> include the "option \"%s\" specified more than once" hint, since that
> hint probably is useful, and working out whether to include it is
> probably less error-prone if it's done there.

I'm planning to handle conflicting errors separately after this
current work is done. Once the patch is changed to include just the valid
scenarios (removing the scenarios you have pointed out), the existing
function can work as is without any changes. Thoughts?

Regards,
Vignesh




Re: psql - factor out echo code

2021-07-10 Thread Fabien COELHO


Hello Vignesh,


I am changing the status to "Needs review" as the review is not
completed for this patch and also there are some tests failing, that
need to be fixed:
test test_extdepend   ... FAILED   50 ms


Indeed,

Attached v4 simplifies the format and fixes this one.
I ran check-world, this time.

--
Fabien.

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 543401c6d6..8ec00881db 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -2061,7 +2061,7 @@ exec_command_password(PsqlScanState scan_state, bool active_branch)
 printfPQExpBuffer(&buf, "ALTER USER %s PASSWORD ",
   fmtId(user));
 appendStringLiteralConn(&buf, encrypted_password, pset.db);
-res = PSQLexec(buf.data);
+res = PSQLexec("PASSWORD", buf.data);
 termPQExpBuffer(&buf);
 if (!res)
 	success = false;
@@ -4993,22 +4993,13 @@ do_watch(PQExpBuffer query_buf, double sleep)
  * returns true unless we have ECHO_HIDDEN_NOEXEC.
  */
 static bool
-echo_hidden_command(const char *query)
+echo_hidden_command(const char *origin, const char *query)
 {
 	if (pset.echo_hidden != PSQL_ECHO_HIDDEN_OFF)
 	{
-		printf(_("********* QUERY **********\n"
-				 "%s\n"
-				 "**************************\n\n"), query);
-		fflush(stdout);
+		echoQuery(stdout, origin, query);
 		if (pset.logfile)
-		{
-			fprintf(pset.logfile,
-			_("********* QUERY **********\n"
-			  "%s\n"
-			  "**************************\n\n"), query);
-			fflush(pset.logfile);
-		}
+			echoQuery(pset.logfile, origin, query);
 
 		if (pset.echo_hidden == PSQL_ECHO_HIDDEN_NOEXEC)
 			return false;
@@ -5060,7 +5051,7 @@ lookup_object_oid(EditableObjectType obj_type, const char *desc,
 			break;
 	}
 
-	if (!echo_hidden_command(query->data))
+	if (!echo_hidden_command("INTERNAL", query->data))
 	{
 		destroyPQExpBuffer(query);
 		return false;
@@ -5155,7 +5146,7 @@ get_create_object_cmd(EditableObjectType obj_type, Oid oid,
 			break;
 	}
 
-	if (!echo_hidden_command(query->data))
+	if (!echo_hidden_command("INTERNAL", query->data))
 	{
 		destroyPQExpBuffer(query);
 		return false;
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 9a00499510..50e8023c90 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -523,6 +523,17 @@ PrintTiming(double elapsed_msec)
 		   elapsed_msec, days, (int) hours, (int) minutes, seconds);
 }
 
+/*
+ * Echo user query
+ */
+void
+echoQuery(FILE *out, const char *origin, const char *query)
+{
+	fprintf(out,
+			/* FIXME should this really be translatable? */
+			_("-- %s QUERY:\n%s\n\n"), origin, query);
+	fflush(out);
+}
 
 /*
  * PSQLexec
@@ -537,7 +548,7 @@ PrintTiming(double elapsed_msec)
  * caller uses this path to issue "SET CLIENT_ENCODING".
  */
 PGresult *
-PSQLexec(const char *query)
+PSQLexec(const char *origin, const char *query)
 {
 	PGresult   *res;
 
@@ -549,18 +560,9 @@ PSQLexec(const char *query)
 
 	if (pset.echo_hidden != PSQL_ECHO_HIDDEN_OFF)
 	{
-		printf(_("********* QUERY **********\n"
-				 "%s\n"
-				 "**************************\n\n"), query);
-		fflush(stdout);
+		echoQuery(stdout, origin, query);
 		if (pset.logfile)
-		{
-			fprintf(pset.logfile,
-			_("********* QUERY **********\n"
-			  "%s\n"
-			  "**************************\n\n"), query);
-			fflush(pset.logfile);
-		}
+			echoQuery(pset.logfile, origin, query);
 
 		if (pset.echo_hidden == PSQL_ECHO_HIDDEN_NOEXEC)
 			return NULL;
@@ -856,10 +858,7 @@ ExecQueryTuples(const PGresult *result)
  * assumes that MainLoop did that, so we have to do it here.
  */
 if (pset.echo == PSQL_ECHO_ALL && !pset.singlestep)
-{
-	puts(query);
-	fflush(stdout);
-}
+	echoQuery(stdout, "GEXEC", query);
 
 if (!SendQuery(query))
 {
@@ -1226,13 +1225,7 @@ SendQuery(const char *query)
 	}
 
 	if (pset.logfile)
-	{
-		fprintf(pset.logfile,
-			_("********* QUERY **********\n"
-			  "%s\n"
-			  "**************************\n\n"), query);
-		fflush(pset.logfile);
-	}
+		echoQuery(pset.logfile, "USER", query);
 
 	SetCancelConn(pset.db);
 
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index 041b2ac068..cdad55414a 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -28,9 +28,10 @@ extern sigjmp_buf sigint_interrupt_jmp;
 
 extern void psql_setup_cancel_handler(void);
 
-extern PGresult *PSQLexec(const char *query);
+extern PGresult *PSQLexec(const char *origin, const char *query);
 extern int	PSQLexecWatch(const char *query, const printQueryOpt *opt);
 
+extern void echoQuery(FILE *out, const char *origin, const char *query);
 extern bool SendQuery(const char *query);
 
 extern bool is_superuser(void);
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2abf255798..b7f701a68a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -127,7 +127,7 @@ describeAggregates(const char *pattern, bool verbose, bool showSystem)
 
 	appendPQExpBufferStr(, "ORDER BY 1, 2, 4;");
 
-	res = PSQLexec(buf.data);
+	res 

Re: unnesting multirange data types

2021-07-10 Thread Alvaro Herrera
On 2021-Jun-27, Alexander Korotkov wrote:

> BTW, I found some small inconsistencies in the declaration of
> multirange operators in the system catalog.  Nothing critical, but if
> we decide to bump catversion in beta3, this patch is also nice to
> push.

Hmm, I think you should push this and not bump catversion.  That way,
nobody is forced to initdb if we end up not having a catversion bump for
some other reason; but also anybody who initdb's with beta3 or later
will get the correct descriptions.

If you don't push it, everybody will have the wrong descriptions.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Pipeline mode and PQpipelineSync()

2021-07-10 Thread Alvaro Herrera
On 2021-Jul-08, Boris Kolpackov wrote:

> Alvaro Herrera  writes:
> 
> > To be honest, I am hesitant to changing the charter in that way; I fear
> > it may have consequences I don't foresee.  I think the workaround is not
> > *that* bad.
> 
> Ok, fair enough. I've updated my code to account for this and it seems
> to be working fine now.

Great, thanks.  I have pushed the fix, so beta3 (when it is released)
should work well for you.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ab09679429009bfed4bd894a6187afde0b7bdfcd

> How about the following for the second sentence:
> 
> "In particular, a call to PQisBusy in the middle
> of a pipeline returns 0 if all the results for queries issued so far
> have been consumed."

I used this wording, thanks.

On 2021-Jul-08, Alvaro Herrera wrote:

> Looking at this again, I noticed that I could probably do away with the
> switch on pipelineStatus, and just call pqPipelineProcessQueue in all
> cases when appending commands to the queue; I *think* that will do the
> right thing in all cases.  *Except* that I don't know what will happen
> if the program is in the middle of processing a result in single-row
> mode, and then sends another query: that would wipe out the pending
> results of the query being processed ...  but maybe that problem can
> already occur in some other way.

I tried this and it doesn't work.  It doesn't seem interesting to
pursue anyway, so I'll just drop the idea.  (I did notice that the
comment on single-row mode was wrong, though, since
pqPipelineProcessQueue does nothing in READY_MORE state, which is what
it is in the middle of processing a result.)

Thanks for all the help in testing and reviewing,

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)




Re: Enhanced error message to include hint messages for redundant options error

2021-07-10 Thread vignesh C
On Sat, Jul 10, 2021 at 4:14 PM Dean Rasheed  wrote:
>
> On Thu, 8 Jul 2021 at 14:40, vignesh C  wrote:
> >
> > On Thu, Jul 8, 2021 at 1:52 AM Daniel Gustafsson  wrote:
> > >
> > > I sort of like the visual cue of seeing ereport(ERROR ..  since it makes it
> > > clear it will break execution then and there, this will require a lookup for
> > > anyone who don't know the function by heart.  That being said, reducing
> > > duplicated boilerplate has clear value and this reduce the risk of introducing
> > > strings which are complicated to translate.  On the whole I think this is a net
> > > win, and the patch looks pretty good.
> > >
>
> Bikeshedding the function name, there are several similar examples in
> the existing code, but the closest analogs are probably
> errorMissingColumn() and errorMissingRTE(). So I think
> errorConflictingDefElem() would be better, since it's slightly more
> obviously an error.
>

Ok, I will change it to keep it similar.

> Also, I don't think this function should be marked inline -- using a
> normal function ought to help make the compiled code smaller.
>

inline instructs the compiler to attempt to embed the function content
into the calling code instead of executing an actual call. I think we
should keep it inline to reduce the function call overhead.

> A bigger problem is that the patch replaces about 100 instances of the
> error "conflicting or redundant options" with "option \"%s\" specified
> more than once", but that's not always the appropriate thing to do.
> For example, in the walsender code, the error isn't necessarily due to
> the option being specified more than once.
>

This patch intended to change "conflicting or redundant options" to
"option \"%s\" specified more than once" only in cases where the error is
for an option specified more than once. That change is not required there,
so I will remove it.

> Also, there are cases where def->defname isn't actually the name of
> the option specified, so including it in the error is misleading. For
> example:
>
> CREATE OR REPLACE FUNCTION foo() RETURNS int
> AS $$ SELECT 1 $$ STABLE IMMUTABLE;
>
> ERROR:  option "volatility" specified more than once
> LINE 2: AS $$ SELECT 1 $$ STABLE IMMUTABLE;
>  ^
>
> and in this case "volatility" is an internal string, so it won't get 
> translated.
>
> I'm inclined to think that it isn't worth the effort trying to
> distinguish between conflicting options, options specified more than
> once and faked-up options that weren't really specified. If we just
> make errorConflictingDefElem() report "conflicting or redundant
> options", then it's much easier to update calling code without making
> mistakes. The benefit then comes from the reduced code size and the
> fact that the patch includes pstate in more places, so the
> parser_errposition() indicator helps the user identify the problem.
>
> In file_fdw_validator(), where there is no pstate, it's already using
> "specified more than once" as a hint to clarify the "conflicting or
> redundant options" error, so I think we should leave that alone.

This patch intended to change "conflicting or redundant options" to
"option \"%s\" specified more than once" only in cases where the error is
for an option specified more than once. Thanks for pointing out a few
places where the actual error "conflicting or redundant options"
should be left as it is. I will post a new patch which removes the
conflicting-options error scenarios, which were not targeted by this
patch.

Regards,
Vignesh




Re: proposal: possibility to read dumped table's name from file

2021-07-10 Thread Tomas Vondra

Hi,

I started looking at the patch allowing to export just functions [1], 
and I got pointed to this patch as an alternative approach (to adding a 
separate filtering option for every possible object type).


I'm familiar with the customer that inspired Pavel to start working on 
this, so I understand the use case he's trying to address - a flexible 
way to filter (include/exclude) large number of objects.


IMHO it's a mistake to try to broaden the scope of the patch and require 
implementing some universal pg_dump config file, particularly if it 
requires "complex" structure or formats like JSON, TOML or whatever. 
Maybe that's worth doing, but in my mind it's orthogonal to what this 
patch aims (or aimed) to do - filtering objects using rules in a file, 
not on the command line.


I believe it's much closer to .gitignore or rsync --filter than to a 
full config file. Even if we end up implementing the pg_dump config 
file, it'd be nice to keep the filter rules in a separate file and just 
reference that file from the config file.


That also means I find it pointless to use an "advanced" format like 
JSON or TOML - I think the format should be as simple as possible. Yes, 
it has to support all valid identifiers, comments and so on. But I don't 
quite see a point in using JSON or similar "full" format. If a simple 
format is good enough for rsync or gitignore, why should we insist on 
using something more complex?


OTOH I don't quite like the current approach of simply reading options 
from a file, because that requires adding new command-line options for 
each type of object we want to support. Which seems to contradict the 
idea of "general filter" method as mentioned in [1].


So if it was up to me, I'd go back to the original format or something
close to it. So something like this:


[+-] OBJECT_TYPE_PATTERN OBJECT_NAME_PATTERN
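
For example (the object type keywords here are only illustrative):

  # dump all tables matching sales_*, except one, plus two functions
  + table sales_*
  - table public.sales_archive
  + function public.refresh_sales_summary
  + function public.prune_old_sales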


regards


[1] https://commitfest.postgresql.org/33/3051/

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: psql - factor out echo code

2021-07-10 Thread vignesh C
On Sat, Jul 3, 2021 at 3:07 AM Fabien COELHO  wrote:
>
>
> >  "-- #  QUERY\n%s\n\n"
>
> Attached an attempt along those lines. I found another duplicate of the
> ascii-art printing in another function.
>
> Completion queries seems to be out of the echo/echo hidden feature.
>
> Incredible, there is a (small) impact on regression tests for the \gexec
> case. All other changes have no impact, because they are not tested:-(

I am changing the status to "Needs review" as the review is not
completed for this patch and also there are some tests failing, that
need to be fixed:
test test_extdepend   ... FAILED   50 ms

Regards,
Vignesh




Re: Reduce lock level for ALTER TABLE ... ADD CHECK .. NOT VALID

2021-07-10 Thread John Naylor
On Thu, Apr 22, 2021 at 8:01 AM Simon Riggs 
wrote:
>
> 897795240cfaaed724af2f53ed2c50c9862f951f forgot to reduce the lock
> level for CHECK constraints when allowing them to be NOT VALID.
>
> This is simple and safe, since check constraints are not used in
> planning until validated.

The patch also reduces the lock level when NOT VALID is not specified,
which didn't seem to be the intention.

# begin;
BEGIN
*# alter table alterlock2 add check (f1 > 0);
ALTER TABLE
*# select * from my_locks order by 1;
  relname   | max_lockmode
+---
 alterlock2 | ShareRowExclusiveLock
(1 row)

--
John Naylor
EDB: http://www.enterprisedb.com


Re: bugfix: when the blocksize is 32k, the function page_header of pageinspect returns negative numbers.

2021-07-10 Thread Bharath Rupireddy
On Sat, Jul 10, 2021 at 4:59 PM Michael Paquier  wrote:
> So I would finish with the attached, close enough to what Quan has
> sent upthread.

Thanks. The patch looks good to me, except a minor comment - isn't it
"int2 for these fields" as the fields still exist? + /* pageinspect >=
1.10 uses int4 instead of int2 for those fields */

Regards,
Bharath Rupireddy.




Re: EXPLAIN/EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW

2021-07-10 Thread Bharath Rupireddy
On Sat, Jul 10, 2021 at 5:19 PM John Naylor
 wrote:
> Side note for future reference: While the feature named in the CF entry has 
> been rejected, the remaining 0001 patch currently proposed no longer matches 
> the title, or category. It is possible within the CF app, and helpful, to 
> rename the entry when the scope changes.
>
> The proposed patch in the CF for incremental view maintenance [1] does some 
> refactoring of its own in implementing the feature. I don't think it makes 
> sense to commit a refactoring that conflicts with that, while not necessarily 
> making life easier for that feature. Incremental view maintenance is highly 
> desirable, so I don't want to put up unnecessary roadblocks.

Thanks. I'm okay to close the CF
entry (https://commitfest.postgresql.org/33/2928/) and stop this
thread.

Regards,
Bharath Rupireddy.




Re: proposal - psql - use pager for \watch command

2021-07-10 Thread vignesh C
On Wed, May 12, 2021 at 5:45 PM Pavel Stehule  wrote:
>
>
>
> On Wed, May 12, 2021 at 12:25, Pavel Stehule wrote:
>>
>> Hi
>>
>> On Wed, Apr 21, 2021 at 8:52, Pavel Stehule wrote:
>>>
>>>
>>>
>>> On Wed, Apr 21, 2021 at 8:49, Thomas Munro wrote:

 On Wed, Apr 21, 2021 at 6:33 PM Pavel Stehule wrote:
 > here is an rebase of Thomas's implementation

 Thanks.  I finished up not committing that one for 14 because I wasn't
 sure about the way to rebase it on top of 3a513067 (now reverted);
 that "restore" stuff seemed a bit weird.  Let's try again in v15 CF1!
>>>
>>>
>>> Understand. Thank you
>>
>>
>> rebase
>
>
> looks so with your patch psql doesn't work on ms
>
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.134648

I am changing the status to "Waiting on Author" as Pavel's comments
are not addressed.

Regards,
Vignesh




Re: Parallel Full Hash Join

2021-07-10 Thread vignesh C
On Mon, May 31, 2021 at 10:47 AM Greg Nancarrow  wrote:
>
> On Sat, Mar 6, 2021 at 12:31 PM Thomas Munro  wrote:
> >
> > On Tue, Mar 2, 2021 at 11:27 PM Thomas Munro  wrote:
> > > On Fri, Feb 12, 2021 at 11:02 AM Melanie Plageman
> > >  wrote:
> > > > I just attached the diff.
> > >
> > > Squashed into one patch for the cfbot to chew on, with a few minor
> > > adjustments to a few comments.
> >
> > I did some more minor tidying of comments and naming.  It's been on my
> > to-do-list to update some phase names after commit 3048898e, and while
> > doing that I couldn't resist the opportunity to change DONE to FREE,
> > which somehow hurts my brain less, and makes much more obvious sense
> > after the bugfix in CF #3031 that splits DONE into two separate
> > phases.  It also pairs obviously with ALLOCATE.  I include a copy of
> > that bugix here too as 0001, because I'll likely commit that first, so
> > I rebased the stack of patches that way.  0002 includes the renaming I
> > propose (master only).  Then 0003 is Melanie's patch, using the name
> > SCAN for the new match bit scan phase.  I've attached an updated
> > version of my "phase diagram" finger painting, to show how it looks
> > with these three patches.  "scan*" is new.
>
> Patches 0002, 0003 no longer apply to the master branch, seemingly
> because of subsequent changes to pgstat, so need rebasing.

I am changing the status to "Waiting on Author" as the patch does not
apply on Head.

Regards,
Vignesh




Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2021-07-10 Thread Tatsuo Ishii
I have played with the v14 patch. I previously complained that pgbench
always reported 9 errors (actually the number is always the number
specified by "-c" minus 1 in my case).

$ pgbench -p 11000 -c 10  -T 10  --max-tries=0 test
pgbench (15devel, server 13.3)
starting vacuum...end.
transaction type: 
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 10 s
number of transactions actually processed: 974
number of failed transactions: 9 (0.916%)
number of transactions retried: 651 (66.226%)
total number of retries: 8482
latency average = 101.317 ms (including failures)
initial connection time = 44.440 ms
tps = 97.796487 (without initial connection time)

To reduce the number of errors I provided "--max-tries=9000", because
pgbench had reported 8482 retries in total.

$ pgbench -p 11000 -c 10  -T 10 --max-tries=9000 test
pgbench (15devel, server 13.3)
starting vacuum...end.
transaction type: 
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 10 s
number of transactions actually processed: 1133
number of failed transactions: 9 (0.788%)
number of transactions retried: 755 (66.112%)
total number of retries: 9278
maximum number of tries: 9000
latency average = 88.570 ms (including failures)
initial connection time = 23.384 ms
tps = 112.015219 (without initial connection time)

Unfortunately this didn't work. Still 9 errors, because pgbench
terminated the last round of the run.

Then I gave up on using -T and switched to -t. The number of
transactions for the -t option was calculated from the total number of
transactions actually processed (1133) / number of clients (10) =
11.33. I rounded 11.33 up to 12, then multiplied by the number of
clients (10) and got 120. The result:

$ pgbench -p 11000 -c 10  -t 120 --max-tries=9000 test
pgbench (15devel, server 13.3)
starting vacuum...end.
transaction type: 
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
number of transactions per client: 120
number of transactions actually processed: 1200/1200
number of transactions retried: 675 (56.250%)
total number of retries: 8524
maximum number of tries: 9000
latency average = 93.777 ms
initial connection time = 14.120 ms
tps = 106.635908 (without initial connection time)

Finally I was able to get a result without any errors.  This is not a
super simple way to obtain pgbench results without errors, but
probably I can live with it.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [PoC] Improve dead tuple storage for lazy vacuum

2021-07-10 Thread Masahiko Sawada
On Sat, Jul 10, 2021 at 2:17 AM Andres Freund  wrote:
>
> Hi,
>
> On 2021-07-07 20:46:38 +0900, Masahiko Sawada wrote:
> > Currently, the TIDs of dead tuples are stored in an array that is
> > collectively allocated at the start of lazy vacuum and TID lookup uses
> > bsearch(). There are the following challenges and limitations:
>
> > So I prototyped a new data structure dedicated to storing dead tuples
> > during lazy vacuum while borrowing the idea from Roaring Bitmap[2].
> > The authors provide an implementation of Roaring Bitmap[3]  (Apache
> > 2.0 license). But I've implemented this idea from scratch because we
> > need to integrate it with Dynamic Shared Memory/Area to support
> > parallel vacuum and need to support ItemPointerData, 6-bytes integer
> > in total, whereas the implementation supports only 4-bytes integers.
> > Also, when it comes to vacuum, we neither need to compute the
> > intersection, the union, nor the difference between sets, but need
> > only an existence check.
> >
> > The data structure is somewhat similar to TIDBitmap. It consists of
> > the hash table and the container area; the hash table has entries per
> > block and each block entry allocates its memory space, called a
> > container, in the container area to store its offset numbers. The
> > container area is actually an array of bytes and can be enlarged as
> > needed. In the container area, the data representation of offset
> > numbers varies depending on their cardinality. It has three container
> > types: array, bitmap, and run.
>
> How are you thinking of implementing iteration efficiently for rtbm? The
> second heap pass needs that obviously... I think the only option would
> be to qsort the whole thing?

Yes, I'm thinking that the iteration of rtbm is somewhat similar to
tbm. That is, we iterate over and collect the hash table entries and
qsort them by block number. Then we fetch each entry along with its
container, one by one, in block number order.
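
As a rough sketch of that loop (the names are illustrative, not the actual
rtbm structures or API):

  #include <stdint.h>
  #include <stdlib.h>

  typedef uint32_t BlockNumber;	/* matches PostgreSQL's definition */

  /* one entry per heap block, collected from the hash table */
  typedef struct RTbmIterEntry
  {
      BlockNumber blkno;
      void       *container;	/* array, bitmap or run container */
  } RTbmIterEntry;

  static int
  rtbm_iter_entry_cmp(const void *a, const void *b)
  {
      BlockNumber ba = ((const RTbmIterEntry *) a)->blkno;
      BlockNumber bb = ((const RTbmIterEntry *) b)->blkno;

      return (ba > bb) - (ba < bb);
  }

  static void
  rtbm_sort_entries(RTbmIterEntry *entries, size_t nentries)
  {
      qsort(entries, nentries, sizeof(RTbmIterEntry), rtbm_iter_entry_cmp);
      /* ...then walk entries[] in block order, decoding each container */
  }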

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: EXPLAIN/EXPLAIN ANALYZE REFRESH MATERIALIZED VIEW

2021-07-10 Thread John Naylor
On Tue, Mar 16, 2021 at 8:13 AM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Tue, Mar 16, 2021 at 1:15 AM Tom Lane  wrote:

> > I don't really see that this feature buys us anything you can't
> > get by explaining the view's query, so I think we're better advised
> > to keep our options open about how REFRESH might be implemented
> > in future.
>
> That makes sense to me. Thanks for the comments. I'm fine to withdraw the
> patch.
>
> I would like to see if the 0001 patch(attaching here) will be useful
> at all. It just splits up the existing ExecRefreshMatView into a few
> functions to make the code readable. I'm okay to withdraw it if no one
> agrees.

Side note for future reference: While the feature named in the CF entry has
been rejected, the remaining 0001 patch currently proposed no longer
matches the title, or category. It is possible within the CF app, and
helpful, to rename the entry when the scope changes.

The proposed patch in the CF for incremental view maintenance [1] does some
refactoring of its own in implementing the feature. I don't think it makes
sense to commit a refactoring that conflicts with that, while not
necessarily making life easier for that feature. Incremental view
maintenance is highly desirable, so I don't want to put up unnecessary
roadblocks.

[1] https://commitfest.postgresql.org/33/2138/

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Support kerberos authentication for postgres_fdw

2021-07-10 Thread Michael Paquier
On Fri, Jul 09, 2021 at 10:13:20AM +0000, Peifeng Qiu wrote:
> I'd like to add kerberos authentication support for postgres_fdw by adding two
> options to user mapping: krb_client_keyfile and gssencmode.

You may want to register this patch into the next commit fest, to get
it reviewed for a potential integration in 15:
https://commitfest.postgresql.org/34/
--
Michael




Re: bugfix: when the blocksize is 32k, the function page_header of pageinspect returns negative numbers.

2021-07-10 Thread Michael Paquier
On Fri, Jul 09, 2021 at 05:26:37PM +0530, Bharath Rupireddy wrote:
> 1) How about just adding a comment /* support for old extension
> version */ before INT2OID handling?
> + case INT2OID:
> + values[3] = UInt16GetDatum(page->pd_lower);
> + break;

Yes, having a comment to document from which version this is done
would be nice.  This is more consistent with the surroundings.

> 2) Do we ever reach the error statement elog(ERROR, "incorrect output
> types");? We have the function either defined with smallint or int, I
> don't think so we ever reach it. Instead, an assertion would work as
> suggested by Micheal.

I would keep an elog() here for the default case.  I was referring to
the use of assertions if changing the code into a single switch/case,
with assertions checking that the other arguments have the expected
type.

> 3) Isn't this test case unstable when regression tests are run with a
> different BLCKSZ setting? Or is it okay since some of the other tests
> for pageinspect already output page_size and hash_page_stats?
> +SELECT pagesize, version FROM page_header(get_raw_page('test1', 0));
> + pagesize | version
> +--+-
> + 8192 |   4

I don't think it matters much, most of the tests of pageinspect
already rely on 8k pages.  So let's keep it as-is.

> 4) Can we arrange pageinspect--1.8--1.9.sql  into the first line itself?
> +DATA =  pageinspect--1.9--1.10.sql \
> + pageinspect--1.8--1.9.sql \

That's a nit, but why not.

> 5) How about using "int4" instead of just "int", just for readability?

Any way is fine, I'd stick with "int" as the other fields used
"smallint".

> 6) How about something like below instead of repetitive switch statements?
> static inline Datum
> get_page_header_attr(TupleDesc desc, int attno, int val)
> {
> Oid atttypid;
> Datum datum;

Nah.  It does not seem like an improvement to me in terms of
readability.

So I would finish with the attached, close enough to what Quan has
sent upthread. 
--
Michael
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index 2d330ddb28..5c0736564a 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -13,7 +13,7 @@ OBJS = \
 	rawpage.o
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.8--1.9.sql \
+DATA =  pageinspect--1.9--1.10.sql pageinspect--1.8--1.9.sql \
 	pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
 	pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
 	pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
diff --git a/contrib/pageinspect/expected/oldextversions.out b/contrib/pageinspect/expected/oldextversions.out
index 04dc7f8640..f5c4b61bd7 100644
--- a/contrib/pageinspect/expected/oldextversions.out
+++ b/contrib/pageinspect/expected/oldextversions.out
@@ -36,5 +36,21 @@ SELECT * FROM bt_page_items('test1_a_idx', 1);
   1 | (0,1) |  16 | f | f| 01 00 00 00 00 00 00 01 | f| (0,1) | 
 (1 row)
 
+-- page_header() uses int instead of smallint for lower, upper, special and
+-- pagesize in pageinspect >= 1.10.
+ALTER EXTENSION pageinspect UPDATE TO '1.9';
+\df page_header
+  List of functions
+ Schema |Name | Result data type | Argument data types | Type 
++-+--+-+--
+ public | page_header | record   | page bytea, OUT lsn pg_lsn, OUT checksum smallint, OUT flags smallint, OUT lower smallint, OUT upper smallint, OUT special smallint, OUT pagesize smallint, OUT version smallint, OUT prune_xid xid | func
+(1 row)
+
+SELECT pagesize, version FROM page_header(get_raw_page('test1', 0));
+ pagesize | version 
+--+-
+ 8192 |   4
+(1 row)
+
 DROP TABLE test1;
 DROP EXTENSION pageinspect;
diff --git a/contrib/pageinspect/pageinspect--1.9--1.10.sql b/contrib/pageinspect/pageinspect--1.9--1.10.sql
new file mode 100644
index 00..8dd9a27505
--- /dev/null
+++ b/contrib/pageinspect/pageinspect--1.9--1.10.sql
@@ -0,0 +1,21 @@
+/* contrib/pageinspect/pageinspect--1.9--1.10.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pageinspect UPDATE TO '1.10'" to load this file. \quit
+
+--
+-- page_header()
+--
+DROP FUNCTION page_header(IN page bytea);
+CREATE FUNCTION page_header(IN page bytea,
+OUT lsn pg_lsn,
+OUT checksum smallint,
+OUT flags smallint,
+OUT lower int,
+OUT upper int,
+OUT special int,
+OUT pagesize int,
+OUT version smallint,
+OUT prune_xid xid)
+AS 'MODULE_PATHNAME', 'page_header'

Re: pg_dump new feature: exporting functions only. Bad or good idea ?

2021-07-10 Thread Tomas Vondra

On 7/10/21 1:43 AM, Tom Lane wrote:

Tomas Vondra  writes:

The main question I have is whether this should include procedures.


I feel a bit uncomfortable about sticking this sort of limited-purpose
selectivity mechanism into pg_dump.  I'd rather see a general filter
method that can select object(s) of any type.  Pavel was doing some
work towards that awhile ago, though I think he got frustrated about
the lack of consensus on details.  Which is a problem, but I don't
think the solution is to accrue more and more independently-designed-
and-implemented features that each solve some subset of the big problem.



I'm not against introducing such a general filter mechanism, but why
should it block this patch? I'd understand it if the patch was adding a lot
of code, but that's not the case - it's tiny. And we already have
multiple filter options (to pick tables, schemas, extensions, ...).


And if there's no consensus on the details of Pavel's patch after multiple
commitfests, how likely is it that it'll start moving forward?



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Enhanced error message to include hint messages for redundant options error

2021-07-10 Thread Dean Rasheed
On Sat, 10 Jul 2021 at 11:44, Dean Rasheed  wrote:
>
> I'm inclined to think that it isn't worth the effort trying to
> distinguish between conflicting options, options specified more than
> once and faked-up options that weren't really specified. If we just
> make errorConflictingDefElem() report "conflicting or redundant
> options", then it's much easier to update calling code without making
> mistakes. The benefit then comes from the reduced code size and the
> fact that the patch includes pstate in more places, so the
> parser_errposition() indicator helps the user identify the problem.
>
> In file_fdw_validator(), where there is no pstate, it's already using
> "specified more than once" as a hint to clarify the "conflicting or
> redundant options" error, so I think we should leave that alone.
>

Another possibility would be to pass the option list to
errorConflictingDefElem() and it could work out whether or not to
include the "option \"%s\" specified more than once" hint, since that
hint probably is useful, and working out whether to include it is
probably less error-prone if it's done there.
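
A hypothetical sketch of that shape (not the actual patch, just to
illustrate the idea):

  void
  errorConflictingDefElem(DefElem *defel, ParseState *pstate, List *options)
  {
      int         nmatches = 0;
      ListCell   *lc;

      /* count how often this option name really appears in the list */
      foreach(lc, options)
      {
          DefElem    *other = (DefElem *) lfirst(lc);

          if (strcmp(other->defname, defel->defname) == 0)
              nmatches++;
      }

      ereport(ERROR,
              (errcode(ERRCODE_SYNTAX_ERROR),
               errmsg("conflicting or redundant options"),
               nmatches > 1 ?
               errdetail("Option \"%s\" is specified more than once.",
                         defel->defname) : 0,
               parser_errposition(pstate, defel->location)));
  }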

Regards,
Dean




Re: Enhanced error message to include hint messages for redundant options error

2021-07-10 Thread Dean Rasheed
On Thu, 8 Jul 2021 at 14:40, vignesh C  wrote:
>
> On Thu, Jul 8, 2021 at 1:52 AM Daniel Gustafsson  wrote:
> >
> > I sort of like the visual cue of seeing ereport(ERROR ..  since it makes it
> > clear it will break execution then and there, this will require a lookup for
> > anyone who don't know the function by heart.  That being said, reducing
> > duplicated boilerplate has clear value and this reduce the risk of introducing
> > strings which are complicated to translate.  On the whole I think this is a net
> > win, and the patch looks pretty good.
> >

Bikeshedding the function name, there are several similar examples in
the existing code, but the closest analogs are probably
errorMissingColumn() and errorMissingRTE(). So I think
errorConflictingDefElem() would be better, since it's slightly more
obviously an error.

Also, I don't think this function should be marked inline -- using a
normal function ought to help make the compiled code smaller.

A bigger problem is that the patch replaces about 100 instances of the
error "conflicting or redundant options" with "option \"%s\" specified
more than once", but that's not always the appropriate thing to do.
For example, in the walsender code, the error isn't necessarily due to
the option being specified more than once.

Also, there are cases where def->defname isn't actually the name of
the option specified, so including it in the error is misleading. For
example:

CREATE OR REPLACE FUNCTION foo() RETURNS int
AS $$ SELECT 1 $$ STABLE IMMUTABLE;

ERROR:  option "volatility" specified more than once
LINE 2: AS $$ SELECT 1 $$ STABLE IMMUTABLE;
 ^

and in this case "volatility" is an internal string, so it won't get translated.

I'm inclined to think that it isn't worth the effort trying to
distinguish between conflicting options, options specified more than
once and faked-up options that weren't really specified. If we just
make errorConflictingDefElem() report "conflicting or redundant
options", then it's much easier to update calling code without making
mistakes. The benefit then comes from the reduced code size and the
fact that the patch includes pstate in more places, so the
parser_errposition() indicator helps the user identify the problem.

In file_fdw_validator(), where there is no pstate, it's already using
"specified more than once" as a hint to clarify the "conflicting or
redundant options" error, so I think we should leave that alone.

Regards,
Dean




Re: pgbench logging broken by time logic changes

2021-07-10 Thread Fabien COELHO


Hello again,


I hoped we were done here but I realised that your check for 1-3 log
lines will not survive the harsh environment of the build farm.
Adding sleep(2) before the final doLog() confirms that.  I had two
ideas:



So I think we should do 1 for now.  Objections or better ideas?


At least, we now know that it is too much.


I misread your point. You think that it would fail, but it has not been
tried yet. I'm rather optimistic that it should not fail, but I'm okay
with averting the risk anyway.


What about moving the test as is into the TODO section with a comment,
next to the other one, for now?


I stand by this solution, which should allow us to get some data from the
field; see v18 attached. If all is green then the TODO could be removed
later.


--
Fabien.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4aeccd93af..a54958930b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -343,6 +343,12 @@ typedef struct StatsData
 	SimpleStats lag;
 } StatsData;
 
+/*
+ * For displaying Unix epoch timestamps, as some time functions may have
+ * another reference.
+ */
+pg_time_usec_t epoch_shift;
+
 /*
  * Struct to keep random state.
  */
@@ -3771,7 +3777,11 @@ executeMetaCommand(CState *st, pg_time_usec_t *now)
 /*
  * Print log entry after completing one transaction.
  *
- * We print Unix-epoch timestamps in the log, so that entries can be
+ * The function behavior changes depending on sample_rate (a fraction of
+ * transactions is reported) and agg_interval (transactions are aggregated
+ * and reported once every agg_interval seconds).
+ *
+ * We use Unix-epoch timestamps in the log, so that entries can be
  * correlated against other logs.  On some platforms this could be obtained
  * from the caller, but rather than get entangled with that, we just eat
  * the cost of an extra syscall in all cases.
@@ -3781,7 +3791,7 @@ doLog(TState *thread, CState *st,
 	  StatsData *agg, bool skipped, double latency, double lag)
 {
 	FILE	   *logfile = thread->logfile;
-	pg_time_usec_t now = pg_time_now();
+	pg_time_usec_t now = pg_time_now() + epoch_shift;
 
 	Assert(use_log);
 
@@ -3796,17 +3806,19 @@ doLog(TState *thread, CState *st,
 	/* should we aggregate the results or not? */
 	if (agg_interval > 0)
 	{
+		pg_time_usec_t next;
+
 		/*
 		 * Loop until we reach the interval of the current moment, and print
 		 * any empty intervals in between (this may happen with very low tps,
 		 * e.g. --rate=0.1).
 		 */
 
-		while (agg->start_time + agg_interval <= now)
+		while ((next = agg->start_time + agg_interval * INT64CONST(1000000)) <= now)
 		{
 			/* print aggregated report to logfile */
 			fprintf(logfile, INT64_FORMAT " " INT64_FORMAT " %.0f %.0f %.0f %.0f",
-	agg->start_time,
+	agg->start_time / 1000000,	/* seconds since Unix epoch */
 	agg->cnt,
 	agg->latency.sum,
 	agg->latency.sum2,
@@ -3825,7 +3837,7 @@ doLog(TState *thread, CState *st,
 			fputc('\n', logfile);
 
 			/* reset data and move to next interval */
-			initStats(agg, agg->start_time + agg_interval);
+			initStats(agg, next);
 		}
 
 		/* accumulate the current transaction */
@@ -5458,7 +5470,8 @@ printProgressReport(TState *threads, int64 test_start, pg_time_usec_t now,
 
 	if (progress_timestamp)
 	{
-		snprintf(tbuf, sizeof(tbuf), "%.3f s", PG_TIME_GET_DOUBLE(now));
+		snprintf(tbuf, sizeof(tbuf), "%.3f s",
+ PG_TIME_GET_DOUBLE(now + epoch_shift));
 	}
 	else
 	{
@@ -5808,6 +5821,15 @@ main(int argc, char **argv)
 	char	   *env;
 
 	int			exit_code = 0;
+	struct timeval tv;
+
+	/*
+	 * Record difference between Unix epoch and high resolution timer's epoch.
+	 * We'll use this for logging and aggregation with Unix epoch-based
+	 * buckets.
+	 */
+	gettimeofday(&tv, NULL);
+	epoch_shift = tv.tv_sec * INT64CONST(1000000) + tv.tv_usec - pg_time_now();
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -6637,7 +6659,14 @@ threadRun(void *arg)
 	thread->bench_start = start;
 	thread->throttle_trigger = start;
 
-	initStats(&aggs, start);
+	/*
+	 * The log format currently has Unix epoch timestamps with whole numbers
+	 * of seconds.  Round the first aggregate's start time down to the nearest
+	 * Unix epoch second (the very first aggregate might really have started a
+	 * fraction of a second later, but later aggregates are measured from the
+	 * whole number time that is actually logged).
+	 */
+	initStats(&aggs, (start + epoch_shift) / 1000000 * 1000000);
 	last = aggs;
 
 	/* loop till all clients have terminated */
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 3aa9d5d753..d674cc59a5 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -8,6 +8,7 @@ use PostgresNode;
 use TestLib;
 use Test::More;
 use Config;
+use Time::HiRes qw(time);
 
 # start a pgbench specific server
 my $node = get_new_node('main');
@@ -54,12 +55,14 @@ sub pgbench
 
 	push @cmd, 

Re: SQL/JSON: JSON_TABLE

2021-07-10 Thread Erik Rijkers

On 5/18/21 9:23 PM, Andrew Dunstan wrote:


On 5/8/21 2:23 PM, Andrew Dunstan wrote:

On 4/12/21 11:34 AM, Erik Rijkers wrote:

On 2021.03.27. 02:12 Nikita Glukhov  wrote:
Attached 47th version of the patches.

We're past feature freeze for 14 and alas, JSON_TABLE has not made it.

I have tested it quite a bit and, because I didn't find any trouble with
functionality or speed, I wanted to at least mention that here once.

I looked at v47, these files

[0001-SQL-JSON-functions-v47.patch]
[0002-JSON_TABLE-v47.patch]
[0003-JSON_TABLE-PLAN-DEFAULT-clause-v47.patch]
[0004-JSON_TABLE-PLAN-clause-v47.patch]
[manual_addition_fixed.patch]  # for this see [1], [2]

(v47 doesn't apply anymore, as cfbot shows, but instances can still be
built on top of 6131ffc43ff from 30 March 2021)

I hope it will fare better next round, version 15.


Me too. Here's a set that should remove the bitrot.


Rebased for removal of serial schedule


Can one of you please add or integrate this patch into the JSON_TABLE changes?


It contains the fix for a bug that I reported earlier (on 2021-03-30, see
[1]). Nikita diagnosed it and produced this fix, but today I noticed it was
still not included in the latest version, v49.



Thanks,

Erik Rijkers


[1] 
https://www.postgresql.org/message-id/2101814418.20240.1617123418368%40webmailclassic.xs4all.nl

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

--- src/backend/jit/llvm/llvmjit_types.c.orig	2021-03-30 23:05:01.974817622 +0200
+++ src/backend/jit/llvm/llvmjit_types.c	2021-03-30 23:06:39.080369545 +0200
@@ -130,6 +130,9 @@
 	ExecEvalSysVar,
 	ExecEvalWholeRowVar,
 	ExecEvalXmlExpr,
+	ExecEvalJsonConstructor,
+	ExecEvalJsonIsPredicate,
+	ExecEvalJson,
 	MakeExpandedObjectReadOnlyInternal,
 	slot_getmissingattrs,
 	slot_getsomeattrs_int,


Re: pgbench logging broken by time logic changes

2021-07-10 Thread Fabien COELHO




Works for me: patch applies, global and local check ok. I'm fine with it.


I hoped we were done here but I realised that your check for 1-3 log
lines will not survive the harsh environment of the build farm.
Adding sleep(2) before the final doLog() confirms that.  I had two
ideas:

1.  Give up and expect 1-180 lines.  (180s is the current timeout
tolerance used elsewhere to deal with
swamped/valgrind/swapping/microsd computers, after a few rounds of
inflation, so if you need an arbitrary large number to avoid buildfarm
measles that's my suggestion)
2.  Change doLog() to give up after end_time.  But then I think you'd
need to make it 0-3 :-(

I think the logging could be tightened up to work the way you expected
in future work, though.  Perhaps we could change all logging to work
with transaction start time instead of log-writing time, which doLog()
should receive.  If you never start a transaction after end_time, then
there can never be an aggregate that begins after that, and the whole
thing becomes more deterministic.  That kind of alignment of aggregate
timing with whole-run timing could also get rid of those partial
aggregates completely.  But that's an idea for 15.

So I think we should do 1 for now.  Objections or better ideas?
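
To make the two options concrete, here is a rough C sketch (illustrative only,
not a proposed patch): option 2's early return plus the transaction-start-time
idea. doLog(), end_time and epoch_shift are the existing pgbench.c pieces
discussed above; the txn_start parameter is hypothetical and would have to be
supplied by the caller.

static void
doLog(TState *thread, CState *st, StatsData *agg,
      bool skipped, double latency, double lag,
      pg_time_usec_t txn_start)    /* hypothetical extra argument */
{
    /* Option 2: give up once the run's end time has passed. */
    if (end_time > 0 && txn_start >= end_time)
        return;

    /*
     * Timestamp log lines and aggregates with the transaction's start time
     * rather than the log-writing time, so no aggregate can begin after
     * end_time and the number of logged lines becomes deterministic.
     */
    pg_time_usec_t now = txn_start + epoch_shift;

    /* ... rest of the existing doLog() body, using "now" as before ... */
}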


At least, we know that it is too much.

What about moving the test, as is, into the TODO section with a comment, next
to the other one, for now?


I hesitated to suggest that before because of the above risks, but I was very
naively optimistic that it might pass because the test is not all that
demanding.


Someone suggested having a "real-time" configure switch to enable/disable
time-sensitive tests.


--
Fabien.




Re: when the startup process doesn't (logging startup delays)

2021-07-10 Thread Bharath Rupireddy
On Mon, Jun 21, 2021 at 12:06 PM Nitin Jadhav
 wrote:
> That was by mistake and I have corrected it in the attached patch.

Thanks for the patch. Here are some comments. Please ignore if I
repeat any of the comments given previously, as I didn't look at the
entire thread.

1) Add a new line between the function return type and the function name:
+void CloseStartupProgress(StartupProcessOp operation)
+{
Like below:
+void
+CloseStartupProgress(StartupProcessOp operation)
+{

2) Add an entry in the commit fest, if it's not done yet. That way,
the patch gets tested on many platforms.

3) Replace "zero" with the number "0".
+ # -1 disables the feature, zero logs all

4) I think we can just rename InitStartupProgress to
EnableStartupProgress or EnableStartupOpProgress to be more in sync
with what it does.
+/*
+ * Sets the start timestamp of the current operation and also enables the
+ * timeout for logging the progress of startup process.
+ */
+void
+InitStartupProgress(void)
+{

5) Do we need the GetCurrentOperationStartTimestamp function at all?
It doesn't do much; you might have modelled it on GetCurrentTimestamp,
which does enough work to justify being a function. Can't we just use
the operationStartTimestamp variable? Can we rename
operationStartTimestamp (I don't think we need to spell out Timestamp
in a variable name) to startup_op_start_time or some other better name?
+/*
+ * Fetches the start timestamp of the current operation.
+ */
+TimestampTz
+GetCurrentOperationStartTimestamp(void)
+{

6) I think you can transform below
+ /* If the feature is disabled, then no need to proceed further. */
+ if (log_startup_progress_interval < 0)
+ return;
to
+ /* If the feature is disabled, then no need to proceed further. */
+ if (log_startup_progress_interval == -1)
+ return;
as -1 means feature disabled and values < -1 are not allowed to be set at all.

7) Isn't RECOVERY_IN_PROGRESS supposed to be REDO_IN_PROGRESS? Because
"recovery in progress" generally applies to the entire startup process,
right? Put another way, the startup process as a whole does the entire
recovery, and you have RECOVERY_IN_PROGRESS only in the redo phase of
that startup process.

8) Why do we need to call get_startup_process_operation_string here?
Can't you directly use the error message text?
if (operation == RECOVERY_IN_PROGRESS)
ereport(LOG,
(errmsg("%s, elapsed time: %ld.%03d ms, current LSN: %X/%X",
get_startup_process_operation_string(operation),

9) Do you need error handling in the default case of
get_startup_process_operation_string? Instead, how about an assertion,
Assert(operation >= SYNCFS_IN_PROGRESS && operation <=
RESET_UNLOGGED_REL_END);?
default:
ereport(ERROR,
(errmsg("unrecognized operation (%d) in startup
progress update",
operation)));
10) I personally didn't like the name
get_startup_process_operation_string. How about get_startup_op_string?

11) As pointed out by Robert, the error log message should start with
small letters.
"syncing data directory (syncfs)");
"syncing data directory (fsync)");
"performing crash recovery");
"resetting unlogged relations");
In general, error log messages should start with a lowercase letter and
not end with ".". The detail/hint messages should start with a capital
letter and end with "."
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("only B-Tree indexes are supported as targets
for verification"),
 errdetail("Relation \"%s\" is not a B-Tree index.",
   RelationGetRelationName(rel))));
 ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 errmsg("sslcert and sslkey are superuser-only"),
 errhint("User mappings with the sslcert or
sslkey options set may only be created or modified by the
superuser")));

12) How about starting SYNCFS_IN_PROGRESS = 1, and leaving 0 for some
unknown value?
typedef enum StartupProcessOp
{
/* Codes for in-progress operations */
SYNCFS_IN_PROGRESS = 1,
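
For example, a sketch (the 0-valued sentinel is hypothetical, and members
other than those mentioned in this review are elided):

typedef enum StartupProcessOp
{
    STARTUP_OP_NONE = 0,        /* hypothetical "no operation" sentinel */

    /* Codes for in-progress operations */
    SYNCFS_IN_PROGRESS = 1,
    RECOVERY_IN_PROGRESS,
    /* ... other in-progress and end codes from the patch ... */
    RESET_UNLOGGED_REL_END
} StartupProcessOp;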

13) Can we combine LogStartupProgress and CloseStartupProgress?  Let's
have a single function LogStartupProgress(StartupProcessOp operation,
const char *path, bool start);, and differentiate the functionality
with the start flag.
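
A rough sketch of what that could look like (illustrative only, not tested;
get_startup_op_string is the rename suggested in 10, and
startup_progress_timeout_arm is a hypothetical helper for re-arming the
timeout; the message text is just an example):

void
LogStartupProgress(StartupProcessOp operation, const char *path, bool start)
{
    /* feature disabled */
    if (log_startup_progress_interval < 0)
        return;

    if (start)
    {
        /* what InitStartupProgress() does today: remember the start time
         * and arm the progress-report timeout */
        operationStartTimestamp = GetCurrentTimestamp();
        startup_progress_timeout_arm();     /* hypothetical helper */
    }
    else
    {
        /* what CloseStartupProgress() does today: log the total elapsed
         * time of the finished operation */
        long    secs;
        int     usecs;

        TimestampDifference(operationStartTimestamp, GetCurrentTimestamp(),
                            &secs, &usecs);
        ereport(LOG,
                (errmsg("%s done after %ld.%03d s%s%s",
                        get_startup_op_string(operation),
                        secs, usecs / 1000,
                        path ? ": " : "", path ? path : "")));
    }
}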

14) Can we move log_startup_progress_interval declaration from guc.h
and guc.c to xlog.h and xlog.c? Because it makes more sense to be
there, similar to the other GUCs under /* these variables are GUC
parameters related to XLOG */ in xlog.h.

Regards,
Bharath Rupireddy.




Re: Grammar railroad diagram

2021-07-10 Thread Corey Huinker
>
>
> Another way that I tested and it's working is to use
> https://www.bottlecaps.de/convert/ paste the postgresql grammar there
> and press "convert" and after press "view diagram".
>

I tried this out and I'm pleased to see that one of the outputs is XHTML +
SVG, because SVGs have hover-over tooltips, which are an important aspect
of accessibility; that was my major concern the last time a similar thing
was proposed [1].

[1]
https://www.postgresql.org/message-id/cah2-wzmfc+p3pc_u1dsgm3lawurzkx5pqzmxtglgsxbf8gf...@mail.gmail.com