Re: Supporting = operator in gin/gist_trgm_ops

2020-11-14 Thread Erik Rijkers

On 2020-11-14 06:30, Alexander Korotkov wrote:


[v4-0001-Handle-equality...in-contrib-pg_trgm.patch (~]

I'm going to push this if no objections.



About the sgml, in doc/src/sgml/pgtrgm.sgml :


Beginning in PostgreSQL 14, these indexes 
also support equality operator (simple comparison operators are not 
supported).


should be:

Beginning in PostgreSQL 14, these indexes 
also support the equality operator (simple comparison operators are not 
supported).


(added 'the')


And:

Although these indexes might have lower the performance of equality 
operator search than regular B-tree indexes.

should be (I think - please check the meaning)

Although these indexes might have a lower performance with equality 
operator search than with regular B-tree indexes.


I am not sure I understood this last sentence correctly. Does this mean 
the slower trgm index might be chosen over the faster btree?



Thanks,

Erik Rijkers





Re: Allow an alias to be attached directly to a JOIN ... USING

2020-11-14 Thread Peter Eisentraut

On 2020-11-10 16:15, Georgios Kokolatos wrote:

I noticed that this patch fails on the cfbot.
For this, I changed the status to: 'Waiting on Author'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author


Here is a rebased and lightly retouched patch.

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/
From 5b736c0033d4e41a4ca186101f892fbd00ff3528 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 14 Nov 2020 09:25:33 +0100
Subject: [PATCH v4] Allow an alias to be attached to a JOIN ... USING

This allows something like

SELECT ... FROM t1 JOIN t2 USING (a, b, c) AS x

where x has the columns a, b, c and unlike a regular alias it does not
hide the range variables of the tables being joined t1 and t2.

Per SQL:2016 feature F404 "Range variable for common column names".

Discussion: 
https://www.postgresql.org/message-id/flat/454638cf-d563-ab76-a585-256442806...@2ndquadrant.com
---
 doc/src/sgml/ref/select.sgml  | 11 -
 src/backend/catalog/sql_features.txt  |  2 +-
 src/backend/nodes/copyfuncs.c |  2 +
 src/backend/nodes/equalfuncs.c|  2 +
 src/backend/nodes/outfuncs.c  |  2 +
 src/backend/nodes/readfuncs.c |  2 +
 src/backend/parser/analyze.c  |  1 +
 src/backend/parser/gram.y | 48 +
 src/backend/parser/parse_clause.c | 21 -
 src/backend/parser/parse_relation.c   | 45 +++-
 src/backend/utils/adt/ruleutils.c |  4 ++
 src/include/nodes/parsenodes.h|  7 +++
 src/include/nodes/primnodes.h |  1 +
 src/include/parser/parse_node.h   |  1 +
 src/include/parser/parse_relation.h   |  1 +
 src/test/regress/expected/create_view.out | 52 ++-
 src/test/regress/expected/join.out| 31 ++
 src/test/regress/sql/create_view.sql  | 11 +
 src/test/regress/sql/join.sql |  8 
 19 files changed, 239 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 472b7cae81..c6e449d71e 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -59,7 +59,7 @@
 [ LATERAL ] function_name ( [ 
argument [, ...] ] ) AS ( 
column_definition [, ...] )
 [ LATERAL ] ROWS FROM( function_name ( [ argument [, ...] ] ) [ AS ( column_definition [, ...] ) ] [, ...] )
 [ WITH ORDINALITY ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
-from_item [ NATURAL ] 
join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]
+from_item [ NATURAL ] 
join_type from_item [ ON join_condition | USING ( join_column [, ...] ) [ AS join_using_alias ] ]
 
 and grouping_element can 
be one of:
 
@@ -633,6 +633,15 @@ FROM Clause
 equivalent columns will be included in the join output, not
 both.

+
+   
+If a join_using_alias is
+specified, it gives a correlation name to the join columns.  Only the
+join columns in the USING clause are addressable by
+this name.  Unlike an alias, this does not hide the names of
+the joined tables from the rest of the query.
+   
   
  
 
diff --git a/src/backend/catalog/sql_features.txt 
b/src/backend/catalog/sql_features.txt
index b6e58e8493..69c24e7369 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -264,7 +264,7 @@ F401Extended joined table   02  FULL OUTER JOIN 
YES
 F401   Extended joined table   04  CROSS JOIN  YES 
 F402   Named column joins for LOBs, arrays, and multisets  
YES 
 F403   Partitioned joined tables   NO  
-F404   Range variable for common column names  NO  
+F404   Range variable for common column names  YES 
 F411   Time zone specification YES differences regarding 
literal interpretation
 F421   National character  YES 
 F431   Read-only scrollable cursorsYES 
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 3031c52991..3582f48084 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2206,6 +2206,7 @@ _copyJoinExpr(const JoinExpr *from)
COPY_NODE_FIELD(rarg);
COPY_NODE_FIELD(usingClause);
COPY_NODE_FIELD(quals);
+   COPY_NODE_FIELD(join_using_alias);
COPY_NODE_FIELD(alias);
COPY_SCALAR_FIELD(rtindex);
 
@@ -2415,6 +2416,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
COPY_NODE_FIELD(joinaliasvars);
COPY_NODE_FIELD(joinleftcols);
COPY_NODE_FIELD(joinrightcols);
+   COPY_NODE_FIELD(join_using_alias);
COPY_NODE_FIELD(functions);
COPY_SCALAR_FIELD(funcordinality);
COPY_NODE_FIELD(tablefunc);
diff --git a/src/backend/nodes/equalfuncs

Re: Allow an alias to be attached directly to a JOIN ... USING

2020-11-14 Thread Peter Eisentraut

On 2020-08-03 19:44, Wolfgang Walther wrote:

So given this:

SELECT x.id FROM a LEFT JOIN b USING (id) AS x

will this return NULL or a.id for rows that don't match in b? This
should definitely be mentioned in the docs and I guess a test wouldn't
be too bad as well?


This issue is independent of the presence of the alias "x", so I don't 
think it has anything to do with this patch.


There is a fair amount of documentation on outer joins, so I expect that 
this is discussed there.
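For what it's worth, a small example makes the answer concrete (a hedged 
sketch: it assumes the patch is applied, and the table names are 
illustrative).  In a LEFT JOIN the merged USING column takes the value 
from the left-hand side, so x.id follows a.id rather than being NULL for 
rows with no match in b:

```sql
-- Hypothetical tables; x names only the merged USING column.
CREATE TABLE a (id int);
CREATE TABLE b (id int);
INSERT INTO a VALUES (1), (2);
INSERT INTO b VALUES (1);

SELECT x.id, a.id AS a_id, b.id AS b_id
FROM a LEFT JOIN b USING (id) AS x;
-- the row for id=2 should read (2, 2, NULL): x.id follows a.id
```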


--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/




More time spending with "delete pending"

2020-11-14 Thread Alexander Lakhin
Hello hackers,

After fixing bug #16161 (pg_ctl inability to open just deleted
postmaster.pid) there are still some errors related to the same
Windows-only issue.
Namely, pg_upgradeCheck failures seen on
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=fairywren&br=REL_13_STABLE
Here pg_ls_dir_files couldn't handle the "delete pending" state either.

I've made a simple test to reproduce this behavior with pg_ls_waldir,
that fails with:
error running SQL: 'psql::1: ERROR:  could not stat file
"pg_wal/dummy": Permission denied'

As noted in [1], a sensible solution would be putting the same "retry on
ERROR_ACCESS_DENIED" action in a wrapper for stat().
And bed90759f brought in master the _pgstat64() function, where such
error handling should be placed.
So please look at the patch (based on the previous one made to fix
bug#16161), that makes the attached test pass.

And I have a few questions.
For now genfile.c needs to include "port.h" explicitly to override
definitions in "sys/stat.h", which in turn override the "stat" redefinition
for Windows included implicitly above via "postgres.h". Without it, the
pg_ls_dir_files() function just uses the system's stat().
So I wonder, is there a way to make all stat-calling code use the
improved stat()?
And if so, maybe the stat() call in pgwin32_open should be replaced with
microsoft_native_stat().

And regarding released versions, are there any reasons to not backport
bed90759f (with the proposed fix or alike)?

[1] https://www.postgresql.org/message-id/396837.1605298573%40sss.pgh.pa.us

Best regards,
Alexander



001_delete_pending.pl
Description: Perl program
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..922df49125 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -28,6 +28,7 @@
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "port.h"
 #include "postmaster/syslogger.h"
 #include "storage/fd.h"
 #include "utils/acl.h"
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 4351aa4d08..c2b55c7fa6 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -170,6 +170,7 @@ _pgstat64(const char *name, struct stat *buf)
 	SECURITY_ATTRIBUTES sa;
 	HANDLE		hFile;
 	int			ret;
+	int			loops = 0;
 #if _WIN32_WINNT < 0x0600
 	IO_STATUS_BLOCK ioStatus;
 	FILE_STANDARD_INFORMATION standardInfo;
@@ -182,31 +183,60 @@ _pgstat64(const char *name, struct stat *buf)
 		errno = EINVAL;
 		return -1;
 	}
-
 	/* fast not-exists check */
 	if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
 	{
-		_dosmaperr(GetLastError());
-		return -1;
+		DWORD		err = GetLastError();
+
+		if (err != ERROR_ACCESS_DENIED) {
+			_dosmaperr(err);
+			return -1;
+		}
 	}
 
 	/* get a file handle as lightweight as we can */
 	sa.nLength = sizeof(SECURITY_ATTRIBUTES);
 	sa.bInheritHandle = TRUE;
 	sa.lpSecurityDescriptor = NULL;
-	hFile = CreateFile(name,
-	   GENERIC_READ,
-	   (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
-	   &sa,
-	   OPEN_EXISTING,
-	   (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
-		FILE_FLAG_OVERLAPPED),
-	   NULL);
-	if (hFile == INVALID_HANDLE_VALUE)
+	while ((hFile = CreateFile(name,
+			   GENERIC_READ,
+			   (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
+			   &sa,
+			   OPEN_EXISTING,
+			   (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
+			   FILE_FLAG_OVERLAPPED),
+			   NULL)) == INVALID_HANDLE_VALUE)
 	{
 		DWORD		err = GetLastError();
 
-		CloseHandle(hFile);
+		/*
+		 * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
+		 * gone (Windows NT status code is STATUS_DELETE_PENDING).  In that
+		 * case we want to wait a bit and try again, giving up after 1 second
+		 * (since this condition should never persist very long).  However,
+		 * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
+		 * so care is needed.  In particular that happens if we try to open a
+		 * directory, or of course if there's an actual file-permissions
+		 * problem.  To distinguish these cases, try a stat().  In the
+		 * delete-pending case, it will either also get STATUS_DELETE_PENDING,
+		 * or it will see the file as gone and fail with ENOENT.  In other
+		 * cases it will usually succeed.  The only somewhat-likely case where
+		 * this coding will uselessly wait is if there's a permissions problem
+		 * with a containing directory, which we hope will never happen in any
+		 * performance-critical code paths.
+		 */
+		if (err == ERROR_ACCESS_DENIED)
+		{
+			struct microsoft_native_stat st;
+
+			if (microsoft_native_stat(name, &st) != 0)
+			{
+pg_usleep(10);
+loops++;
+continue;
+			}
+		}
+
 		_dosmaperr(err);
 		return -1;
 	}


Re: Supporting = operator in gin/gist_trgm_ops

2020-11-14 Thread Alexander Korotkov
Hi, Erik!

On Sat, Nov 14, 2020 at 11:37 AM Erik Rijkers  wrote:
> On 2020-11-14 06:30, Alexander Korotkov wrote:
>
> > [v4-0001-Handle-equality...in-contrib-pg_trgm.patch (~]
> >
> > I'm going to push this if no objections.
> >
>
> About the sgml, in doc/src/sgml/pgtrgm.sgml :
>
>
> Beginning in PostgreSQL 14, these indexes
> also support equality operator (simple comparison operators are not
> supported).
>
> should be:
>
> Beginning in PostgreSQL 14, these indexes
> also support the equality operator (simple comparison operators are not
> supported).
>
> (added 'the')
>
>
> And:
>
> Although these indexes might have lower the performance of equality
> operator
> search than regular B-tree indexes.
>
> should be (I think - please check the meaning)
>
> Although these indexes might have a lower performance with equality
> operator
> search than with regular B-tree indexes.
>
>
> I am not sure I understood this last sentence correctly. Does this mean
> the slower trgm index might be chosen over the faster btree?

Thank you for your review.

I mean that searching for an equal string with pg_trgm GiST/GIN might be
slower than the same search with a B-tree.  If you need both pg_trgm
similarity/pattern search and equality search on your column, you have
a choice.  You can run with a single pg_trgm index, but your search for
an equal string wouldn't be as fast as with a B-tree.  Alternatively you
can have two indexes: a pg_trgm index for similarity/pattern search and
a B-tree index for equality search.  The second option gives you a fast
equality search, but the second B-tree index would take extra space
and maintenance overhead.  For equality search, the B-tree index
should be chosen by the planner (and that was tested).
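For readers following along, the two options can be sketched like this
(a hedged illustration: names are made up, and equality support in
gin_trgm_ops assumes the patch under discussion):

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE t (s text);

-- Option 1: a single pg_trgm index serves similarity/pattern search
-- and (with this patch) equality, at some cost for equality lookups.
CREATE INDEX t_s_trgm ON t USING gin (s gin_trgm_ops);

-- Option 2: add a B-tree for fast equality, trading extra space and
-- maintenance overhead.
CREATE INDEX t_s_btree ON t (s);

SELECT * FROM t WHERE s % 'foo';   -- served by the trigram index
SELECT * FROM t WHERE s = 'foo';   -- planner should prefer the B-tree
```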

--
Regards,
Alexander Korotkov




Re: Supporting = operator in gin/gist_trgm_ops

2020-11-14 Thread Julien Rouhaud
On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov  wrote:
>
> Hi, Erik!
>
> On Sat, Nov 14, 2020 at 11:37 AM Erik Rijkers  wrote:
> > On 2020-11-14 06:30, Alexander Korotkov wrote:
> >
> > > [v4-0001-Handle-equality...in-contrib-pg_trgm.patch (~]
> > >
> > > I'm going to push this if no objections.
> > >
> >
> > About the sgml, in doc/src/sgml/pgtrgm.sgml :
> >
> >
> > Beginning in PostgreSQL 14, these indexes
> > also support equality operator (simple comparison operators are not
> > supported).
> >
> > should be:
> >
> > Beginning in PostgreSQL 14, these indexes
> > also support the equality operator (simple comparison operators are not
> > supported).
> >
> > (added 'the')
> >
> >
> > And:
> >
> > Although these indexes might have lower the performance of equality
> > operator
> > search than regular B-tree indexes.
> >
> > should be (I think - please check the meaning)
> >
> > Although these indexes might have a lower performance with equality
> > operator
> > search than with regular B-tree indexes.
> >
> >
> > I am not sure I understood this last sentence correctly. Does this mean
> > the slower trgm index might be chosen over the faster btree?
>
> Thank you for your review.
>
> I mean that searching for an equal string with pg_trgm GiST/GIN might be
> slower than the same search with a B-tree.  If you need both pg_trgm
> similarity/pattern search and equality search on your column, you have
> a choice.  You can run with a single pg_trgm index, but your search for
> an equal string wouldn't be as fast as with a B-tree.  Alternatively you
> can have two indexes: a pg_trgm index for similarity/pattern search and
> a B-tree index for equality search.  The second option gives you a fast
> equality search, but the second B-tree index would take extra space
> and maintenance overhead.  For equality search, the B-tree index
> should be chosen by the planner (and that was tested).

Thanks everyone for the review, and thanks Alexander for the modifications!

I agree that it's important to document that those indexes may be less
performant than btree indexes.   I also agree that there's an
extraneous "the" in the documentation.  Maybe this rewrite could be
better?

   Note that those indexes may not be as afficient as regulat B-tree indexes
   for equality operator.

While at it, there's a small grammar error:

case EqualStrategyNumber:
-   /* Wildcard search is inexact */
+   /* Wildcard and equal search is inexact */

It should be /* Wildcard and equal search are inexact */




Re: Supporting = operator in gin/gist_trgm_ops

2020-11-14 Thread Erik Rijkers

On 2020-11-14 12:53, Julien Rouhaud wrote:
On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov wrote:


   Note that those indexes may not be as afficient as regulat B-tree indexes
   for equality operator.



'afficient as regulat'  should be
'efficient as regular'


Sorry to be nitpicking - it's the one thing I'm really good at :P

Erik




Re: pgbench: option delaying queries till connections establishment?

2020-11-14 Thread Marina Polyakova

Hello!

On 2020-11-13 08:44, kuroda.hay...@fujitsu.com wrote:

Dear Fabien,

and this will wait till its time comes. In the mean time, I think that you
should put the patch status as you see fit, independently of the other
patch: it seems unlikely that they would be committed together, and I'll
have to merge the remaining one anyway.


OK. I found the related thread[1], and I understood you will submit
another patch
on the thread.

PostgreSQL Patch Tester says all regression tests are passed, and
I change the status to "Ready for committer."

[1]: https://commitfest.postgresql.org/31/2827/

Thank you for discussing with me.

Hayato Kuroda
FUJITSU LIMITED


From the mentioned thread [2]:

While trying to test a patch that adds a synchronization barrier in 
pgbench [1] on Windows,


Thanks for trying that, I do not have a windows setup for testing, and
the sync code I wrote for Windows is basically blind coding:-(


FYI:

1) It looks like pgbench will no longer support Windows XP due to the
function DeleteSynchronizationBarrier. From
https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-deletesynchronizationbarrier
:

Minimum supported client: Windows 8 [desktop apps only]
Minimum supported server: Windows Server 2012 [desktop apps only]

On Windows Server 2008 R2 (MSVC 2013) the 6th version of the patch
[1] compiled without (new) warnings, but when running pgbench I
got the following error:

The procedure entry point DeleteSynchronizationBarrier could not be
located in the dynamic link library KERNEL32.dll.


IMO, it looks like either old Windows systems should not call new 
functions, or we should throw them a compilation error. (Update 
MIN_WINNT to 0x0602 = Windows 8 in src/include/port/win32.h?) In the 
second case it looks like the documentation should be updated too, see 
doc/src/sgml/installation.sgml:



 PostgreSQL can be expected to work on these operating
 systems: Linux (all recent distributions), Windows (XP and later),
 FreeBSD, OpenBSD, NetBSD, macOS, AIX, HP/UX, and Solaris.
 Other Unix-like systems may also work but are not currently
 being tested.  In most cases, all CPU architectures supported by
 a given operating system will work.  Look in
  below to see if
 there is information
 specific to your operating system, particularly if using an older 
system.



<...>


 The native Windows port requires a 32 or 64-bit version of Windows
 2000 or later. Earlier operating systems do
 not have sufficient infrastructure (but Cygwin may be used on
 those).  MinGW, the Unix-like build tools, and MSYS, a collection
 of Unix tools required to run shell scripts
 like configure, can be downloaded
 from http://www.mingw.org/";>.  Neither is
 required to run the resulting binaries; they are needed only for
 creating the binaries.


[2] 
https://www.postgresql.org/message-id/e5a09b790db21356376b6e73673aa07c%40postgrespro.ru


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: PATCH: Batch/pipelining support for libpq

2020-11-14 Thread Daniel Verite
Alvaro Herrera wrote:

> I adapted the test code to our code style.  I also removed the "timings"
> stuff; I think that's something better left to pgbench.
> 
> (I haven't looked at Daniel's pgbench stuff yet, but I will do that
> next.)

The patch I posted in [1] was pretty simple, but at the time, query
results were always discarded. Now that pgbench can instantiate
variables from query results, a script can do:
  select 1 as var \gset
  select :var;
This kind of sequence wouldn't work in batch mode since it
sends queries before getting results of previous queries.

So maybe \gset should be rejected when inside a batch section.

Or alternatively pgbench should collect results before a variable is
reinjected into a query, thereby stalling the pipeline.
To do this only when necessary, it would have to track read-write
dependencies among variables, which seems overly complicated though.
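The read-write tracking idea can be sketched in a few lines (hedged:
`stall_points` and the per-command (reads, writes) sets are illustrative
models, not pgbench's actual data structures):

```python
def stall_points(commands):
    """commands: one (reads, writes) pair of variable-name sets per
    script command.  Returns the indices where the pipeline must be
    flushed (results collected) before the command can be sent,
    because it reads a variable written earlier in the same batch."""
    pending_writes = set()
    stalls = []
    for i, (reads, writes) in enumerate(commands):
        if reads & pending_writes:
            stalls.append(i)
            pending_writes.clear()  # results collected; values known now
        pending_writes |= writes
    return stalls

# "SELECT 1 AS var \gset" writes {var}; "SELECT :var" reads {var}
print(stall_points([(set(), {"var"}), ({"var"}, set())]))  # [1]
```

The example shows why the simple \gset-then-reuse script above forces a
stall at the second command.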

[1]
https://www.postgresql.org/message-id/b4e34135-2bd9-4b8a-94ca-27d760da2...@manitou-mail.org

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite




Re: pgbench: option delaying queries till connections establishment?

2020-11-14 Thread Fabien COELHO


Hello Marina,


1) It looks like pgbench will no longer support Windows XP due to the
function DeleteSynchronizationBarrier. From
https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-deletesynchronizationbarrier

Minimum supported client: Windows 8 [desktop apps only]
Minimum supported server: Windows Server 2012 [desktop apps only]


Thanks for the test and precise analysis!

Sigh.

I do not think that putting in such version requirements is worth it just 
for the sake of pgbench.


In the attached version, I just comment out the call and add an 
explanation about why it is commented out. If pg overall version 
requirements are changed on windows, then it could be reinstated.


--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 7180fedd65..f02721da0d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -58,8 +58,10 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
-tps = 85.184871 (including connections establishing)
-tps = 85.296346 (excluding connections establishing)
+latency average = 11.013 ms
+latency stddev = 7.351 ms
+initial connection time = 45.758 ms
+tps = 896.967014 (without initial connection establishing)
 
 
   The first six lines report some of the most important parameter
@@ -68,8 +70,7 @@ tps = 85.296346 (excluding connections establishing)
   and number of transactions per client); these will be equal unless the run
   failed before completion.  (In -T mode, only the actual
   number of transactions is printed.)
-  The last two lines report the number of transactions per second,
-  figured with and without counting the time to start database sessions.
+  The last line reports the number of transactions per second.
  
 
   
@@ -2234,22 +2235,22 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
-latency average = 15.844 ms
-latency stddev = 2.715 ms
-tps = 618.764555 (including connections establishing)
-tps = 622.977698 (excluding connections establishing)
+latency average = 10.870 ms
+latency stddev = 7.341 ms
+initial connection time = 30.954 ms
+tps = 907.949122 (without initial connection establishing)
 statement latencies in milliseconds:
-0.002  \set aid random(1, 10 * :scale)
-0.005  \set bid random(1, 1 * :scale)
-0.002  \set tid random(1, 10 * :scale)
-0.001  \set delta random(-5000, 5000)
-0.326  BEGIN;
-0.603  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
-0.454  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
-5.528  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
-7.335  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
-0.371  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
-1.212  END;
+0.001  \set aid random(1, 10 * :scale)
+0.001  \set bid random(1, 1 * :scale)
+0.001  \set tid random(1, 10 * :scale)
+0.000  \set delta random(-5000, 5000)
+0.046  BEGIN;
+0.151  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
+0.107  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
+4.241  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
+5.245  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
+0.102  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
+0.974  END;
 
   
 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 3057665bbe..b8a3e28690 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -292,9 +292,9 @@ typedef struct SimpleStats
  */
 typedef struct StatsData
 {
-	time_t		start_time;		/* interval start time, for aggregates */
-	int64		cnt;			/* number of transactions, including skipped */
-	int64		skipped;		/* number of transactions skipped under --rate
+	pg_time_usec_t	start_time;	/* interval start time, for aggregates */
+	int64			cnt;		/* number of transactions, including skipped */
+	int64			skipped;	/* number of transactions skipped under --rate
  * and --latency-limit */
 	SimpleStats latency;
 	SimpleStats lag;
@@ -417,11 +417,11 @@ typedef struct
 	int			nvariables;		/* number of variables */
 	bool		vars_sorted;	/* are variables sorted by name? */
 
-	/* various times about current transaction */
-	int64		txn_scheduled;	/* scheduled start time of transaction (usec) */
-	int64		sleep_until;	/* scheduled start time of next cmd (usec) */
-	instr_time	txn_begin;		/* used for measuring schedule lag times */
-	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	/* various times about current transaction i

Re: pgbench: option delaying queries till connections establishment?

2020-11-14 Thread Alexander Korotkov
Hi!

On Thu, Feb 27, 2020 at 9:01 PM Andres Freund  wrote:
> I am trying to run a few benchmarks measuring the effects of patch to
> make GetSnapshotData() faster in the face of larger numbers of
> established connections.
>
> Before the patch connection establishment often is very slow due to
> contention. The first few connections are fast, but after that it takes
> increasingly long. The first few connections constantly hold
> ProcArrayLock in shared mode, which then makes it hard for new
> connections to acquire it exclusively (I'm addressing that to a
> significant degree in the patch FWIW).

Hmm...  Let's see the big picture.  You've recently committed a
patchset, which greatly improved the performance of GetSnapshotData().
And you're making further improvements in this direction.  But you're
having trouble measuring the effect, because Postgres is still
stuck on ProcArrayLock.  And in this thread you propose a workaround
for that implemented on the pgbench side.  My very dumb idea is the
following: should we finally give more fair lwlocks a chance rather
than inventing workarounds?

As I remember, your major argument against more fair lwlocks was the
idea that we should fix lwlock use-cases rather than the lwlock
mechanism itself.  But can we expect to fix all the lwlock use-cases
within any reasonable timeframe?  My guess is 'no'.

Links
1. 
https://www.postgresql.org/message-id/CAPpHfdvJhO1qutziOp%3Ddy8TO8Xb4L38BxgKG4RPa1up1Lzh_UQ%40mail.gmail.com

--
Regards,
Alexander Korotkov




Re: Supporting = operator in gin/gist_trgm_ops

2020-11-14 Thread Julien Rouhaud
On Sat, Nov 14, 2020 at 7:58 PM Erik Rijkers  wrote:
>
> On 2020-11-14 12:53, Julien Rouhaud wrote:
> > On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov wrote:
>
> >Note that those indexes may not be as afficient as regulat B-tree
> > indexes
> >for equality operator.
>
>
> 'afficient as regulat'  should be
> 'efficient as regular'
>
>
> Sorry to be nitpicking - it's the one thing I'm really good at :P

Oops indeed :)




Re: extension patch of CREATE OR REPLACE TRIGGER

2020-11-14 Thread Tom Lane
"osumi.takami...@fujitsu.com"  writes:
> [ CREATE_OR_REPLACE_TRIGGER_v18.patch ]

Pushed with some mostly-minor cleanup.

(I know this has been a very long slog.  Congratulations for
seeing it through.)

regards, tom lane




Re: More time spending with "delete pending"

2020-11-14 Thread Justin Pryzby
On Sat, Nov 14, 2020 at 01:00:00PM +0300, Alexander Lakhin wrote:
> As noted in [1], a sensible solution would be putting the same "retry on
> ERROR_ACCESS_DENIED" action in a wrapper for stat().
> And bed90759f brought in master the _pgstat64() function, where such
> error handling should be placed.
> So please look at the patch (based on the previous one made to fix
> bug#16161), that makes the attached test pass.

Your patch introduces a "loops", but doesn't use it to escape the loop.
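The bounded-retry shape being asked for might look like this sketch
(hedged: a Python model with made-up callback names, not the actual C
code in win32stat.c; the point is that the counter must actually bound
the loop):

```python
import time

def stat_with_retry(do_stat, is_delete_pending, max_loops=100, delay=0.01):
    """Retry a stat-like call while the file is in the Windows
    'delete pending' state, giving up after max_loops * delay seconds
    (~1 s with the defaults).  do_stat() raises PermissionError on
    ERROR_ACCESS_DENIED; is_delete_pending() distinguishes the
    transient state from a real permissions problem."""
    loops = 0
    while True:
        try:
            return do_stat()
        except PermissionError:
            # Give up on real permission errors, or once the retry
            # budget is exhausted, instead of looping forever.
            if not is_delete_pending() or loops >= max_loops:
                raise
            time.sleep(delay)
            loops += 1
```

With a stub that fails twice and then succeeds, the wrapper returns on
the third attempt; with one that always fails, it re-raises once the
budget runs out.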

> diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
> index d34182a7b0..922df49125 100644
> --- a/src/backend/utils/adt/genfile.c
> +++ b/src/backend/utils/adt/genfile.c
> @@ -28,6 +28,7 @@
>  #include "funcapi.h"
>  #include "mb/pg_wchar.h"
>  #include "miscadmin.h"
> +#include "port.h"
>  #include "postmaster/syslogger.h"
>  #include "storage/fd.h"
>  #include "utils/acl.h"
> diff --git a/src/port/win32stat.c b/src/port/win32stat.c
> index 4351aa4d08..c2b55c7fa6 100644
> --- a/src/port/win32stat.c
> +++ b/src/port/win32stat.c
> @@ -170,6 +170,7 @@ _pgstat64(const char *name, struct stat *buf)
>   SECURITY_ATTRIBUTES sa;
>   HANDLE  hFile;
>   int ret;
> + int loops = 0;
>  #if _WIN32_WINNT < 0x0600
>   IO_STATUS_BLOCK ioStatus;
>   FILE_STANDARD_INFORMATION standardInfo;
> @@ -182,31 +183,60 @@ _pgstat64(const char *name, struct stat *buf)
>   errno = EINVAL;
>   return -1;
>   }
> -
>   /* fast not-exists check */
>   if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
>   {
> - _dosmaperr(GetLastError());
> - return -1;
> + DWORD   err = GetLastError();
> +
> + if (err != ERROR_ACCESS_DENIED) {
> + _dosmaperr(err);
> + return -1;
> + }
>   }
>  
>   /* get a file handle as lightweight as we can */
>   sa.nLength = sizeof(SECURITY_ATTRIBUTES);
>   sa.bInheritHandle = TRUE;
>   sa.lpSecurityDescriptor = NULL;
> - hFile = CreateFile(name,
> -GENERIC_READ,
> -(FILE_SHARE_READ | FILE_SHARE_WRITE 
> | FILE_SHARE_DELETE),
> -&sa,
> -OPEN_EXISTING,
> -(FILE_FLAG_NO_BUFFERING | 
> FILE_FLAG_BACKUP_SEMANTICS |
> - FILE_FLAG_OVERLAPPED),
> -NULL);
> - if (hFile == INVALID_HANDLE_VALUE)
> + while ((hFile = CreateFile(name,
> +GENERIC_READ,
> +(FILE_SHARE_READ | 
> FILE_SHARE_WRITE | FILE_SHARE_DELETE),
> +&sa,
> +OPEN_EXISTING,
> +
> (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
> +
> FILE_FLAG_OVERLAPPED),
> +NULL)) == 
> INVALID_HANDLE_VALUE)
>   {
>   DWORD   err = GetLastError();
>  
> - CloseHandle(hFile);
> + /*
> +  * ERROR_ACCESS_DENIED is returned if the file is deleted but 
> not yet
> +  * gone (Windows NT status code is STATUS_DELETE_PENDING).  In 
> that
> +  * case we want to wait a bit and try again, giving up after 1 
> second
> +  * (since this condition should never persist very long).  
> However,
> +  * there are other commonly-hit cases that return 
> ERROR_ACCESS_DENIED,
> +  * so care is needed.  In particular that happens if we try to 
> open a
> +  * directory, or of course if there's an actual file-permissions
> +  * problem.  To distinguish these cases, try a stat().  In the
> +  * delete-pending case, it will either also get 
> STATUS_DELETE_PENDING,
> +  * or it will see the file as gone and fail with ENOENT.  In 
> other
> +  * cases it will usually succeed.  The only somewhat-likely 
> case where
> +  * this coding will uselessly wait is if there's a permissions 
> problem
> +  * with a containing directory, which we hope will never happen 
> in any
> +  * performance-critical code paths.
> +  */
> + if (err == ERROR_ACCESS_DENIED)
> + {
> + struct microsoft_native_stat st;
> +
> + if (microsoft_native_stat(name, &st) != 0)
> + {
> + pg_usleep(10);
> + loops++;
> +   

Re: More time spending with "delete pending"

2020-11-14 Thread Alexander Lakhin
15.11.2020 04:11, Justin Pryzby wrote:
> On Sat, Nov 14, 2020 at 01:00:00PM +0300, Alexander Lakhin wrote:
>> As noted in [1], a sensible solution would be putting the same "retry on
>> ERROR_ACCESS_DENIED" action in a wrapper for stat().
>> And bed90759f brought in master the _pgstat64() function, where such
>> error handling should be placed.
>> So please look at the patch (based on the previous one made to fix
>> bug#16161), that makes the attached test pass.
> Your patch introduces a "loops", but doesn't use it to escape the loop.
Indeed, this is my mistake. Please look at the corrected patch (now the
code corresponds to pgwin32_open() as intended).

And it raises another question: what if pg_ls_dir_files() is called for a
directory where hundreds or thousands of files are really inaccessible
due to restrictions?
I mean that using CreateFile() in the modified stat() implementation can
be rather expensive for an arbitrary file (and worse yet, for many files).
On the positive side, for now pg_ls_dir_files() is called only from
pg_ls_logdir, pg_ls_waldir, pg_ls_tmpdir, pg_ls_archive_statusdir, where
having a bunch of files that are inaccessible for the postgres user is
not expected anyway.

But probably getting directory contents with correct file sizes (>4GB)
in pg_ls_dir_files() can be implemented without calling
CreateFile()/stat() at all (as ProcMon shows, the "dir" command doesn't
call CreateFile() (or any other system function) for each file in the
target directory).

Best regards,
Alexander
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..922df49125 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -28,6 +28,7 @@
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
+#include "port.h"
 #include "postmaster/syslogger.h"
 #include "storage/fd.h"
 #include "utils/acl.h"
diff --git a/src/port/win32stat.c b/src/port/win32stat.c
index 4351aa4d08..2f9e111627 100644
--- a/src/port/win32stat.c
+++ b/src/port/win32stat.c
@@ -170,6 +170,7 @@ _pgstat64(const char *name, struct stat *buf)
 	SECURITY_ATTRIBUTES sa;
 	HANDLE		hFile;
 	int			ret;
+	int			loops = 0;
 #if _WIN32_WINNT < 0x0600
 	IO_STATUS_BLOCK ioStatus;
 	FILE_STANDARD_INFORMATION standardInfo;
@@ -182,31 +183,63 @@ _pgstat64(const char *name, struct stat *buf)
 		errno = EINVAL;
 		return -1;
 	}
-
 	/* fast not-exists check */
 	if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
 	{
-		_dosmaperr(GetLastError());
-		return -1;
+		DWORD		err = GetLastError();
+
+		if (err != ERROR_ACCESS_DENIED) {
+			_dosmaperr(err);
+			return -1;
+		}
 	}
 
 	/* get a file handle as lightweight as we can */
 	sa.nLength = sizeof(SECURITY_ATTRIBUTES);
 	sa.bInheritHandle = TRUE;
 	sa.lpSecurityDescriptor = NULL;
-	hFile = CreateFile(name,
-	   GENERIC_READ,
-	   (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
-	   &sa,
-	   OPEN_EXISTING,
-	   (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
-		FILE_FLAG_OVERLAPPED),
-	   NULL);
-	if (hFile == INVALID_HANDLE_VALUE)
+	while ((hFile = CreateFile(name,
+			   GENERIC_READ,
+			   (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
+			   &sa,
+			   OPEN_EXISTING,
+			   (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
+			   FILE_FLAG_OVERLAPPED),
+			   NULL)) == INVALID_HANDLE_VALUE)
 	{
 		DWORD		err = GetLastError();
 
-		CloseHandle(hFile);
+		/*
+		 * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
+		 * gone (Windows NT status code is STATUS_DELETE_PENDING).  In that
+		 * case we want to wait a bit and try again, giving up after 1 second
+		 * (since this condition should never persist very long).  However,
+		 * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
+		 * so care is needed.  In particular that happens if we try to open a
+		 * directory, or of course if there's an actual file-permissions
+		 * problem.  To distinguish these cases, try a stat().  In the
+		 * delete-pending case, it will either also get STATUS_DELETE_PENDING,
+		 * or it will see the file as gone and fail with ENOENT.  In other
+		 * cases it will usually succeed.  The only somewhat-likely case where
+		 * this coding will uselessly wait is if there's a permissions problem
+		 * with a containing directory, which we hope will never happen in any
+		 * performance-critical code paths.
+		 */
+		if (err == ERROR_ACCESS_DENIED)
+		{
+			if (loops < 10)
+			{
+struct microsoft_native_stat st;
+
+if (microsoft_native_stat(name, &st) != 0)
+{
+	pg_usleep(100000);
+	loops++;
+	continue;
+}
+			}
+		}
+
 		_dosmaperr(err);
 		return -1;
 	}


Re: Add docs stub for recovery.conf

2020-11-14 Thread David G. Johnston
On Fri, Nov 13, 2020 at 10:42 AM Bruce Momjian  wrote:

> I think the big problem, and I have seen this repeatedly, is showing up
> with a patch without discussing whether people actually want the
> feature.  I know it is a doc issue, but our TODO list has the order as:
>
> Desirability -> Design -> Implement -> Test -> Review -> Commit


> and there is a reason for that.  When you appear with a patch, you are
> already far down the steps, and you have to back up to explain why it is
> useful.
>

That process is designed to prevent people from being exposed to wasted
effort and hard feelings.  The choice to follow it individually, instead of
collectively, doesn't diminish the value of the end result.

I generally agree with Craig's proposed solution here.  It doesn't add any
cognitive load to new users as they will not see the obsolete features
appendix in the normal course of their reading.

To the particular point regarding renaming features - this situation is not
an instance of a rename but rather a feature removal.  To blindly apply the
reasoning and decision made for renaming to removal is not reasonable.
From that observation (and the commentary below) follows the conclusion
that this appendix shouldn't include renaming.

On the point of renaming, my suggestion would be to have the documentation
directory provide a file of all renaming for which redirects should be
performed.  pgweb would source that file and actually establish the
redirects on the main website.  Comments in the file can describe to a
curious user why the name change was needed.  Though that honestly seems a
bit overkill; for rename, the content as a whole still exists and a comment
therein can talk about the renaming.  Users of the public website would
still get the benefit of redirects, and there isn't any practical reason
for people building documentation from source to want to establish such
redirects even if they were provided the data in the form of the
aforementioned file.

I believe there is probably room for more discussion regarding the value of
providing a limited view of history in the publicly facing documentation
but that seems outside the scope of this patch.

David J.


Re: POC: Cleaning up orphaned files using undo logs

2020-11-14 Thread Amit Kapila
On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska  wrote:
>
> Amit Kapila  wrote:
>
> > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska  wrote:
> > >
> > >
> > > No background undo
> > > --
> > >
> > > Reduced complexity of the patch seems to be the priority at the moment. 
> > > Amit
> > > suggested that cleanup of an orphaned relation file is simple enough to be
> > > done on foreground and I agree.
> > >
> >
> > Yeah, I think we should try and see if we can make it work but I
> > noticed that there are few places like AbortOutOfAnyTransaction where
> > we have the assumption that undo will be executed in the background.
> > We need to deal with it.
>
> I think this is o.k. if we always check for unapplied undo during startup.
>

Hmm, how is it ok to leave undo (and rely on startup) unless it is a
PANIC error. IIRC, this path is invoked in non-panic errors as well.
Basically, we won't be able to discard such an undo which doesn't seem
like a good idea.

> > > "undo worker" is still there, but it only processes undo requests after 
> > > server
> > > restart because relation data can only be changed in a transaction - it 
> > > seems
> > > cleaner to launch a background worker for this than to hack the startup
> > > process.
> > >
> >
> > But, I see there are still multiple undoworkers that are getting
> > launched and I am not sure if that works correctly because a
> > particular undoworker is connected to a database and then it starts
> > processing all the pending undo.
>
> Each undo worker applies only transactions for its own database, see
> ProcessExistingUndoRequests():
>
> /* We can only process undo of the database we are connected to. */
> if (xact_hdr.dboid != MyDatabaseId)
> continue;
>
> Nevertheless, as I've just mentioned in my response to Thomas, I admit that we
> should try to live w/o the undo worker altogether.
>

Okay, but keep in mind that there could be a large amount of undo
(unlike redo which has some limit as we can replay it from the last
checkpoint) which needs to be processed but it might be okay to live
with that for now. Another thing is that it seems we need to connect
to the database to perform it, which might appear a bit odd: we
don't allow users to connect to the database, but internally we are
connecting to it. These are just some points to consider while finalizing
the solution to this.

> > > Discarding
> > > --
> > >
> > > There's no discard worker for the URS infrastructure yet. I thought about
> > > discarding the undo log during checkpoint, but checkpoint should probably 
> > > do
> > > more straightforward tasks than the calculation of a new discard pointer 
> > > for
> > > each undo log, so a background worker is needed. A few notes on that:
> > >
> > >   * until the zheap AM gets added, only the transaction that creates the 
> > > undo
> > > records needs to access them. This assumption should make the 
> > > discarding
> > > algorithm a bit simpler. Note that with zheap, the other transactions 
> > > need
> > > to look for old versions of tuples, so the concept of 
> > > oldestXidHavingUndo
> > > variable is needed there.
> > >
> > >   * it's rather simple to pass pointer the URS pointer to the discard 
> > > worker
> > > when transaction either committed or the undo has been executed.
> > >
> >
> > Why can't we have a separate discard worker which keeps on scanning
> > the undorecords and discard accordingly? Giving the onus of foreground
> > process might be tricky because say discard worker is not up to speed
> > and we ran out of space to pass such information for each commit/abort
> > request.
>
> Sure, there should be a discard worker. The question is how to make its work
> efficient. The initial run after restart probably needs to scan everything
> between 'discard' and 'insert' pointers,
>

Yeah, such an initial scan would be helpful to identify pending aborts
and allow them to be processed.

> but then it should process only the
> parts created by individual transactions.
>

Yeah, it needs to process transaction-by-transaction to see which all
we can discard. Also, note that in Single-User mode we need to discard
undo after commit. I think we also need to maintain
oldestXidHavingUndo for CLOG truncation and transaction-wraparound. We
can't allow CLOG truncation for the transaction whose undo is not
discarded as that could be required by some other transaction. For
similar reasons, we can't allow transaction-wraparound and we need to
integrate this into the existing xid-allocation mechanism. I have
found one of the old patches
(Allow-execution-and-discard-of-undo-by-background-wo), attached, where
all these concepts were implemented. Unless you have a reason why we
don't need these things, you might want to refer to the attached patch
to either re-use or borrow these ideas. There are a few other things
like undorequest and some undoworker mechanism which you can ignore.

-- 
With Regards,

Re: POC: Cleaning up orphaned files using undo logs

2020-11-14 Thread Amit Kapila
On Sun, Nov 15, 2020 at 11:24 AM Amit Kapila  wrote:
>
> On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska  wrote:
> >
> > Amit Kapila  wrote:
> >
> > > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska  wrote:
> > > >
> > > >
> > > > No background undo
> > > > --
> > > >
> > > > Reduced complexity of the patch seems to be the priority at the moment. 
> > > > Amit
> > > > suggested that cleanup of an orphaned relation file is simple enough to 
> > > > be
> > > > done on foreground and I agree.
> > > >
> > >
> > > Yeah, I think we should try and see if we can make it work but I
> > > noticed that there are few places like AbortOutOfAnyTransaction where
> > > we have the assumption that undo will be executed in the background.
> > > We need to deal with it.
> >
> > I think this is o.k. if we always check for unapplied undo during startup.
> >
>
> Hmm, how it is ok to leave undo (and rely on startup) unless it is a
> PANIC error. IIRC, this path is invoked in non-panic errors as well.
> Basically, we won't be able to discard such an undo which doesn't seem
> like a good idea.
>
> > > > "undo worker" is still there, but it only processes undo requests after 
> > > > server
> > > > restart because relation data can only be changed in a transaction - it 
> > > > seems
> > > > cleaner to launch a background worker for this than to hack the startup
> > > > process.
> > > >
> > >
> > > But, I see there are still multiple undoworkers that are getting
> > > launched and I am not sure if that works correctly because a
> > > particular undoworker is connected to a database and then it starts
> > > processing all the pending undo.
> >
> > Each undo worker applies only transactions for its own database, see
> > ProcessExistingUndoRequests():
> >
> > /* We can only process undo of the database we are connected to. */
> > if (xact_hdr.dboid != MyDatabaseId)
> > continue;
> >
> > Nevertheless, as I've just mentioned in my response to Thomas, I admit that 
> > we
> > should try to live w/o the undo worker altogether.
> >
>
> Okay, but keep in mind that there could be a large amount of undo
> (unlike redo which has some limit as we can replay it from the last
> checkpoint) which needs to be processed but it might be okay to live
> with that for now. Another thing is that it seems we need to connect
> to the database to perform it which might appear a bit odd that we
> don't allow users to connect to the database but internally we are
> connecting it. These are just some points to consider while finalizing
> the solution to this.
>
> > > > Discarding
> > > > --
> > > >
> > > > There's no discard worker for the URS infrastructure yet. I thought 
> > > > about
> > > > discarding the undo log during checkpoint, but checkpoint should 
> > > > probably do
> > > > more straightforward tasks than the calculation of a new discard 
> > > > pointer for
> > > > each undo log, so a background worker is needed. A few notes on that:
> > > >
> > > >   * until the zheap AM gets added, only the transaction that creates 
> > > > the undo
> > > > records needs to access them. This assumption should make the 
> > > > discarding
> > > > algorithm a bit simpler. Note that with zheap, the other 
> > > > transactions need
> > > > to look for old versions of tuples, so the concept of 
> > > > oldestXidHavingUndo
> > > > variable is needed there.
> > > >
> > > >   * it's rather simple to pass pointer the URS pointer to the discard 
> > > > worker
> > > > when transaction either committed or the undo has been executed.
> > > >
> > >
> > > Why can't we have a separate discard worker which keeps on scanning
> > > the undorecords and discard accordingly? Giving the onus of foreground
> > > process might be tricky because say discard worker is not up to speed
> > > and we ran out of space to pass such information for each commit/abort
> > > request.
> >
> > Sure, there should be a discard worker. The question is how to make its work
> > efficient. The initial run after restart probably needs to scan everything
> > between 'discard' and 'insert' pointers,
> >
>
> Yeah, such an initial scan would be helpful to identify pending aborts
> and allow them to be processed.
>
> > but then it should process only the
> > parts created by individual transactions.
> >
>
> Yeah, it needs to process transaction-by-transaction to see which all
> we can discard. Also, note that in Single-User mode we need to discard
> undo after commit. I think we also need to maintain
> oldestXidHavingUndo for CLOG truncation and transaction-wraparound. We
> can't allow CLOG truncation for the transaction whose undo is not
> discarded as that could be required by some other transaction. For
> similar reasons, we can't allow transaction-wraparound and we need to
> integrate this into the existing xid-allocation mechanism. I have
> found one of the old patch
> (Allow-execution-and-discard-of-undo-by-background-wo) attached

Re: Supporting = operator in gin/gist_trgm_ops

2020-11-14 Thread Alexander Korotkov
On Sat, Nov 14, 2020 at 8:26 PM Julien Rouhaud  wrote:
> On Sat, Nov 14, 2020 at 7:58 PM Erik Rijkers  wrote:
> >
> > On 2020-11-14 12:53, Julien Rouhaud wrote:
> > > On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov
> > >  >
> >
> > >Note that those indexes may not be as afficient as regulat B-tree
> > > indexes
> > >for equality operator.
> >
> >
> > 'afficient as regulat'  should be
> > 'efficient as regular'
> >
> >
> > Sorry to be nitpicking - it's the one thing I'm really good at :P
>
> Oops indeed :)

Pushed with all the corrections above.  Thanks!

--
Regards,
Alexander Korotkov




Re: logical streaming of xacts via test_decoding is broken

2020-11-14 Thread Dilip Kumar
On Fri, Nov 13, 2020 at 3:18 PM Amit Kapila  wrote:
>
> On Thu, Nov 12, 2020 at 3:10 PM Dilip Kumar  wrote:
> >
> > On Tue, Nov 10, 2020 at 7:20 PM Amit Kapila  wrote:
> > >
> > > On Tue, Nov 10, 2020 at 2:25 PM Dilip Kumar  wrote:
> >
> > > 3. Can you please prepare a separate patch for test case changes so
> > > that it would be easier to verify that it fails without the patch and
> > > passed after the patch?
> >
> > Done
> >
>
> Few comments:
> =
> 1.
>-- consume DDL
>SELECT data FROM pg_logical_slot_get_changes('isolation_slot',
> NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
> -  CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS
> 'select array_agg(md5(g::text))::text from generate_series(1, 8)
> g';
> +  CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS
> 'select array_agg(md5(g::text))::text from generate_series(1, 6)
> g';
>  }
>
>
> Is there a reason for this change? I think probably here a lesser
> number of rows are sufficient to serve the purpose of the test but I
> am not sure if it is related to this patch or there is any other
> reason behind this change?

I think I changed it for some experiment and it got included in the
patch, so I reverted this.

> 2.
> +typedef struct
> +{
> + bool xact_wrote_changes;
> + bool stream_wrote_changes;
> +} TestDecodingTxnData;
> +
>
> I think here a comment explaining why we need this as a separate
> structure would be better and probably explain why we need two
> different members.

Done

> 3.
> pg_decode_commit_txn()
> {
> ..
> - if (data->skip_empty_xacts && !data->xact_wrote_changes)
> + pfree(txndata);
> + txn->output_plugin_private = false;
> +
>
> Here, don't we need to set txn->output_plugin_private as NULL as it is
> a pointer and we do explicitly test it for being NULL at other places?
> Also, change at other places where it is set as false.

Fixed

> 4.
> @@ -592,10 +610,24 @@ pg_decode_stream_start(LogicalDecodingContext *ctx,
>  ReorderBufferTXN *txn)
>  {
>   TestDecodingData *data = ctx->output_plugin_private;
> + TestDecodingTxnData *txndata = txn->output_plugin_private;
>
> - data->xact_wrote_changes = false;
> + /*
> + * If this is the first stream for the txn then allocate the txn plugin
> + * data and set the xact_wrote_changes to false.
> + */
> + if (txndata == NULL)
> + {
> + txndata =
>
> As we are explicitly testing for NULL here, isn't it better to
> explicitly initialize 'output_plugin_private' with NULL in
> ReorderBufferGetTXN?

Done

> 5.
> @@ -633,8 +666,18 @@ pg_decode_stream_abort(LogicalDecodingContext *ctx,
>  XLogRecPtr abort_lsn)
>  {
>   TestDecodingData *data = ctx->output_plugin_private;
> + ReorderBufferTXN *toptxn = txn->toptxn ? txn->toptxn : txn;
> + TestDecodingTxnData *txndata = toptxn->output_plugin_private;
> + bool xact_wrote_changes = txndata->xact_wrote_changes;
>
> - if (data->skip_empty_xacts && !data->xact_wrote_changes)
> + if (txn->toptxn == NULL)
> + {
> + Assert(txn->output_plugin_private != NULL);
> + pfree(txndata);
> + txn->output_plugin_private = false;
> + }
> +
>
> Here, if we are expecting 'output_plugin_private' to be set only for
> toptxn then the Assert and reset should happen for toptxn? I find the
> changes in this function a bit unclear so probably adding a comment
> here could help.

I have added the comments.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 037ad84600e308c03fb71d21c2550deccb0bc2aa Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Thu, 12 Nov 2020 15:00:25 +0530
Subject: [PATCH v3 2/2] Test case to test the interleaved empty transactions

---
 contrib/test_decoding/expected/concurrent_stream.out | 5 +++--
 contrib/test_decoding/specs/concurrent_stream.spec   | 6 +-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/contrib/test_decoding/expected/concurrent_stream.out b/contrib/test_decoding/expected/concurrent_stream.out
index e731d13..6f8b217 100644
--- a/contrib/test_decoding/expected/concurrent_stream.out
+++ b/contrib/test_decoding/expected/concurrent_stream.out
@@ -1,11 +1,12 @@
-Parsed test spec with 2 sessions
+Parsed test spec with 3 sessions
 
-starting permutation: s0_begin s0_ddl s1_ddl s1_begin s1_toast_insert s1_commit s1_get_stream_changes
+starting permutation: s0_begin s0_ddl s1_ddl s1_begin s1_toast_insert s2_ddl s1_commit s1_get_stream_changes
 step s0_begin: BEGIN;
 step s0_ddl: CREATE TABLE stream_test1(data text);
 step s1_ddl: CREATE TABLE stream_test(data text);
 step s1_begin: BEGIN;
 step s1_toast_insert: INSERT INTO stream_test SELECT large_val();
+step s2_ddl: CREATE TABLE stream_test2(data text);
 step s1_commit: COMMIT;
 step s1_get_stream_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL,NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
 data   
diff --git a/contrib/test_decoding/specs/concurrent_stream.spec b/contrib/test_decoding/specs/concurrent_stream.spe

Re: Refactor pg_rewind code and make it work against a standby

2020-11-14 Thread Tom Lane
Not sure if you noticed, but piculet has twice failed the
007_standby_source.pl test that was added by 9c4f5192f:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2020-11-15%2006%3A00%3A11
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2020-11-13%2011%3A20%3A10

and francolin failed once:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=francolin&dt=2020-11-12%2018%3A57%3A33

These failures look the same:

#   Failed test 'table content after rewind and insert: query result matches'
#   at t/007_standby_source.pl line 160.
#  got: 'in A
# in A, before promotion
# in A, after C was promoted
# '
# expected: 'in A
# in A, before promotion
# in A, after C was promoted
# in A, after rewind
# '
# Looks like you failed 1 test of 3.
[11:27:01] t/007_standby_source.pl ... 
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/3 subtests 

Now, I'm not sure what to make of that, but I can't help noticing that
piculet uses --disable-atomics while francolin uses --disable-spinlocks.
That leads the mind towards some kind of low-level synchronization
bug ...

regards, tom lane




Re: Refactor pg_rewind code and make it work against a standby

2020-11-14 Thread Tom Lane
I wrote:
> Not sure if you noticed, but piculet has twice failed the
> 007_standby_source.pl test that was added by 9c4f5192f:
> ...
> Now, I'm not sure what to make of that, but I can't help noticing that
> piculet uses --disable-atomics while francolin uses --disable-spinlocks.
> That leads the mind towards some kind of low-level synchronization
> bug ...

Or, maybe it's less mysterious than that.  The failure looks like we
have not waited long enough for the just-inserted row to get replicated
to node C.  That wait is implemented as

$lsn = $node_a->lsn('insert');
$node_b->wait_for_catchup('node_c', 'write', $lsn);

which looks fishy ... shouldn't wait_for_catchup be told to
wait for replay of that LSN, not just write-the-WAL?

regards, tom lane




Re: Supporting = operator in gin/gist_trgm_ops

2020-11-14 Thread Julien Rouhaud
On Sun, Nov 15, 2020 at 1:55 PM Alexander Korotkov  wrote:
>
> On Sat, Nov 14, 2020 at 8:26 PM Julien Rouhaud  wrote:
> > On Sat, Nov 14, 2020 at 7:58 PM Erik Rijkers  wrote:
> > >
> > > On 2020-11-14 12:53, Julien Rouhaud wrote:
> > > > On Sat, Nov 14, 2020 at 6:07 PM Alexander Korotkov
> > > >  >
> > >
> > > >Note that those indexes may not be as afficient as regulat B-tree
> > > > indexes
> > > >for equality operator.
> > >
> > >
> > > 'afficient as regulat'  should be
> > > 'efficient as regular'
> > >
> > >
> > > Sorry to be nitpicking - it's the one thing I'm really good at :P
> >
> > Oops indeed :)
>
> Pushed with all the corrections above.  Thanks!

Thanks a lot!




Re: Remove unused variable from SharedSort

2020-11-14 Thread Bharath Rupireddy
On Thu, Nov 12, 2020 at 5:29 PM Dilip Kumar  wrote:
>
> While going through the code I noticed that the nTapes member in
> SharedSort is unused.  This is just initialized with nworkers but
> never used.  The attached patch removes this variable.
>

We could have used that variable for an assert like
Assert(state->worker <= shared->nTapes) in worker_freeze_result_tape()
before the assignment shared->tapes[state->worker] = output;, since
state->worker is sometimes set to -1. But it seems we reach
worker_freeze_result_tape() only when WORKER(state) is true, so we
don't need that extra Assert, and removing the nTapes variable makes sense
to me.

Patch looks good to me. The regression tests (make check and make
check-world) ran successfully.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com