Re: PG v12.2 - Setting jit_above_cost is causing the server to crash

2020-02-23 Thread Aditya Toshniwal
Hi Andres,

On Mon, Feb 24, 2020 at 12:46 PM Andres Freund  wrote:

>
> Hi,
>
> On 2020-02-24 12:16:08 +0530, Aditya Toshniwal wrote:
> > The PG 12.2 server is crashing on setting the jit_above_cost param. Below
> > is the output.
>
> > postgres=# SELECT count(*) FROM pg_class;
> >
> > server closed the connection unexpectedly
> >
> > This probably means the server terminated abnormally
> >
> > before or while processing the request.
> >
> > The connection to the server was lost. Attempting reset: Failed.
>
> This isn't reproducible here. Are you sure that you're running on a
> clean installation?
>
Yes, I did a fresh installation using the installer provided here:
https://www.enterprisedb.com/downloads/postgresql

>
> If not, we'd at least need a backtrace to figure out what's going on.
>
Please let me know how I can provide the required info.

>
> Greetings,
>
> Andres Freund
>


-- 
Thanks and Regards,
Aditya Toshniwal
pgAdmin Hacker | Sr. Software Engineer | EnterpriseDB India | Pune
"Don't Complain about Heat, Plant a TREE"


Re: ALTER TABLE ... SET STORAGE does not propagate to indexes

2020-02-23 Thread Peter Eisentraut

On 2020-01-06 13:32, Peter Eisentraut wrote:

Attached is a patch that attempts to fix this by propagating the storage
change to existing indexes.  This triggers a few regression test
failures (going from no error to error), which I attempted to fix up,
but I haven't analyzed what the tests were trying to do, so it might
need another look.


Attached is a more polished patch, with tests.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From b46e75ffb1bed1228fbde3f9ceedd04e105699cb Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 24 Feb 2020 08:24:15 +0100
Subject: [PATCH v2] Propagate ALTER TABLE ... SET STORAGE to indexes

When creating a new index, the attstorage setting of the table column
is copied to regular (non-expression) index columns.  But a later
ALTER TABLE ... SET STORAGE is not propagated to indexes, thus
creating an inconsistent and undumpable state.

Discussion: 
https://www.postgresql.org/message-id/flat/9765d72b-37c0-06f5-e349-2a580aafd989%402ndquadrant.com
---
 contrib/test_decoding/expected/toast.out  |  4 +-
 contrib/test_decoding/sql/toast.sql   |  4 +-
 src/backend/commands/tablecmds.c  | 47 +++
 src/test/regress/expected/alter_table.out | 20 ++
 src/test/regress/expected/vacuum.out  |  4 +-
 src/test/regress/sql/alter_table.sql  |  6 +++
 src/test/regress/sql/vacuum.sql   |  4 +-
 7 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/contrib/test_decoding/expected/toast.out 
b/contrib/test_decoding/expected/toast.out
index 91a9a1e86d..2baa06f304 100644
--- a/contrib/test_decoding/expected/toast.out
+++ b/contrib/test_decoding/expected/toast.out
@@ -305,8 +305,8 @@ ALTER TABLE toasted_several REPLICA IDENTITY FULL;
 ALTER TABLE toasted_several ALTER COLUMN toasted_key SET STORAGE EXTERNAL;
 ALTER TABLE toasted_several ALTER COLUMN toasted_col1 SET STORAGE EXTERNAL;
 ALTER TABLE toasted_several ALTER COLUMN toasted_col2 SET STORAGE EXTERNAL;
-INSERT INTO toasted_several(toasted_key) VALUES(repeat('9876543210', 1));
-SELECT pg_column_size(toasted_key) > 2^16 FROM toasted_several;
+INSERT INTO toasted_several(toasted_key) VALUES(repeat('9876543210', 269));
+SELECT pg_column_size(toasted_key) > 2^11 FROM toasted_several;
  ?column? 
 ----------
  t
diff --git a/contrib/test_decoding/sql/toast.sql 
b/contrib/test_decoding/sql/toast.sql
index ef11d2bfff..5cf6b87b3a 100644
--- a/contrib/test_decoding/sql/toast.sql
+++ b/contrib/test_decoding/sql/toast.sql
@@ -279,8 +279,8 @@ CREATE TABLE toasted_several (
 ALTER TABLE toasted_several ALTER COLUMN toasted_col1 SET STORAGE EXTERNAL;
 ALTER TABLE toasted_several ALTER COLUMN toasted_col2 SET STORAGE EXTERNAL;
 
-INSERT INTO toasted_several(toasted_key) VALUES(repeat('9876543210', 1));
-SELECT pg_column_size(toasted_key) > 2^16 FROM toasted_several;
+INSERT INTO toasted_several(toasted_key) VALUES(repeat('9876543210', 269));
+SELECT pg_column_size(toasted_key) > 2^11 FROM toasted_several;
 
 SELECT regexp_replace(data, '^(.{100}).*(.{100})$', '\1..\2') FROM 
pg_logical_slot_peek_changes('regression_slot', NULL, NULL, 'include-xids', 
'0', 'skip-empty-xacts', '1');
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b7c8d663fc..8c24a7a0e3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -7383,6 +7383,7 @@ ATExecSetStorage(Relation rel, const char *colName, Node 
*newValue, LOCKMODE loc
Form_pg_attribute attrtuple;
AttrNumber  attnum;
ObjectAddress address;
+   ListCell   *lc;
 
Assert(IsA(newValue, String));
storagemode = strVal(newValue);
@@ -7442,6 +7443,52 @@ ATExecSetStorage(Relation rel, const char *colName, Node 
*newValue, LOCKMODE loc
 
heap_freetuple(tuple);
 
+   /*
+* Apply the change to indexes as well (only for simple index columns,
+* matching behavior of index.c ConstructTupleDescriptor()).
+*/
+   foreach(lc, RelationGetIndexList(rel))
+   {
+   Oid indexoid = lfirst_oid(lc);
+   Relationindrel;
+   AttrNumber  indattnum = 0;
+
+   indrel = index_open(indexoid, lockmode);
+
+   for (int i = 0; i < indrel->rd_index->indnatts; i++)
+   {
+   if (indrel->rd_index->indkey.values[i] == attnum)
+   {
+   indattnum = i + 1;
+   break;
+   }
+   }
+
+   if (indattnum == 0)
+   {
+   index_close(indrel, lockmode);
+   continue;
+   }
+
+   tuple = SearchSysCacheCopyAttNum(RelationGetRelid(indrel), 
indattnum);
+
+   if (HeapTupleIsValid(tuple))
+   {
+   attrtuple = 

Re: PG v12.2 - Setting jit_above_cost is causing the server to crash

2020-02-23 Thread Andres Freund


Hi,

On 2020-02-24 12:16:08 +0530, Aditya Toshniwal wrote:
> The PG 12.2 server is crashing on setting the jit_above_cost param. Below
> is the output.

> postgres=# SELECT count(*) FROM pg_class;
> 
> server closed the connection unexpectedly
> 
> This probably means the server terminated abnormally
> 
> before or while processing the request.
> 
> The connection to the server was lost. Attempting reset: Failed.

This isn't reproducible here. Are you sure that you're running on a
clean installation?

If not, we'd at least need a backtrace to figure out what's going on.

Greetings,

Andres Freund




Re: PG v12.2 - Setting jit_above_cost is causing the server to crash

2020-02-23 Thread Aditya Toshniwal
++hackers.

On Mon, Feb 24, 2020 at 12:16 PM Aditya Toshniwal <
aditya.toshni...@enterprisedb.com> wrote:

> Hi Team,
>
> The PG 12.2 server is crashing on setting the jit_above_cost param. Below
> is the output.
>
> postgres=# select version();
>
>                                version
> ------------------------------------------------------------------
>  PostgreSQL 12.2 on x86_64-apple-darwin, compiled by Apple LLVM version
>  6.0 (clang-600.0.54) (based on LLVM 3.5svn), 64-bit
> (1 row)
>
>
> postgres=# SET jit_above_cost=10;
>
> SET
>
> postgres=# SELECT count(*) FROM pg_class;
>
> server closed the connection unexpectedly
>
> This probably means the server terminated abnormally
>
> before or while processing the request.
>
> The connection to the server was lost. Attempting reset: Failed.
>
> !>
>
> --
> Thanks and Regards,
> Aditya Toshniwal
> pgAdmin Hacker | Sr. Software Engineer | EnterpriseDB India | Pune
> "Don't Complain about Heat, Plant a TREE"
>


-- 
Thanks and Regards,
Aditya Toshniwal
pgAdmin Hacker | Sr. Software Engineer | EnterpriseDB India | Pune
"Don't Complain about Heat, Plant a TREE"


Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

2020-02-23 Thread Thomas Munro
On Mon, Feb 24, 2020 at 7:42 PM Thomas Munro  wrote:
> On Mon, Feb 24, 2020 at 12:24 PM Tom Lane  wrote:
> > On reflection, trying to make ReserveExternalFD serve two different
> > use-cases was pretty messy.  Here's a version that splits it into two
> > functions.  I also took the trouble to fix dblink.
>
> +/*
> + * We don't want more than max_safe_fds / 3 FDs to be consumed for
> + * "external" FDs.
> + */
> +if (numExternalFDs < max_safe_fds / 3)

I suppose there may be users who have set ulimit -n high enough to
support an FDW workload that connects to very many hosts, who will now
need to set max_files_per_process higher to avoid the new error now
that we're doing this accounting.  That doesn't seem to be a problem
in itself, but I wonder if the error message should make it clearer
that it's our limit they hit here.
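
For readers following along, the accounting behind that check is tiny; here
is a minimal sketch in the spirit of Tom's patch downthread (the names
AcquireExternalFD/ReleaseExternalFD match that patch, but this body is
illustrative only and assumes fd.c context for max_safe_fds and Assert):

/*
 * Sketch of fd.c's "external FD" accounting.  External consumers (a
 * PGconn, an epoll/kqueue descriptor, ...) reserve a slot before opening
 * their FD and release it when the FD is closed.
 */
static int	numExternalFDs = 0;

bool
AcquireExternalFD(void)
{
	/* Cap external FDs at a third of what fd.c considers safe. */
	if (numExternalFDs < max_safe_fds / 3)
	{
		numExternalFDs++;
		return true;
	}
	return false;				/* caller reports "too many open files" */
}

void
ReleaseExternalFD(void)
{
	Assert(numExternalFDs > 0);
	numExternalFDs--;
}

A caller that gets false back is expected to fail cleanly, which is why the
wording of that error matters to users who hit the cap.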




Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

2020-02-23 Thread Thomas Munro
On Mon, Feb 24, 2020 at 12:24 PM Tom Lane  wrote:
> On reflection, trying to make ReserveExternalFD serve two different
> use-cases was pretty messy.  Here's a version that splits it into two
> functions.  I also took the trouble to fix dblink.

+/*
+ * We don't want more than max_safe_fds / 3 FDs to be consumed for
+ * "external" FDs.
+ */
+if (numExternalFDs < max_safe_fds / 3)

This looks pretty reasonable to me.

I'll have a new patch set to create a common WES at startup over on
that other thread in a day or two.




Re: Shouldn't GSSAPI and SSL code use FeBeWaitSet?

2020-02-23 Thread Thomas Munro
On Mon, Feb 24, 2020 at 4:49 PM Thomas Munro  wrote:
> While working on a patch to reuse a common WaitEventSet for latch
> waits, I noticed that be-secure-gssapi.c and be-secure-openssl.c don't
> use FeBeWaitSet.  They probably should, for consistency with
> be-secure.c, because that surely holds the socket they want, no?

Hmm.  Perhaps it's like that because they're ignoring their latch
(though they pass it in just because that interface requires it).  So
then why not reset it and process read interrupts, like be-secure.c?




Shouldn't GSSAPI and SSL code use FeBeWaitSet?

2020-02-23 Thread Thomas Munro
Hi,

While working on a patch to reuse a common WaitEventSet for latch
waits, I noticed that be-secure-gssapi.c and be-secure-openssl.c don't
use FeBeWaitSet.  They probably should, for consistency with
be-secure.c, because that surely holds the socket they want, no?  The
attached patch passes the "ssl" and "kerberos" tests and reaches that
code, confirmed by adding some log messages.

I'm not actually sure what the rationale is for reporting "terminating
connection due to unexpected postmaster exit" here but not elsewhere.
I copied the error from be-secure.c.
From d62e9337f201c7bed8040d72ff249900cf15f169 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Fri, 21 Feb 2020 21:47:13 +1300
Subject: [PATCH] Use FeBeWaitSet for GSSAPI and SSL waits.

These routines are waiting on the FE/BE socket, so they should
use the existing WaitEventSet for that.
---
 src/backend/libpq/be-secure-gssapi.c  | 37 +--
 src/backend/libpq/be-secure-openssl.c | 19 +++---
 2 files changed, 44 insertions(+), 12 deletions(-)

diff --git a/src/backend/libpq/be-secure-gssapi.c b/src/backend/libpq/be-secure-gssapi.c
index c25cfda0db..8ec5130cc5 100644
--- a/src/backend/libpq/be-secure-gssapi.c
+++ b/src/backend/libpq/be-secure-gssapi.c
@@ -379,7 +379,7 @@ be_gssapi_read(Port *port, void *ptr, size_t len)
 
 /*
  * Read the specified number of bytes off the wire, waiting using
- * WaitLatchOrSocket if we would block.
+ * FeBeWaitSet if we would block.
  *
  * Results are read into PqGSSRecvBuffer.
  *
@@ -415,9 +415,19 @@ read_or_wait(Port *port, ssize_t len)
 		 */
 		if (ret <= 0)
 		{
-			WaitLatchOrSocket(MyLatch,
-			  WL_SOCKET_READABLE | WL_EXIT_ON_PM_DEATH,
-			  port->sock, 0, WAIT_EVENT_GSS_OPEN_SERVER);
+			WaitEvent	event;
+
+			/* FeBeWaitSet has port->sock in position 0 */
+			Assert(port == MyProcPort);
+			ModifyWaitEvent(FeBeWaitSet, 0, WL_SOCKET_READABLE, NULL);
+			WaitEventSetWait(FeBeWaitSet, -1, , 1,
+			 WAIT_EVENT_GSS_OPEN_SERVER);
+
+			/* see comment in secure_read() */
+			if (event.events & WL_POSTMASTER_DEATH)
+ereport(FATAL,
+		(errcode(ERRCODE_ADMIN_SHUTDOWN),
+		 errmsg("terminating connection due to unexpected postmaster exit")));
 
 			/*
 			 * If we got back zero bytes, and then waited on the socket to be
@@ -454,7 +464,7 @@ read_or_wait(Port *port, ssize_t len)
  *
  * Note that unlike the be_gssapi_read/be_gssapi_write functions, this
  * function WILL block on the socket to be ready for read/write (using
- * WaitLatchOrSocket) as appropriate while establishing the GSSAPI
+ * FeBeWaitSet) as appropriate while establishing the GSSAPI
  * session.
  */
 ssize_t
@@ -595,9 +605,20 @@ secure_open_gssapi(Port *port)
 /* Wait and retry if we couldn't write yet */
 if (ret <= 0)
 {
-	WaitLatchOrSocket(MyLatch,
-	  WL_SOCKET_WRITEABLE | WL_EXIT_ON_PM_DEATH,
-	  port->sock, 0, WAIT_EVENT_GSS_OPEN_SERVER);
+	WaitEvent	event;
+
+	/* FeBeWaitSet has port->sock in position 0 */
+	Assert(port == MyProcPort);
+	ModifyWaitEvent(FeBeWaitSet, 0, WL_SOCKET_WRITEABLE, NULL);
+	WaitEventSetWait(FeBeWaitSet, -1, , 1,
+	 WAIT_EVENT_GSS_OPEN_SERVER);
+
+	/* see comment in secure_read() */
+	if (event.events & WL_POSTMASTER_DEATH)
+		ereport(FATAL,
+(errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("terminating connection due to unexpected postmaster exit")));
+
 	continue;
 }
 
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 18d3fff068..fe5c904fbf 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -347,6 +347,7 @@ be_tls_open_server(Port *port)
 	int			err;
 	int			waitfor;
 	unsigned long ecode;
+	WaitEvent	event;
 
 	Assert(!port->ssl);
 	Assert(!port->peer);
@@ -415,12 +416,22 @@ aloop:
  * StartupPacketTimeoutHandler() which directly exits.
  */
 if (err == SSL_ERROR_WANT_READ)
-	waitfor = WL_SOCKET_READABLE | WL_EXIT_ON_PM_DEATH;
+	waitfor = WL_SOCKET_READABLE;
 else
-	waitfor = WL_SOCKET_WRITEABLE | WL_EXIT_ON_PM_DEATH;
+	waitfor = WL_SOCKET_WRITEABLE;
+
+/* FeBeWaitSet has port->sock in position 0 */
+Assert(port == MyProcPort);
+ModifyWaitEvent(FeBeWaitSet, 0, waitfor, NULL);
+(void) WaitEventSetWait(FeBeWaitSet, -1, , 1,
+		WAIT_EVENT_SSL_OPEN_SERVER);
+
+/* see comment in secure_read() */
+if (event.events & WL_POSTMASTER_DEATH)
+	ereport(FATAL,
+			(errcode(ERRCODE_ADMIN_SHUTDOWN),
+			 errmsg("terminating connection due to unexpected postmaster exit")));
 
-(void) WaitLatchOrSocket(MyLatch, waitfor, port->sock, 0,
-		 WAIT_EVENT_SSL_OPEN_SERVER);
 goto aloop;
 			case SSL_ERROR_SYSCALL:
 if (r < 0)
-- 
2.20.1



Re: Yet another vectorized engine

2020-02-23 Thread Hubert Zhang
Hi

On Sat, Feb 22, 2020 at 12:58 AM Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

>
>
> On 12.02.2020 13:12, Hubert Zhang wrote:
>
> On Tue, Feb 11, 2020 at 1:20 AM Konstantin Knizhnik <
> k.knizh...@postgrespro.ru> wrote:
>
>>
>> So it looks like PG 13 provides significant advantages in OLAP queries
>> compared with 9.6! That definitely doesn't mean that a vectorized
>> executor is not needed for new versions of Postgres. Once ported, I
>> expect it to provide a comparable performance improvement.
>>
>> But in any case I think that a vectorized executor makes sense only when
>> combined with a columnar store.
>>
>
> Thanks for the test. +1 on vectorization being combined with columnar
> store. I think when we support this extension on master, we could try the
> new zedstore.
> I'm not active on this work right now, but will continue when I have time.
> Feel free to join and bring vops's features into this extension.
>
> Thanks
>
> Hubert Zhang
>
>
> I have ported vectorize_engine to the master.
> It takes longer than I expected: a lot of things were changed in executor.
>
> Results are the following:
>
>
>  par.workers | PG9.6         | PG9.6        | master        | master        | master       | master
>              | vectorize=off | vectorize=on | vectorize=off | vectorize=off | vectorize=on | vectorize=on
>              |               |              | jit=on        | jit=off       | jit=on       | jit=off
> -------------+---------------+--------------+---------------+---------------+--------------+--------------
>            0 |            36 |           20 |            16 |          25.5 |           15 |         17.5
>            4 |            10 |            - |             5 |             7 |            - |            -
>
> So it proves the theory that JIT provides almost the same speedup as a
> vectorized executor (both eliminate interpretation overhead, but in
> different ways).
> I am still not sure that we need a vectorized executor: with standard heap
> it provides almost no improvement compared with the current JIT version.
> But in any case I am going to test it with vertical storage (zedstore or
> cstore).
>
>
>
Thanks for the porting and testing.
Yes, PG master and 9.6 differ in many ways, not only in the executor but
also in the TupleTableSlot interface.

What matters for the performance of JIT and vectorization is the
implementation. This is just the beginning of the vectorization work; as
your vops extension showed, vectorization can run 10 times faster in PG.
With the overhead of row storage (heap) we may not reach that speedup, but
I think we can do better. Also +1 on vertical storage.

BTW, you're welcome to submit a PR for the PG master version.


-- 
Thanks

Hubert Zhang


Re: Error on failed COMMIT

2020-02-23 Thread Dave Cramer
On Sun, 23 Feb 2020 at 20:31, Robert Haas  wrote:

> On Sun, Feb 23, 2020 at 11:11 AM Shay Rojansky  wrote:
> > I'd like to second Dave on this, from the .NET perspective - actual
> client access is done via standard drivers in almost all cases, and these
> drivers generally adhere to database API abstractions (JDBC for Java,
> ADO.NET for .NET, and so on). AFAIK, in almost all such abstractions,
> commit can either complete (implying success) or throw an exception - there
> is no third way to return a status code. It's true that a driver may expose
> NOTICE/WARNING messages via some other channel (Npgsql emits .NET events
> for these), but this is a separate message "channel" that is disconnected
> API-wise from the commit; this makes the mechanism very "undiscoverable".
>
> I'm still befuddled here. First, to repeat what I said before, the
> COMMIT only returns a ROLLBACK command tag if there's been a previous
> ERROR. So, if you haven't ignored the prior ERROR, you should be fine.
> Second, there's nothing to keep the driver itself from translating
> ROLLBACK into an exception, if that's more convenient for some
> particular driver. Let's go back to Bernhard's example upthread:
>
> composeTransaction() {
>     Connection con = getConnection(); // implicitly "begin"
>     try {
>         insertFrameworkLevelState(con);
>         insertApplicationLevelState(con);
>         con.commit();
>         publishNewState();
>     } catch (Throwable ex) {
>         con.rollback();
>     }
> }
>
> If insertFrameworkLevelState() or insertApplicationLevelState()
> perform database operations that fail, then an exception should be
> thrown and we should end up at con.rollback(), unless there is an
> internal catch block inside those functions that swallows the
> exception, or unless the JDBC driver ignores the error from the
> server.


The driver does not ignore the error, but in Bernhard's case the framework
is processing the exception and not re-throwing it.


> If those things succeed, then COMMIT could still fail with an
> ERROR but it shouldn't return ROLLBACK. But, for extra security,
> con.commit() could be made to throw an exception if the command tag
> returned by COMMIT is not COMMIT. It sounds like Dave doesn't want to
> do that, but it would solve this problem without requiring a server
> behavior change.
>

Well, the driver really isn't in the business of changing the semantics of
the server.

>
> Actually, an even better idea might be to make the driver error out
> when the transaction is known to be in a failed state when you enter
> con.commit(). The server does return an indication after each command
> as to whether the session is in a transaction and whether that
> transaction is in a failed state. That's how the %x escape sequence
> just added to the psql prompt works. So, suppose the JDBC driver
> tracked that state like libpq does. insertFrameworkLevelState() or
> insertApplicationLevelState() throws an exception, which is internally
> swallowed. Then you reach con.commit(), and it says, nope, can't do
> that, we're in a failed state, and so an exception is thrown. Then
> when we reach con.rollback() we're still inside a transaction, it gets
> rolled back, and everything works just as expected.
>

Yes, we could do that.

>
> Or, alternatively, the JDBC driver could keep track of the fact that
> it had thrown an exception ITSELF, without paying any attention to
> what the server told it, and if it saw con.commit() after raising an
> exception, it could raise another exception (or re-raise the same
> one). That would also fix it.
>

We could also do this.

>
> > Asking drivers to do this at the client has the exact same breakage
> impact as the server change, since the user-visible behavior changes in the
> same way (the change is just shifted from server to driver). What's worse
> is that every driver now has to reimplement the same new logic, and we'd
> most probably end up with some drivers doing it in some languages, and
> others not doing it in others (so behavioral differences).
>
> Well, it seems quite possible that there are drivers and applications
> that don't have this issue; I've never had a problem with this
> behavior, and I've been using PostgreSQL for something like two
> decades,


I would argue that you really don't know whether you had the problem,
since an error is not thrown. The client merrily goes along its way after
issuing a commit and receiving a rollback, or possibly a warning saying
that it's not in a transaction. One would have to know that the server had
this behaviour to check for it.
Clearly not everyone knows this, as it's not documented as a violation of
the SQL spec.


> and I believe that the sketch above could be used to get the
> desired behavior in current releases of PostgreSQL with no server code
> change. If we did change the server behavior, it seems unlikely that
> every driver would adjust their behavior to the new server behavior
> all at once and that they 

Re: Error on failed COMMIT

2020-02-23 Thread Robert Haas
On Sun, Feb 23, 2020 at 11:11 AM Shay Rojansky  wrote:
> I'd like to second Dave on this, from the .NET perspective - actual client 
> access is done via standard drivers in almost all cases, and these drivers 
> generally adhere to database API abstractions (JDBC for Java, ADO.NET for 
> .NET, and so on). AFAIK, in almost all such abstractions, commit can either 
> complete (implying success) or throw an exception - there is no third way to 
> return a status code. It's true that a driver may expose NOTICE/WARNING 
> messages via some other channel (Npgsql emits .NET events for these), but 
> this is a separate message "channel" that is disconnected API-wise from the 
> commit; this makes the mechanism very "undiscoverable".

I'm still befuddled here. First, to repeat what I said before, the
COMMIT only returns a ROLLBACK command tag if there's been a previous
ERROR. So, if you haven't ignored the prior ERROR, you should be fine.
Second, there's nothing to keep the driver itself from translating
ROLLBACK into an exception, if that's more convenient for some
particular driver. Let's go back to Bernhard's example upthread:

composeTransaction() {
    Connection con = getConnection(); // implicitly "begin"
    try {
        insertFrameworkLevelState(con);
        insertApplicationLevelState(con);
        con.commit();
        publishNewState();
    } catch (Throwable ex) {
        con.rollback();
    }
}

If insertFrameworkLevelState() or insertApplicationLevelState()
perform database operations that fail, then an exception should be
thrown and we should end up at con.rollback(), unless there is an
internal catch block inside those functions that swallows the
exception, or unless the JDBC driver ignores the error from the
server. If those things succeed, then COMMIT could still fail with an
ERROR but it shouldn't return ROLLBACK. But, for extra security,
con.commit() could be made to throw an exception if the command tag
returned by COMMIT is not COMMIT. It sounds like Dave doesn't want to
do that, but it would solve this problem without requiring a server
behavior change.

Actually, an even better idea might be to make the driver error out
when the transaction is known to be in a failed state when you enter
con.commit(). The server does return an indication after each command
as to whether the session is in a transaction and whether that
transaction is in a failed state. That's how the %x escape sequence
just added to the psql prompt works. So, suppose the JDBC driver
tracked that state like libpq does. insertFrameworkLevelState() or
insertApplicationLevelState() throws an exception, which is internally
swallowed. Then you reach con.commit(), and it says, nope, can't do
that, we're in a failed state, and so an exception is thrown. Then
when we reach con.rollback() we're still inside a transaction, it gets
rolled back, and everything works just as expected.
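
(For the curious: libpq already exposes exactly this state, so the guard
can be sketched in a few lines of C.  PQtransactionStatus() and
PQcmdStatus() are real libpq calls; the helper name commit_or_fail is
hypothetical.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libpq-fe.h>

/*
 * Refuse to COMMIT a transaction the server has already marked failed,
 * and treat a COMMIT whose command tag comes back as ROLLBACK as an
 * error.  Illustrative sketch only.
 */
static void
commit_or_fail(PGconn *conn)
{
	PGresult   *res;

	/* The server reports the transaction state after every command. */
	if (PQtransactionStatus(conn) == PQTRANS_INERROR)
	{
		fprintf(stderr, "refusing COMMIT: transaction already failed\n");
		exit(1);
	}

	res = PQexec(conn, "COMMIT");

	/* COMMIT of a failed transaction reports the tag ROLLBACK instead. */
	if (PQresultStatus(res) != PGRES_COMMAND_OK ||
		strcmp(PQcmdStatus(res), "COMMIT") != 0)
	{
		fprintf(stderr, "COMMIT did not commit: %s\n", PQcmdStatus(res));
		PQclear(res);
		exit(1);
	}
	PQclear(res);
}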

Or, alternatively, the JDBC driver could keep track of the fact that
it had thrown an exception ITSELF, without paying any attention to
what the server told it, and if it saw con.commit() after raising an
exception, it could raise another exception (or re-raise the same
one). That would also fix it.

> Asking drivers to do this at the client has the exact same breakage impact 
> as the server change, since the user-visible behavior changes in the same way 
> (the change is just shifted from server to driver). What's worse is that 
> every driver now has to reimplement the same new logic, and we'd most 
> probably end up with some drivers doing it in some languages, and others not 
> doing it in others (so behavioral differences).

Well, it seems quite possible that there are drivers and applications
that don't have this issue; I've never had a problem with this
behavior, and I've been using PostgreSQL for something like two
decades, and I believe that the sketch above could be used to get the
desired behavior in current releases of PostgreSQL with no server code
change. If we did change the server behavior, it seems unlikely that
every driver would adjust their behavior to the new server behavior
all at once and that they would all get it right while also all
preserving backward compatibility with current releases in case a
newer driver is used with an older server. I don't think that's
likely. What would probably happen is that many drivers would ignore
the change, leaving applications to cope with the differences between
server versions, and some would change the driver behavior
categorically, breaking compatibility with older server versions, and
some would make mistakes in implementing support for the new behavior.
And maybe we would also find that the new behavior isn't ideal for
everybody any more than the current behavior is ideal for everybody.

I am really struggling to see why this is anything but a bug in the
JDBC driver. The problem is that the application doesn't know that the
transaction has failed, but the server has returned not one, not two,

Re: Parallel copy

2020-02-23 Thread Andres Freund
Hi,

On 2020-02-19 11:38:45 +0100, Tomas Vondra wrote:
> I generally agree with the impression that parsing CSV is tricky and
> unlikely to benefit from parallelism in general. There may be cases with
> restrictions making it easier (e.g. restrictions on the format) but that
> might be a bit too complex to start with.
> 
> For example, I had an idea to parallelise the planning by splitting it
> into two phases:

FWIW, I think we ought to rewrite our COPY parsers before we go for
complex schemes. They're way slower than a decent green-field
CSV/... parser.


> The one piece of information I'm missing here is at least a very rough
> quantification of the individual steps of CSV processing - for example
> if parsing takes only 10% of the time, it's pretty pointless to start by
> parallelising this part and we should focus on the rest. If it's 50% it
> might be a different story. Has anyone done any measurements?

Not recently, but I'm pretty sure that I've observed CSV parsing to be
way more than 10%.

Greetings,

Andres Freund




Re: Parallel copy

2020-02-23 Thread Robert Haas
On Tue, Feb 18, 2020 at 6:51 PM Amit Kapila  wrote:
> I am talking about access to shared memory instead of the process
> local memory.  I understand that an extra copy won't be required.

You make it sound like there is some performance penalty for accessing
shared memory, but I don't think that's true. It's true that
*contended* access to shared memory can be slower, because if multiple
processes are trying to access the same memory, and especially if
multiple processes are trying to write the same memory, then the cache
lines have to be shared and that has a cost. However, I don't think
that would create any noticeable effect in this case. First, there's
presumably only one writer process. Second, you wouldn't normally have
multiple readers working on the same part of the data at the same
time.
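
(As an aside, the usual way to keep per-worker state uncontended in shared
memory is to give each worker's slot its own cache line.  A minimal C11
sketch with hypothetical names; 64 bytes is a typical x86 line size:)

#include <stdalign.h>
#include <stdatomic.h>

#define CACHE_LINE_SIZE 64

/*
 * One counter per worker.  alignas() rounds each slot up to a full cache
 * line, so workers never share a line ("false sharing") and uncontended
 * updates cost about the same as process-local memory.
 */
typedef struct WorkerSlot
{
	alignas(CACHE_LINE_SIZE) atomic_uint_fast64_t rows_processed;
} WorkerSlot;

static void
count_row(WorkerSlot *slots, int worker_id)
{
	/* Each worker touches only its own line: no cache-line ping-pong. */
	atomic_fetch_add_explicit(&slots[worker_id].rows_processed, 1,
							  memory_order_relaxed);
}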

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Fixing parallel make of libpq

2020-02-23 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> OK, got that.  But that doesn't directly answer the question of whether
> it's wrong to use a phony target as an order-only prerequisite of
> another phony target.  Grepping around for other possible issues,
> I see that you recently added
>
> update-unicode: | submake-generated-headers submake-libpgport
>   $(MAKE) -C src/common/unicode $@
>   $(MAKE) -C contrib/unaccent $@
>
> Doesn't that also have parallel-make hazards, if libpq does?

The part of 'update-unicode' that needs the generated headers and
libpgport is src/common/unicode/norm_test, which is depended on by
the normalization-check target in the same directory.  Running 'make -C
src/common/unicode normalization-check' in a freshly-configured tree
does indeed fail, independent of parallelism or the update-unicode
target.

Adding the deps to the norm_test target fixes 'make -C
src/common/unicode normalization-check', but 'make -C src/common/unicode
update-unicode' still fails, because submake-generated-headers only does
its thing in the top-level make invocation, so that needs an explicit
dep as well.

Please find a patch attached.  However, I don't think it's fair to block
fixing the actual libpq parallel-make bug that has bitten me several
times on auditing the entire source tree for vaguely related issues that
nobody has complained about yet.

- ilmari
-- 
"A disappointingly low fraction of the human race is,
 at any given time, on fire." - Stig Sandbeck Mathisen

>From df05722a6b287d909698becb9718acce78a88020 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 24 Feb 2020 00:03:54 +
Subject: [PATCH] Add missing dependencies in src/common/unicode/Makefile

The norm_test program needs the generated headers and libpgport.

Because the submake-generated-headers target only does its thing in a
top-level make, add it to the update-unicode target too, so that doing
'make update-unicode' in this directory works, not just in the top
directory.
---
 src/common/unicode/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/common/unicode/Makefile b/src/common/unicode/Makefile
index ec78aeec2a..8e92eabdea 100644
--- a/src/common/unicode/Makefile
+++ b/src/common/unicode/Makefile
@@ -18,7 +18,7 @@ LIBS += $(PTHREAD_LIBS)
 # By default, do nothing.
 all:
 
-update-unicode: unicode_norm_table.h unicode_combining_table.h
+update-unicode: unicode_norm_table.h unicode_combining_table.h | submake-generated-headers
 	$(MAKE) normalization-check
 	mv unicode_norm_table.h unicode_combining_table.h ../../../src/include/common/
 
@@ -40,7 +40,7 @@ unicode_combining_table.h: generate-unicode_combining_table.pl UnicodeData.txt
 normalization-check: norm_test
 	./norm_test
 
-norm_test: norm_test.o ../unicode_norm.o
+norm_test: norm_test.o ../unicode_norm.o | submake-generated-headers submake-libpgport
 
 norm_test.o: norm_test_table.h
 
-- 
2.22.0



Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

2020-02-23 Thread Tom Lane
I wrote:
> Here's a draft patch that does it like that.

On reflection, trying to make ReserveExternalFD serve two different
use-cases was pretty messy.  Here's a version that splits it into two
functions.  I also took the trouble to fix dblink.

regards, tom lane

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index c1155e3..a5f69fc 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -200,12 +200,27 @@ dblink_get_conn(char *conname_or_str,
 		if (connstr == NULL)
 			connstr = conname_or_str;
 		dblink_connstr_check(connstr);
+
+		/*
+		 * We must obey fd.c's limit on non-virtual file descriptors.  Assume
+		 * that a PGconn represents one long-lived FD.  (Doing this here also
+		 * ensures that VFDs are closed if needed to make room.)
+		 */
+		if (!AcquireExternalFD())
+			ereport(ERROR,
+	(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
+	 errmsg("could not establish connection"),
+	 errdetail("There are too many open files.")));
+
+		/* OK to make connection */
 		conn = PQconnectdb(connstr);
+
 		if (PQstatus(conn) == CONNECTION_BAD)
 		{
 			char	   *msg = pchomp(PQerrorMessage(conn));
 
 			PQfinish(conn);
+			ReleaseExternalFD();
 			ereport(ERROR,
 	(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
 	 errmsg("could not establish connection"),
@@ -282,12 +297,26 @@ dblink_connect(PG_FUNCTION_ARGS)
 
 	/* check password in connection string if not superuser */
 	dblink_connstr_check(connstr);
+
+	/*
+	 * We must obey fd.c's limit on non-virtual file descriptors.  Assume that
+	 * a PGconn represents one long-lived FD.  (Doing this here also ensures
+	 * that VFDs are closed if needed to make room.)
+	 */
+	if (!AcquireExternalFD())
+		ereport(ERROR,
+(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
+ errmsg("could not establish connection"),
+ errdetail("There are too many open files.")));
+
+	/* OK to make connection */
 	conn = PQconnectdb(connstr);
 
 	if (PQstatus(conn) == CONNECTION_BAD)
 	{
 		msg = pchomp(PQerrorMessage(conn));
 		PQfinish(conn);
+		ReleaseExternalFD();
 		if (rconn)
 			pfree(rconn);
 
@@ -312,7 +341,10 @@ dblink_connect(PG_FUNCTION_ARGS)
 	else
 	{
 		if (pconn->conn)
+		{
 			PQfinish(pconn->conn);
+			ReleaseExternalFD();
+		}
 		pconn->conn = conn;
 	}
 
@@ -346,6 +378,7 @@ dblink_disconnect(PG_FUNCTION_ARGS)
 		dblink_conn_not_avail(conname);
 
 	PQfinish(conn);
+	ReleaseExternalFD();
 	if (rconn)
 	{
 		deleteConnection(conname);
@@ -780,7 +813,10 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	{
 		/* if needed, close the connection to the database */
 		if (freeconn)
+		{
 			PQfinish(conn);
+			ReleaseExternalFD();
+		}
 	}
 	PG_END_TRY();
 
@@ -1458,7 +1494,10 @@ dblink_exec(PG_FUNCTION_ARGS)
 	{
 		/* if needed, close the connection to the database */
 		if (freeconn)
+		{
 			PQfinish(conn);
+			ReleaseExternalFD();
+		}
 	}
 	PG_END_TRY();
 
@@ -2563,6 +2602,7 @@ createNewConnection(const char *name, remoteConn *rconn)
 	if (found)
 	{
 		PQfinish(rconn->conn);
+		ReleaseExternalFD();
 		pfree(rconn);
 
 		ereport(ERROR,
@@ -2604,6 +2644,7 @@ dblink_security_check(PGconn *conn, remoteConn *rconn)
 		if (!PQconnectionUsedPassword(conn))
 		{
 			PQfinish(conn);
+			ReleaseExternalFD();
 			if (rconn)
 pfree(rconn);
 
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 29c811a..74576d7 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -20,6 +20,7 @@
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postgres_fdw.h"
+#include "storage/fd.h"
 #include "storage/latch.h"
 #include "utils/hsearch.h"
 #include "utils/inval.h"
@@ -259,10 +260,27 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
 
 		keywords[n] = values[n] = NULL;
 
-		/* verify connection parameters and make connection */
+		/* verify the set of connection parameters */
 		check_conn_params(keywords, values, user);
 
+		/*
+		 * We must obey fd.c's limit on non-virtual file descriptors.  Assume
+		 * that a PGconn represents one long-lived FD.  (Doing this here also
+		 * ensures that VFDs are closed if needed to make room.)
+		 */
+		if (!AcquireExternalFD())
+			ereport(ERROR,
+	(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
+	 errmsg("could not connect to server \"%s\"",
+			server->servername),
+	 errdetail("There are too many open files.")));
+
+		/* OK to make connection */
 		conn = PQconnectdbParams(keywords, values, false);
+
+		if (!conn)
+			ReleaseExternalFD();	/* because the PG_CATCH block won't */
+
 		if (!conn || PQstatus(conn) != CONNECTION_OK)
 			ereport(ERROR,
 	(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
@@ -294,7 +312,10 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
 	{
 		/* Release PGconn data structure if we managed to create one */
 		if (conn)
+		{
 			

v12 "won't fix" item regarding memory leak in "ATTACH PARTITION without AEL"; (or, relcache ref counting)

2020-02-23 Thread Justin Pryzby
This links to a long thread, from which I've tried to quote some of the
most important mails, below.
https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items#Won.27t_Fix

I wondered if there's an effort to pursue a resolution for v13?

On Fri, Apr 12, 2019 at 11:42:24AM -0400, Tom Lane wrote in 
<31027.1555083...@sss.pgh.pa.us>:
> Michael Paquier  writes:
> > On Wed, Apr 10, 2019 at 05:03:21PM +0900, Amit Langote wrote:
> >> The problem lies in all branches that have partitioning, so it should be
> >> listed under Older Bugs, right?  You may have noticed that I posted
> >> patches for all branches down to 10.
> 
> > I have noticed.  The message from Tom upthread outlined that an open
> > item should be added, but this is not one.  That's what I wanted to
> > emphasize.  Thanks for making it clearer.
> 
> To clarify a bit: there's more than one problem here.  Amit added an
> open item about pre-existing leaks related to rd_partcheck.  (I'm going
> to review and hopefully push his fix for that today.)  However, there's
> a completely separate leak associated with mismanagement of rd_pdcxt,
> as I showed in [1], and it doesn't seem like we have consensus about
> what to do about that one.  AFAIK that's a new bug in 12 (caused by
> 898e5e329) and so it ought to be in the main open-items list.
> 
> This thread has discussed a bunch of possible future changes like
> trying to replace copying of relcache-provided data structures
> with reference-counting, but I don't think any such thing could be
> v12 material at this point.  We do need to fix the newly added
> leak though.
> 
>   regards, tom lane
> 
> [1] https://www.postgresql.org/message-id/10797.1552679128%40sss.pgh.pa.us
> 
> 

On Fri, Mar 15, 2019 at 05:41:47PM -0400, Robert Haas wrote in 
:
> On Fri, Mar 15, 2019 at 3:45 PM Tom Lane  wrote:
> > More to the point, we turned *one* rebuild = false situation into
> > a bunch of rebuild = true situations.  I haven't studied it closely,
> > but I think a CCA animal would probably see O(N^2) rebuild = true
> > invocations in a query with N partitions, since each time
> > expand_partitioned_rtentry opens another child table, we'll incur
> > an AcceptInvalidationMessages call which leads to forced rebuilds
> > of the previously-pinned rels.  In non-CCA situations, almost always
> > nothing happens with the previously-examined relcache entries.
> 
> That's rather unfortunate.  I start to think that clobbering the cache
> "always" is too hard a line.
> 
> > I agree that copying data isn't great.  What I don't agree is that this
> > solution is better.  In particular, copying data out of the relcache
> > rather than expecting the relcache to hold still over long periods
> > is the way we've done things everywhere else, cf RelationGetIndexList,
> > RelationGetStatExtList, RelationGetIndexExpressions,
> > RelationGetIndexPredicate, RelationGetIndexAttrBitmap,
> > RelationGetExclusionInfo, GetRelationPublicationActions.  I don't care
> > for a patch randomly deciding to do things differently on the basis of an
> > unsupported-by-evidence argument that it might cost too much to copy the
> > data.  If we're going to make a push to reduce the amount of copying of
> > that sort that we do, it should be a separately (and carefully) designed
> > thing that applies to all the relcache substructures that have the issue,
> > not one-off hacks that haven't been reviewed thoroughly.
> 
> That's not an unreasonable argument.  On the other hand, if you never
> try new stuff, you lose opportunities to explore things that might
> turn out to be better and worth adopting more widely.
> 
> I am not very convinced that it makes sense to lump something like
> RelationGetIndexAttrBitmap in with something like
> RelationGetPartitionDesc.  RelationGetIndexAttrBitmap is returning a
> single Bitmapset, whereas the data structure RelationGetPartitionDesc
> is vastly larger and more complex.  You can't say "well, if it's best
> to copy 32 bytes of data out of the relcache every time we need it, it
> must also be right to copy 10k or 100k of data out of the relcache
> every time we need it."
> 
> There is another difference as well: there's a good chance that
> somebody is going to want to mutate a Bitmapset, whereas they had
> BETTER NOT think that they can mutate the PartitionDesc.  So returning
> an uncopied Bitmapset is kinda risky in a way that returning an
> uncopied PartitionDesc is not.
> 
> If we want an at-least-somewhat unified solution here, I think we need
> to bite the bullet and make the planner hold a reference to the
> relcache throughout planning.  (The executor already keeps it open, I
> believe.) Otherwise, how is the relcache supposed to know when it can
> throw stuff away and when it can't?  The only alternative seems to be
> to have each subsystem hold its own reference count, as I did with the
> PartitionDirectory stuff, which is not something we'd want to scale
> out.
> 
> > I especially don't care 

Re: Ought binary literals to allow spaces?

2020-02-23 Thread Tom Lane
Chapman Flack  writes:
> ISO seems to allow spaces among the digits of a binary literal, so as
> to group them for readability, as in X'00ba b10c'. We seem not to.

Hmm.  SQL99 did not allow noise spaces in  and
related productions, but it does look like they added that in SQL:2008.

> (The B'...' form appears to be a PostgreSQL extension, but I imagine
> if ISO had it, it would allow spaces too.)

The B'...' form was there in SQL99.  The committee took it out in
SQL:2003 or so, along with the BIT type, but we still have both.

regards, tom lane




Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

2020-02-23 Thread Tom Lane
I wrote:
>> Clearly, we gotta do something about that too.  Maybe bumping up
>> NUM_RESERVED_FDS would be a good idea, but I feel like more-honest
>> accounting for permanently-eaten FDs would be a better idea.  And
>> in any case we can't suppose that there's a fixed upper limit on
>> the number of postgres_fdw connections.  I'm liking the idea I floated
>> earlier of letting postgres_fdw forcibly close the oldest LRU entry.

> A late-night glimmer of an idea: suppose we make fd.c keep count,
> not just of the open FDs under its control, but also of open FDs
> not under its control.

Here's a draft patch that does it like that.  There are undoubtedly
more places that need to be taught to report their FD consumption;
one obvious candidate that I didn't touch is dblink.  But this is
enough to account for all the long-lived "extra" FDs that are currently
open in a default build, and it passes check-world even with ulimit -n
set to the minimum that set_max_safe_fds will allow.

One thing I noticed is that if you open enough postgres_fdw connections
to cause a failure, that tends to come out as an unintelligible-to-
the-layman "epoll_create1 failed: Too many open files" error.  That's
because after postgres_fdw has consumed the last available "external
FD" slot, it tries to use CreateWaitEventSet to wait for input from
the remote server, and now that needs to get another external FD.
I left that alone for the moment, because if you do rejigger the
WaitEventSet code to avoid dynamically creating/destroying epoll sockets,
it will stop being a problem.  If that doesn't happen, another
possibility is to reclassify CreateWaitEventSet as a caller that should
ignore "failure" from ReserveExternalFD --- but that would open us up
to problems if a lot of WaitEventSets get created, so it's not a great
answer.  It'd be okay perhaps if we added a distinction between
long-lived and short-lived WaitEventSets (ignoring "failure" only for
the latter).  But I didn't want to go there in this patch.

Thoughts?

regards, tom lane

diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 29c811a..27d509c 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -20,6 +20,7 @@
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postgres_fdw.h"
+#include "storage/fd.h"
 #include "storage/latch.h"
 #include "utils/hsearch.h"
 #include "utils/inval.h"
@@ -259,10 +260,30 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
 
 		keywords[n] = values[n] = NULL;
 
-		/* verify connection parameters and make connection */
+		/* verify the set of connection parameters */
 		check_conn_params(keywords, values, user);
 
+		/*
+		 * We must obey fd.c's limit on non-virtual file descriptors.  Assume
+		 * that a PGconn represents one long-lived FD.  (Doing this here also
+		 * ensures that VFDs are closed if needed to make room.)
+		 */
+		if (!ReserveExternalFD())
+		{
+			ReleaseExternalFD();	/* yup, gotta do this here */
+			ereport(ERROR,
+	(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
+	 errmsg("could not connect to server \"%s\"",
+			server->servername),
+	 errdetail("There are too many open files.")));
+		}
+
+		/* OK to make connection */
 		conn = PQconnectdbParams(keywords, values, false);
+
+		if (!conn)
+			ReleaseExternalFD();	/* because the PG_CATCH block won't */
+
 		if (!conn || PQstatus(conn) != CONNECTION_OK)
 			ereport(ERROR,
 	(errcode(ERRCODE_SQLCLIENT_UNABLE_TO_ESTABLISH_SQLCONNECTION),
@@ -294,7 +315,10 @@ connect_pg_server(ForeignServer *server, UserMapping *user)
 	{
 		/* Release PGconn data structure if we managed to create one */
 		if (conn)
+		{
 			PQfinish(conn);
+			ReleaseExternalFD();
+		}
 		PG_RE_THROW();
 	}
 	PG_END_TRY();
@@ -312,6 +336,7 @@ disconnect_pg_server(ConnCacheEntry *entry)
 	{
 		PQfinish(entry->conn);
 		entry->conn = NULL;
+		ReleaseExternalFD();
 	}
 }
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index fd527f2..b60e45a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -774,6 +774,7 @@ static const char *const xlogSourceNames[] = {"any", "archive", "pg_wal", "strea
  * openLogFile is -1 or a kernel FD for an open log file segment.
  * openLogSegNo identifies the segment.  These variables are only used to
  * write the XLOG, and so will normally refer to the active segment.
+ * Note: call Reserve/ReleaseExternalFD to track consumption of this FD.
  */
 static int	openLogFile = -1;
 static XLogSegNo openLogSegNo = 0;
@@ -785,6 +786,9 @@ static XLogSegNo openLogSegNo = 0;
  * will be just past that page. readLen indicates how much of the current
  * page has been read into readBuf, and readSource indicates where we got
  * the currently open file from.
+ * Note: we could use Reserve/ReleaseExternalFD to track consumption of
+ * this FD too; but it doesn't currently seem 

Re: Yet another fast GiST build

2020-02-23 Thread Thomas Munro
On Thu, Feb 20, 2020 at 10:14 AM Thomas Munro  wrote:
> 1.  We expect floats to be in IEEE format, and the sort order of IEEE
> floats is mostly correlated to the binary sort order of the bits
> reinterpreted as an int.  It isn't in some special cases, but for this
> use case we don't really care about that, we're just trying to
> encourage locality.

I suppose there is a big jump in integer value (whether signed or
unsigned) as you cross from positive to negative floats, and then the
sort order is reversed.  I have no idea if either of those things is a
problem worth fixing.  That made me wonder if there might also be an
endianness problem.  It seems from some quick googling that all
current architectures have integers and floats of the same endianness.
Apparently this wasn't always the case, and some ARMs have a weird
half-flipped arrangement for 64 bit floats, but not 32 bit floats as
you are using here.
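
For reference, the classic fix for both quirks (the jump as you cross from
positive to negative, and the reversed order of negatives) is a
sign-dependent bit flip; a sketch with a hypothetical helper name:

#include <stdint.h>
#include <string.h>

/*
 * Map an IEEE-754 float to a uint32 whose unsigned order matches the
 * float's numeric order: flip all bits of negatives (their raw bit
 * patterns sort in reverse), and flip just the sign bit of positives
 * (so they land above all negatives).  Illustrative only.
 */
static uint32_t
float_sort_key(float f)
{
	uint32_t	u;

	memcpy(&u, &f, sizeof(u));	/* reinterpret the bits portably */
	if (u & 0x80000000U)
		return ~u;				/* negative */
	return u | 0x80000000U;		/* positive or +0 */
}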




Ought binary literals to allow spaces?

2020-02-23 Thread Chapman Flack
Hi,

ISO seems to allow spaces among the digits of a binary literal, so as
to group them for readability, as in X'00ba b10c'. We seem not to.
(The B'...' form appears to be a PostgreSQL extension, but I imagine
if ISO had it, it would allow spaces too.)

Is it worthwhile to allow that? Or to add a compatibility note somewhere
around sql-syntax-bit-strings saying we don't allow it?

For comparison, byteain does allow grouping whitespace.

It seems that byteain allows arbitrary whitespace (tabs, newlines, etc.),
whereas ISO's X'...' allows exactly and only U+0020 space characters.

Whitespace for byteain must occur between digit pairs; spaces in X'...'
per ISO can be anywhere.

Regards,
-Chap




Re: Error on failed COMMIT

2020-02-23 Thread Vladimir Sitnikov
Shay> Asking drivers to do this at the client has the exact same breakage
impact as the server change, since the user-visible behavior changes in the
same way

+1

Dave> While we can certainly code around this in the client drivers, I don't
believe they should be responsible for fixing the failings of the server.

Application developers expect the database to feel the same no matter which
driver is used, so it would be better to avoid a situation where half of
the drivers raise exceptions on a non-committing commit and the other half
silently loses data.

Vladimir


Re: Error on failed COMMIT

2020-02-23 Thread Dave Cramer
On Sun, 23 Feb 2020 at 00:41, Shay Rojansky  wrote:

>
>
> On Fri, 14 Feb 2020 at 14:37, Robert Haas  wrote:
>>
>>> On Fri, Feb 14, 2020 at 2:08 PM Dave Cramer 
>>> wrote:
>>> > Well now you are asking the driver to re-interpret the results in a
>>> different way than the server which is not what we tend to do.
>>> >
>>> > The server throws an error we throw an error. We really aren't in the
>>> business of re-interpreting the servers responses.
>>>
>>> I don't really see a reason why the driver has to throw an exception
>>> if and only if there is an ERROR on the PostgreSQL side. But even if
>>> you want to make that rule for some reason, it doesn't preclude
>>> correct behavior here. All you really need is to have con.commit()
>>> return some indication of what the command tag was, just as, say, psql
>>> would do. If the server provides you with status information and you
>>> throw it out instead of passing it along to the application, that's
>>> not ideal.
>>>
>>
>> Well con.commit() returns void :(
>>
>
> I'd like to second Dave on this, from the .NET perspective - actual client
> access is done via standard drivers in almost all cases, and these drivers
> generally adhere to database API abstractions (JDBC for Java, ADO.NET for
> .NET, and so on). AFAIK, in almost all such abstractions, commit can either
> complete (implying success) or throw an exception - there is no third way
> to return a status code. It's true that a driver may expose NOTICE/WARNING
> messages via some other channel (Npgsql emits .NET events for these), but
> this is a separate message "channel" that is disconnected API-wise from the
> commit; this makes the mechanism very "undiscoverable".
>
> In other words, if we do agree that there are some legitimate cases where
> a program may end up executing commit on a failed transaction (e.g. because
> of a combination of framework and application code), and we think that a
> well-written client should be aware of the failed transaction and behave in
> an exceptional way around a non-committing commit, then I think that's a
> good case for a server-side change:
>
> - Asking drivers to do this at the client has the exact same breakage
>   impact as the server change, since the user-visible behavior changes in
>   the same way (the change is just shifted from server to driver). What's
>   worse is that every driver now has to reimplement the same new logic,
>   and we'd most probably end up with some drivers doing it in some
>   languages, and others not doing it in others (so behavioral differences).
> - Asking end-users (i.e. application code) to do this seems even worse, as
>   every user/application in the world now has to be made somehow aware of
>   a somewhat obscure and very un-discoverable situation.
>
> Shay
>

To be fair, this is Bernhard's position, which, after thinking about this
some more, I am endorsing.

So we now have two of the largest client bases for PostgreSQL with known
issues effectively losing data because they don't notice that the commit
failed.
It is very likely that this occurs with all clients, but they just don't
notice it. What is particularly alarming about this problem is that we are
silently ignoring an error.

While we can certainly code around this in the client drivers, I don't
believe they should be responsible for fixing the failings of the server.

I fail to see why doing the right thing and reporting an error where there
is one should be trumped by not breaking existing apps which, by all
accounts, may be broken but just don't know it.

Dave