On 2020/09/05 3:31, Alexey Kondratov wrote:
Hi,

On 2020-07-27 09:44, Andrey V. Lepikhov wrote:
On 7/27/20 11:22 AM, tsunakawa.ta...@fujitsu.com wrote:

US8356007B2 - Distributed transaction management for database systems with 
multiversioning - Google Patents
https://patents.google.com/patent/US8356007


If it is, can we circumvent this patent?


Thank you for the research (and previous links too).
I haven't seen this patent before. This should be carefully studied.

I had a look on the patch set, although it is quite outdated, especially on 
0003.

Two thoughts about 0003:

First, IIUC atomicity of the distributed transaction in the postgres_fdw is 
achieved by the usage of 2PC. I think that this postgres_fdw 2PC support should 
be separated from global snapshots.

Agreed.


It could be useful to have such atomic distributed transactions even without a 
proper visibility, which is guaranteed by the global snapshot. Especially 
taking into account the doubts about Clock-SI and general questions about 
algorithm choosing criteria above in the thread.

Thus, I propose to split 0003 into two parts and add a separate GUC 
'postgres_fdw.use_twophase', which could be turned on independently from 
'postgres_fdw.use_global_snapshots'. Of course if the latter is enabled, then 
2PC should be forcedly turned on as well.

Second, there are some problems with errors handling in the 0003 (thanks to 
Arseny Sher for review).

+error:
+            if (!res)
+            {
+                sql = psprintf("ABORT PREPARED '%s'", fdwTransState->gid);
+                BroadcastCmd(sql);
+                elog(ERROR, "Failed to PREPARE transaction on remote node");
+            }

It seems that we should never reach this point, just because BroadcastStmt will 
throw an ERROR if it fails to prepare transaction on the foreign server:

+            if (PQresultStatus(result) != expectedStatus ||
+                (handler && !handler(result, arg)))
+            {
+                elog(WARNING, "Failed command %s: status=%d, expected 
status=%d", sql, PQresultStatus(result), expectedStatus);
+                pgfdw_report_error(ERROR, result, entry->conn, true, sql);
+                allOk = false;
+            }

Moreover, It doesn't make much sense to try to abort prepared xacts, since if 
we failed to prepare it somewhere, then some foreign servers may become 
unavailable already and this doesn't provide us a 100% guarantee of clean up.

+    /* COMMIT open transaction of we were doing 2PC */
+    if (fdwTransState->two_phase_commit &&
+        (event == XACT_EVENT_PARALLEL_COMMIT || event == XACT_EVENT_COMMIT))
+    {
+        BroadcastCmd(psprintf("COMMIT PREPARED '%s'", fdwTransState->gid));
+    }

At this point, the host (local) transaction is already committed and there is 
no way to abort it gracefully. However, BroadcastCmd may rise an ERROR that 
will cause a PANIC, since it is non-recoverable state:

PANIC:  cannot abort transaction 487, it was already committed

Attached is a patch, which implements a plain 2PC in the postgres_fdw and adds 
a GUC 'postgres_fdw.use_twophase'. Also it solves these errors handling issues 
above and tries to add proper comments everywhere. I think, that 0003 should be 
rebased on the top of it, or it could be a first patch in the set, since it may 
be used independently. What do you think?

Thanks for the patch!

Sawada-san was proposing another 2PC patch at [1]. Do you have any thoughts
about pros and cons between your patch and Sawada-san's?

Regards,

[1]
https://www.postgresql.org/message-id/ca+fd4k4z6_b1etevqamwqhu4rx7xsrn5orl7ohj4b5b6sw-...@mail.gmail.com


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Reply via email to