This is an automated email from the ASF dual-hosted git repository.

yjhjstz pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/cloudberry.git

commit e4a8e1ecb7d80ae10cbda758a29cb65d1725071e
Author: Soumyadeep Chakraborty <[email protected]>
AuthorDate: Wed May 1 18:03:15 2024 -0700

    Report dtx protocol command dispatch errors reliably
    
    This commit addresses the edge case that if dispatch of a dtx protocol
    command fails (such as 'Release Current Subtransaction'), there is a
    chance that the qeError can be lost (due to FlushErrorState() calls in
    the dispatcher since 143bb7c6fcf and the fact that the caller may not
    CopyErrorData() or it can experience another ERROR, thereby losing out
    this intermediate ERROR).
    
    For instance, earlier in test udf_exception_blocks.sql:
    
    SET debug_dtm_action_target=protocol;
    SET debug_dtm_action_protocol=subtransaction_release;
    SET debug_dtm_action=fail_begin_command;
    SET debug_dtm_action_nestinglevel=0;
    SET debug_dtm_action_segment=0;
    SET debug_print_full_dtm=on;
    
    SELECT test_protocol_allseg(1, 2,'f');
    
    would error out with:
    NOTICE:  catching the exception ...2
    ERROR:  transaction 73730 at level 1 already processed (current level 1) 
(cdbtm.c:2418)  (seg1 127.0.1.1:6003 pid=1181877) (cdbtm.c:2418)
    CONTEXT:  PL/pgSQL function test_protocol_allseg(integer,integer,character) 
line 9 during exception cleanup
    
    Now we would also now print at LOG level:
    error on dispatch of dtx protocol command 'Release Current Subtransaction' 
for gid '73730'","QE reported error: Raise ERROR for debug_dtm_action = 2,
    debug_dtm_action_protocol = Release Current Subtransaction  (seg0 
127.0.1.1:7002 pid=1238517)
    
    which would have been lost earlier.
    
    Also, we improve the log message slightly. Also, we change the logging
    so that raiseError=false also comes under the purview of
    debug_print_full_dtm. Callers with raiseError=false won't suffer from
    that loss really, and debug_print_full_dtm seems the correct scope here.
    
    Reviewed-by: Ashwin Agrawal <[email protected]>
---
 src/backend/cdb/cdbtm.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/backend/cdb/cdbtm.c b/src/backend/cdb/cdbtm.c
index f3cbf73f7b..d11814dd39 100644
--- a/src/backend/cdb/cdbtm.c
+++ b/src/backend/cdb/cdbtm.c
@@ -1283,18 +1283,23 @@ doDispatchDtxProtocolCommand(DtxProtocolCommand 
dtxProtocolCommand,
 
        if (qeError)
        {
-               if (!raiseError)
-               {
-                       ereport(LOG,
-                                       (errmsg("DTM error (gathered results 
from cmd '%s')", dtxProtocolCommandStr),
-                                        errdetail("QE reported error: %s", 
qeError->message)));
-               }
-               else
+               /*
+                * Report the ERROR under Debug_print_full_dtm, as it can be 
lost as we
+                * flush below and the caller may forget to CopyErrorData(). 
Also, in
+                * some cases caller may not be able to act on the copy (e.g. 
due to
+                * another error).
+                */
+               ereportif(Debug_print_full_dtm, LOG,
+                                 (errmsg("error on dispatch of dtx protocol 
command '%s' for gid '%s'",
+                                                 dtxProtocolCommandStr, gid),
+                                  errdetail("QE reported error: %s", 
qeError->message)));
+
+               if (raiseError)
                {
+                       /* flush then rethrow, to avoid overflowing the error 
stack */
                        FlushErrorState();
                        ThrowErrorData(qeError);
                }
-               return false;
        }
 
        if (results == NULL)


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to