Re: invisible commit question for sync replication

2023-02-01 Thread qihua wu
In the code it will write a warning to postgresql log. Why not also write
the detailed sql? with the exact sql, DBA might do something to fix the
issue.

if (ProcDiePending)
{
   ereport(WARNING,
   (errcode(ERRCODE_ADMIN_SHUTDOWN),
   errmsg("canceling the wait for synchronous replication and terminating
connection due to administrator command"),
   errdetail("The transaction has already committed locally, but might not
have been replicated to the standby.")));
   whereToSendOutput = DestNone;
   SyncRepCancelWait();
   break;
}

On Wed, Feb 1, 2023 at 4:38 PM Laurenz Albe 
wrote:

> On Wed, 2023-02-01 at 14:52 +0800, qihua wu wrote:
> > When run a cluster with sync replication, if DML is done on primary, but
> primary is
> > isolated from all slave, then the DML will hang, if cancel it DML, it
> will say:
> > WARNING:  canceling wait for synchronous replication due to user request
> > DETAIL:  The transaction has already committed locally, but might not
> have been replicated to the standby
> >
> > So the workflow is
> > 1: commit to local.
> > 2: waiting for ACK from remote sync.
> >
> > When cancel the DML at step 2. the data are arealy on local, that's why
> it's warning.
> >
> > But when runs an insert which is waiting for remote ACK, and then query
> from another
> > session, I didn't find that row. Why this happen? If the insert is
> already one locally,
> > whey another session can't read it?
>
> COMMIT is not as atomic as it appears.  When the backend is waiting for
> the standby,
> it has already committed the transaction on disk, but that fact is not
> advertised to
> the other backends yet.
>
> Yours,
> Laurenz Albe
>


Re: invisible commit question for sync replication

2023-02-01 Thread Laurenz Albe
On Wed, 2023-02-01 at 14:52 +0800, qihua wu wrote:
> When run a cluster with sync replication, if DML is done on primary, but 
> primary is
> isolated from all slave, then the DML will hang, if cancel it DML, it will 
> say:
> WARNING:  canceling wait for synchronous replication due to user request
> DETAIL:  The transaction has already committed locally, but might not have 
> been replicated to the standby
> 
> So the workflow is 
> 1: commit to local.
> 2: waiting for ACK from remote sync.
> 
> When cancel the DML at step 2. the data are arealy on local, that's why it's 
> warning.
> 
> But when runs an insert which is waiting for remote ACK, and then query from 
> another
> session, I didn't find that row. Why this happen? If the insert is already 
> one locally,
> whey another session can't read it?

COMMIT is not as atomic as it appears.  When the backend is waiting for the 
standby,
it has already committed the transaction on disk, but that fact is not 
advertised to
the other backends yet.

Yours,
Laurenz Albe




Re: invisible commit question for sync replication

2023-02-01 Thread qihua wu
==》Can the second session see the inserted row before you cancel the insert
that is waiting for sync ack?
The second session can NOT see the inserted row if the first session is
still waiting for sync ACK.

I checked the source code, it makes sense to me now:
The waiting for sync ACK is called in
EndPrepare(gxact);
and after that it will call
ProcArrayClearTransaction(MyProc);

and only after ProcArrayClearTransaction(MyProc) is called, will the row be
visible to others.




On Wed, Feb 1, 2023 at 3:21 PM David G. Johnston 
wrote:

> On Wednesday, February 1, 2023, Julien Rouhaud  wrote:
>
>> Hi,
>>
>> On Wed, Feb 01, 2023 at 02:52:49PM +0800, qihua wu wrote:
>> > When run a cluster with sync replication, if DML is done on primary, but
>> > primary is isolated from all slave, then the DML will hang, if cancel it
>> > DML, it will say:
>> > WARNING:  canceling wait for synchronous replication due to user request
>> > DETAIL:  The transaction has already committed locally, but might not
>> have
>> > been replicated to the standby
>> >
>> > So the workflow is
>> > 1: commit to local.
>> > 2: waiting for ACK from remote sync.
>> >
>> > When cancel the DML at step 2. the data are arealy on local, that's why
>> > it's warning.
>> >
>> > But when runs an insert which is waiting for remote ACK, and then query
>> > from another session, I didn't find that row. Why this happen? If the
>> > insert is already one locally, whey another session can't read it?
>>
>> It works as expected for me, are you sure both sessions are actually
>> connected
>> to the same server and/or querying the same table?
>>
>> [1456]rjuju@127.0.0.1:14295) rjuju=# select * from tt;
>>  id | val
>> +-
>> (0 rows)
>>
>> [1456]rjuju@127.0.0.1:14295) rjuju=# insert into tt select 1;
>> ^CCancel request sent
>> WARNING:  01000: canceling wait for synchronous replication due to user
>> request
>> DETAIL:  The transaction has already committed locally, but might not
>> have been replicated to the standby.
>> LOCATION:  SyncRepWaitForLSN, syncrep.c:287
>> INSERT 0 1
>>
>> [1456]rjuju@127.0.0.1:14295) rjuju=# select pg_backend_pid(), * from tt;
>>  pg_backend_pid | id |  val
>> ++
>>1456 |  1 | 
>> (1 row)
>>
>>
>> and another session:
>>
>> [3327]rjuju@127.0.0.1:14295) rjuju=# select pg_backend_pid(), * from tt;
>>  pg_backend_pid | id |  val
>> ++
>>3327 |  1 | 
>> (1 row)
>>
>>
>>
> This wasn’t the question though.  Can the second session see the inserted
> row before you cancel the insert that is waiting for sync ack?
>
> Supposedly it can (not able to test myself).  Basically, the primary waits
> to make the local transaction visible until either sync ack or until the
> wait for sync ack is cancelled.  It doesn’t make sense to make it visible
> while waiting for sync ack since that would defeat the very behavior sync
> ack provides for.
>
> David J.
>
>


Re: invisible commit question for sync replication

2023-01-31 Thread David G. Johnston
On Wednesday, February 1, 2023, Julien Rouhaud  wrote:

> Hi,
>
> On Wed, Feb 01, 2023 at 02:52:49PM +0800, qihua wu wrote:
> > When run a cluster with sync replication, if DML is done on primary, but
> > primary is isolated from all slave, then the DML will hang, if cancel it
> > DML, it will say:
> > WARNING:  canceling wait for synchronous replication due to user request
> > DETAIL:  The transaction has already committed locally, but might not
> have
> > been replicated to the standby
> >
> > So the workflow is
> > 1: commit to local.
> > 2: waiting for ACK from remote sync.
> >
> > When cancel the DML at step 2. the data are arealy on local, that's why
> > it's warning.
> >
> > But when runs an insert which is waiting for remote ACK, and then query
> > from another session, I didn't find that row. Why this happen? If the
> > insert is already one locally, whey another session can't read it?
>
> It works as expected for me, are you sure both sessions are actually
> connected
> to the same server and/or querying the same table?
>
> [1456]rjuju@127.0.0.1:14295) rjuju=# select * from tt;
>  id | val
> +-
> (0 rows)
>
> [1456]rjuju@127.0.0.1:14295) rjuju=# insert into tt select 1;
> ^CCancel request sent
> WARNING:  01000: canceling wait for synchronous replication due to user
> request
> DETAIL:  The transaction has already committed locally, but might not have
> been replicated to the standby.
> LOCATION:  SyncRepWaitForLSN, syncrep.c:287
> INSERT 0 1
>
> [1456]rjuju@127.0.0.1:14295) rjuju=# select pg_backend_pid(), * from tt;
>  pg_backend_pid | id |  val
> ++
>1456 |  1 | 
> (1 row)
>
>
> and another session:
>
> [3327]rjuju@127.0.0.1:14295) rjuju=# select pg_backend_pid(), * from tt;
>  pg_backend_pid | id |  val
> ++
>3327 |  1 | 
> (1 row)
>
>
>
This wasn’t the question though.  Can the second session see the inserted
row before you cancel the insert that is waiting for sync ack?

Supposedly it can (not able to test myself).  Basically, the primary waits
to make the local transaction visible until either sync ack or until the
wait for sync ack is cancelled.  It doesn’t make sense to make it visible
while waiting for sync ack since that would defeat the very behavior sync
ack provides for.

David J.


Re: invisible commit question for sync replication

2023-01-31 Thread Julien Rouhaud
Hi,

On Wed, Feb 01, 2023 at 02:52:49PM +0800, qihua wu wrote:
> When run a cluster with sync replication, if DML is done on primary, but
> primary is isolated from all slave, then the DML will hang, if cancel it
> DML, it will say:
> WARNING:  canceling wait for synchronous replication due to user request
> DETAIL:  The transaction has already committed locally, but might not have
> been replicated to the standby
>
> So the workflow is
> 1: commit to local.
> 2: waiting for ACK from remote sync.
>
> When cancel the DML at step 2. the data are arealy on local, that's why
> it's warning.
>
> But when runs an insert which is waiting for remote ACK, and then query
> from another session, I didn't find that row. Why this happen? If the
> insert is already one locally, whey another session can't read it?

It works as expected for me, are you sure both sessions are actually connected
to the same server and/or querying the same table?

[1456]rjuju@127.0.0.1:14295) rjuju=# select * from tt;
 id | val
+-
(0 rows)

[1456]rjuju@127.0.0.1:14295) rjuju=# insert into tt select 1;
^CCancel request sent
WARNING:  01000: canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not have been 
replicated to the standby.
LOCATION:  SyncRepWaitForLSN, syncrep.c:287
INSERT 0 1

[1456]rjuju@127.0.0.1:14295) rjuju=# select pg_backend_pid(), * from tt;
 pg_backend_pid | id |  val
++
   1456 |  1 | 
(1 row)


and another session:

[3327]rjuju@127.0.0.1:14295) rjuju=# select pg_backend_pid(), * from tt;
 pg_backend_pid | id |  val
++
   3327 |  1 | 
(1 row)




invisible commit question for sync replication

2023-01-31 Thread qihua wu
When run a cluster with sync replication, if DML is done on primary, but
primary is isolated from all slave, then the DML will hang, if cancel it
DML, it will say:
WARNING:  canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not have
been replicated to the standby

So the workflow is
1: commit to local.
2: waiting for ACK from remote sync.

When cancel the DML at step 2. the data are arealy on local, that's why
it's warning.

But when runs an insert which is waiting for remote ACK, and then query
from another session, I didn't find that row. Why this happen? If the
insert is already one locally, whey another session can't read it?