Re: [HACKERS] some review comments on logical rep code

Petr Jelinek Tue, 18 Apr 2017 09:36:33 -0700

Thank you for working on this!

On 18/04/17 10:16, Masahiko Sawada wrote:
> On Tue, Apr 18, 2017 at 12:24 PM, Kyotaro HORIGUCHI
> <[email protected]> wrote:
>>
>>>>>> 3.
>>>>>>>
>>>>>>> ApplyLauncherWakeup() should be static function.
>>>>>>
>>>>>> Attached 003 patch fixes it (and also fixes #5 review comment).
>>>>>>
>>>>>> 4.
>>>>>>>
>>>>>>> Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's
>>>>>>> subcommands (e.g., ENABLED one) should wake the launcher up on commit.
>>>>>>
>>>>>> This is also reported by Horiguchi-san on another thread[1], and patch 
>>>>>> exists.
>>>>>>
>>>>>> 5.
>>>>>>>
>>>>>>> Why is IsBackendPid() check necessary in ApplyLauncherWakeup()?
>>>>>>
>>>>>> I also guess it's necessary. This change is included in #3 patch.
>>>>>
>>>>> IsBackendPid() is not free, I suppose that just ignoring failure
>>>>> with ESRCH is enough.
>>>>
>>>> After thought, since LogicalRepCtx->launcher_pid could be 0 should, we
>>>> check if launcher_pid != 0?
>>
>> Yes. For example, code to send a signal to autovacuum launcher
>> looks as the follows. I don't think there's a reason to do
>> different thing here.
>>
>> |       avlauncher_needs_signal = false;
>> |       if (AutoVacPID != 0)
>> |         kill(AutoVacPID, SIGUSR2);
>>
> 
> Fixed.


Yes the IsBackendPid was needed only because we didn't reset
launcher_pid and it was causing issues otherwise obviously (and I didn't
realize we are not resetting it...)

> 
>>
>>>>>> 6.
>>>>>>>
>>>>>>> Normal statements that the walsender for logical rep runs are logged
>>>>>>> by log_replication_commands. So if log_statement is also set,
>>>>>>> those commands are logged twice.
>>>>>>
>>>>>> Attached 006 patch fixes it by adding  "-c log_statement = none" to
>>>>>> connection parameter of libpqrcv if logical = true.
>>>>>
>>>>> The control by log_replication_commands is performed on
>>>>> walsender, so this also shuld be done on the same place. Anyway
>>>>> log_statement is irrelevant to walsender.
>>>>
>>>> Patch 006 emits all logs executed by the apply worker as a log of
>>>> log_replication_command but perhaps you're right. It would be better
>>>> to leave the log of log_statement if the command executed by the apply
>>>> worker is a SQL command. Will fix.
>>
>> Here, we can choose whether a SQL command is a part of
>> replication commands or not. The previous fix thought it is and
>> this doesn't. (They are emitted in different forms) Is this ok?
> 
> Yes, v2 patch logs other than T_SQLCmd as a replication command, and
> T_SQLCmd is logged later in exec_simple_query. The
> log_replication_command logs only replication commands[1], it doesn't
> mean to log commands executed on replication connection. I think such
> behavior is right.
> 
>> I'm not sure why you have moved the location of the code, but if
>> it should show all received commands, it might be better to be
>> placed before CHECK_FOR_INTERRUPTS..
> 
> Hmm, the reason why I've moved it is that we cannot know whether given
> cmd_string is a SQL command or a replication command before parsing.
> 

Well the issue is that then the query is not logged if there was parsing
issue even when logging was specifically requested. I don't know what's
good solution here, maybe teaching exec_simple_query to not log the
query if it came from walsender.

BTW reading that function in walsender, there is :
>       /*
>        * CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot until the next
>        * command arrives. Clean up the old stuff if there's anything.
>        */
>       SnapBuildClearExportedSnapshot();

and then

>       /*
>        * CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot. If it was
>        * called outside of transaction the snapshot should be cleared here.
>        */
>       if (!IsTransactionBlock())
>               SnapBuildClearExportedSnapshot();

The first one should not be there, it looks like a result of some merge
conflict being solved wrongly. Maybe your patch could fix that too?

>>>>
>>>>>> 10.
>>>>>>>
>>>>>>>     SpinLockAcquire(&MyLogicalRepWorker->relmutex);
>>>>>>>     MyLogicalRepWorker->relstate =
>>>>>>>       GetSubscriptionRelState(MyLogicalRepWorker->subid,
>>>>>>>             MyLogicalRepWorker->relid,
>>>>>>>             &MyLogicalRepWorker->relstate_lsn,
>>>>>>>             false);
>>>>>>>     SpinLockRelease(&MyLogicalRepWorker->relmutex);
>>>>>>>
>>>>>>> Non-"short-term" function like GetSubscriptionRelState() should not
>>>>>>> be called while spinlock is being held.
>>>>>>>
>>>>>>
>>>>>> One option is adding new LWLock for the relation state change but this
>>>>>> lock is used at many locations where the operation actually doesn't
>>>>>> take time. I think that the discussion would be needed to get
>>>>>> consensus, so patch for it is not attached.
>>>>>
>>>>> From the point of view of time, I agree that it doesn't seem to
>>>>> harm. Bt I thing it as more significant problem that
>>>>> potentially-throwable function is called within the mutex
>>>>> region. It potentially causes a kind of dead lock.  (That said,
>>>>> theoretically dosn't occur and I'm not sure what happens by the
>>>>> dead lock..)
>>>>>

Hmm, I think doing what I attached should be fine here. We don't need to
hold spinlock for table read, just for changing the
MyLogicalRepWorker->relstate.


-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/replication/logical/tablesync.c 
b/src/backend/replication/logical/tablesync.c
index 108326b..73e8c42 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -705,17 +705,19 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 {
        char               *slotname;
        char               *err;
+       char                    relstate;
 
        /* Check the state of the table synchronization. */
        StartTransactionCommand();
+       relstate = GetSubscriptionRelState(MyLogicalRepWorker->subid,
+                                                                          
MyLogicalRepWorker->relid,
+                                                                          
&MyLogicalRepWorker->relstate_lsn,
+                                                                          
false);
+       CommitTransactionCommand();
+
        SpinLockAcquire(&MyLogicalRepWorker->relmutex);
-       MyLogicalRepWorker->relstate =
-               GetSubscriptionRelState(MyLogicalRepWorker->subid,
-                                                               
MyLogicalRepWorker->relid,
-                                                               
&MyLogicalRepWorker->relstate_lsn,
-                                                               false);
+       MyLogicalRepWorker->relstate = relstate;
        SpinLockRelease(&MyLogicalRepWorker->relmutex);
-       CommitTransactionCommand();
 
        /*
         * To build a slot name for the sync work, we are limited to 
NAMEDATALEN -

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] some review comments on logical rep code

Reply via email to