At Fri, 15 May 2020 14:01:46 +0500, "Andrey M. Borodin" <x4...@yandex-team.ru> wrote in
> > On 15 May 2020, at 05:03, Kyotaro Horiguchi <horikyota....@gmail.com>
> > wrote:
> >
> > At Thu, 14 May 2020 11:44:01 +0500, "Andrey M. Borodin"
> > <x4...@yandex-team.ru> wrote in
> >>> GetMultiXactIdMembers believes that 4 is successfully done if 2
> >>> returned valid offset, but actually that is not obvious.
> >>>
> >>> If we add a single giant lock just to isolate, say,
> >>> GetMultiXactIdMembers and RecordNewMultiXact, it reduces concurrency
> >>> unnecessarily. Perhaps we need a finer-grained locking key for standby
> >>> that works similarly to buffer lock on primary, that doesn't cause
> >>> conflicts between irrelevant mxids.
> >>>
> >> We can just replay members before offsets. If offset is already there -
> >> members are there too.
> >> But I'd be happy if we could mitigate those 1000us too - with a hint about
> >> last mxid state in a shared MX state, for example.
> >
> > Generally in such cases, condition variables would work. In the
> > attached PoC, the reader side gets no penalty in the "likely" code
> > path. The writer side always calls ConditionVariableBroadcast but the
> > waiter list is empty in almost all cases. But I couldn't cause the
> > situation where the sleep 1000us is reached.
> Thanks! That really looks like a good solution without magic timeouts.
> Beautiful!
> I think I can create a temporary extension which calls the MultiXact API and
> tests edge cases like this 1000us wait.
> This extension will also be useful for me to assess the impact of bigger
> buffers, reduced read locking (as in my 2nd patch) and other tweaks.

Happy to hear that. It would need to use a timeout just in case, though.

> >> Actually, if we read an empty mxid array instead of something that is
> >> replayed just yet - it's not a problem of inconsistency, because
> >> transaction in this mxid could not commit before we started. ISTM.
> >> So instead of a fix, we can probably just add a comment, if this
> >> reasoning is correct.
> >
> > Step 4 of the reader side reads the members of the target mxid. It
> > is already written if the offset of the *next* mxid is filled in.
> Most often - yes, but members are not guaranteed to be filled in order. Those
> who win MXMemberControlLock will write first.
> But nobody can read members of MXID before it is returned. And its members
> will be written before returning MXID.

Yeah, right. Otherwise an assertion failure happens.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
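
The condition-variable idea discussed above (a reader that pays nothing when the value is already there, a writer that always broadcasts, and a timeout kept "just in case") can be illustrated with a small sketch. This is not the attached PoC and does not use PostgreSQL's ConditionVariable API; it is a Python analogue built on `threading.Condition`, and the names `OffsetSlot`, `write` and `read` are hypothetical, introduced only for illustration.

```python
import threading


class OffsetSlot:
    """Hypothetical stand-in for the offset of the *next* mxid.

    0 means "not filled in yet", mirroring the case the reader waits on.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._offset = 0

    def write(self, offset):
        # Writer side: publish the offset, then always broadcast.
        # When nobody is waiting, notify_all() has nothing to wake.
        with self._cond:
            self._offset = offset
            self._cond.notify_all()

    def read(self, timeout=1.0):
        # Reader side: wait_for() tests the predicate before sleeping,
        # so the "likely" path (offset already written) never blocks.
        # The timeout is the safety net in case a wakeup never comes.
        with self._cond:
            filled = self._cond.wait_for(
                lambda: self._offset != 0, timeout=timeout
            )
            return self._offset if filled else None


# Usage: the writer fills the slot shortly after the reader starts waiting.
slot = OffsetSlot()
writer = threading.Timer(0.05, slot.write, args=(42,))
writer.start()
value = slot.read(timeout=2.0)
writer.join()
```

In this sketch a reader that times out gets `None` back and can decide what to do next; how the real code should react to a timeout is a separate question that the sketch does not try to answer.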