[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775105#comment-16775105
 ] 

Andrei Dulceanu commented on OAK-8063:
--------------------------------------

[~frm], in [^OAK-8063.patch] I propose a new, simplified implementation of 
{{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}}. It follows the 
classical topological ordering algorithm with a twist: by eagerly marking a 
node as visited, it breaks possible loops if the graph contains a cycle. By the 
way, could it really be the case that the segment graph to be transferred from 
the primary contains a cycle? My understanding is that segments only refer to 
previously written segments, without any forward references, thus avoiding cycles.

I also kept the optimisation for segments already transferred (i.e. local), 
but removed the one for the "diamond problem", which cannot occur in the 
current implementation. 
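
A minimal sketch of the idea, not the code from the patch: the class, method 
and parameter names below are hypothetical, and segment ids are simplified to 
plain strings. It assumes a depth-first traversal that marks each segment as 
visited before following its references, skips segments that are already 
local, and emits a segment only after everything it references.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TransferOrderSketch {

    // Returns the non-local segments reachable from root, ordered so that
    // every referenced segment comes before the segments referencing it.
    static List<String> transferOrder(String root,
                                      Map<String, List<String>> references,
                                      Set<String> local) {
        List<String> order = new ArrayList<>();
        visit(root, references, local, new HashSet<>(), order);
        return order;
    }

    private static void visit(String id,
                              Map<String, List<String>> references,
                              Set<String> local,
                              Set<String> visited,
                              List<String> order) {
        // Skip segments already on the standby. Set.add returns false when
        // the id was already marked, so a cycle cannot recurse forever.
        if (local.contains(id) || !visited.add(id)) {
            return;
        }
        for (String ref : references.getOrDefault(id, Collections.emptyList())) {
            visit(ref, references, local, visited, order);
        }
        // All referenced segments have been emitted; now emit this one.
        order.add(id);
    }

    public static void main(String[] args) {
        // Graph from the issue description: S1 -> {S2, S3}, S3 -> S2.
        Map<String, List<String>> references = new HashMap<>();
        references.put("S1", Arrays.asList("S2", "S3"));
        references.put("S3", Arrays.asList("S2"));
        System.out.println(transferOrder("S1", references, Collections.emptySet()));
    }
}
```

For the graph from the issue description this yields S2, S3, S1. With a 
hypothetical cycle such as S1 -> S2 -> S1, the traversal still terminates, 
because S1 is already marked when S2 refers back to it.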

Could you please take a look at the patch?

> The cold standby client doesn't correctly handle backward references
> --------------------------------------------------------------------
>
>                 Key: OAK-8063
>                 URL: https://issues.apache.org/jira/browse/OAK-8063
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar, tarmk-standby
>    Affects Versions: 1.6.0
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Major
>              Labels: cold-standby
>             Fix For: 1.12, 1.10.1, 1.8.12
>
>         Attachments: OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3}, and S3, which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, S3}
> {noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
