[
https://issues.apache.org/jira/browse/BOOKKEEPER-246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429031#comment-13429031
]
Rakesh R commented on BOOKKEEPER-246:
-------------------------------------
bq. The worst case scenario here will be another replicator comes along, and
sees that the ledger is already fully replicated, so it does nothing.
There would also be cases of partial replication: after finishing its work, a
replicator releases the lock to give others a chance. Assume the following
ledger metadata:
0 BK1, BK2, BK3
10 BK1, BK4, BK3
Say BK1 shuts down and BK4 acquires the lock. BK4 can replicate only the first
fragment, because it already holds a copy of the second fragment. Now assume
that, while BK4 is releasing the lock, there is a brief zk fluctuation and the
release fails with a connection loss exception (but the zk session is still
alive). Since the BK4 lock still exists, others cannot acquire it. Here I feel
that just recreating the LedgerManager won't fully work; instead we need to
either close the zk session or keep force-releasing the lock until session
expiry (timeout).
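To make the partial-replication scenario concrete, here is a minimal sketch of which fragments a given replicator could handle. It assumes, as in the example, that a bookie cannot take a second copy of a fragment it already stores. The class and method names are hypothetical and are not part of BookKeeper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartialReplication {
    /**
     * Returns the first-entry ids of the fragments that `replicator` could
     * re-replicate: the fragment must include the failed bookie, and the
     * replicator must not already hold a copy of that fragment.
     */
    static List<Long> fragmentsReplicableBy(Map<Long, List<String>> ensembles,
                                            String failed, String replicator) {
        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, List<String>> e : ensembles.entrySet()) {
            List<String> ensemble = e.getValue();
            if (ensemble.contains(failed) && !ensemble.contains(replicator)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Ledger metadata from the example: first entry id -> ensemble.
        Map<Long, List<String>> ensembles = new LinkedHashMap<>();
        ensembles.put(0L, Arrays.asList("BK1", "BK2", "BK3"));
        ensembles.put(10L, Arrays.asList("BK1", "BK4", "BK3"));

        // BK4 can only fix the fragment starting at entry 0.
        System.out.println(fragmentsReplicableBy(ensembles, "BK1", "BK4")); // [0]
        // A bookie outside both ensembles (hypothetical BK5) could fix both.
        System.out.println(fragmentsReplicableBy(ensembles, "BK1", "BK5")); // [0, 10]
    }
}
```

The fragment starting at entry 10 is left for some other replicator, which is why the lock must actually be released once BK4 is done.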
Actually, I'm afraid of orphan locks, which could end up being held
indefinitely. Also, LedgerUnderreplicationManager presently doesn't have any
close APIs, and it takes the zkclient externally?
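One possible way to avoid orphan locks, sketched under assumptions: retry the lock deletion a bounded number of times and, if connection loss persists, close the session so that the lock goes away with it. This assumes the lock is tied to the session (for example an ephemeral node). LockClient, ConnectionLossException, FlakyClient and the path /locks/L1 below are hypothetical stand-ins, not the real ZooKeeper or BookKeeper API.

```java
public class LockReleaseSketch {
    /** Hypothetical stand-in for a transient connection failure (session still alive). */
    static class ConnectionLossException extends Exception {}

    /** Hypothetical minimal view of the coordination client. */
    interface LockClient {
        void deleteLock(String path) throws ConnectionLossException;
        void closeSession(); // assumed to remove locks tied to the session
    }

    /**
     * Tries to delete the lock up to maxAttempts times. If every attempt hits
     * a connection loss, closes the session as a last resort.
     * Returns true if the lock was deleted explicitly, false if the session
     * had to be closed instead.
     */
    static boolean releaseOrClose(LockClient client, String path, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                client.deleteLock(path);
                return true;
            } catch (ConnectionLossException e) {
                // Transient failure: retry (a real client would likely back off here).
            }
        }
        client.closeSession();
        return false;
    }

    /** Fake client that fails a fixed number of times before succeeding. */
    static class FlakyClient implements LockClient {
        int failuresLeft;
        boolean lockHeld = true;
        boolean sessionClosed = false;

        FlakyClient(int failures) { this.failuresLeft = failures; }

        public void deleteLock(String path) throws ConnectionLossException {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new ConnectionLossException();
            }
            lockHeld = false;
        }

        public void closeSession() {
            sessionClosed = true;
            lockHeld = false;
        }
    }

    public static void main(String[] args) {
        FlakyClient recovers = new FlakyClient(2);
        System.out.println(releaseOrClose(recovers, "/locks/L1", 3)); // true

        FlakyClient neverRecovers = new FlakyClient(10);
        System.out.println(releaseOrClose(neverRecovers, "/locks/L1", 3)); // false
        System.out.println(neverRecovers.lockHeld); // false
    }
}
```

Either way the lock is no longer held when the method returns, so other replicators are not blocked indefinitely. Closing the session is only possible if the component owns the client, which is where the question about the externally supplied zkclient matters.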
> Recording of underreplication of ledger entries
> -----------------------------------------------
>
> Key: BOOKKEEPER-246
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-246
> Project: Bookkeeper
> Issue Type: Sub-task
> Components: bookkeeper-client, bookkeeper-server
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
> Fix For: 4.2.0
>
> Attachments: BOOKKEEPER-246.diff, BOOKKEEPER-246.diff,
> BOOKKEEPER-246.diff, BOOKKEEPER-246.diff
>
>
> This JIRA is to decide how to record that entries in a ledger are
> underreplicated.
> I think there is a common understanding (correct me if I'm wrong), that
> rereplication can be broken into two logically distinct phases. A) Detection
> of entry underreplication & B) Rereplication.
> This subtask is to handle the interaction between these two stages. Stage B
> needs to know what to rereplicate; how should Stage A inform it?