[
https://issues.apache.org/jira/browse/BOOKKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923636#comment-13923636
]
Rakesh R commented on BOOKKEEPER-733:
-------------------------------------
Hi All,
Following are a few cases where the target bookie is not able to proceed
with the re-replication procedure.
I'm trying to put together all such cases we have come across. My intention is
simply to make everyone aware of these cases, and I hope this will help us
reach a common conclusion.
+Case-1)+ Already have a replica (the bookie is part of all the ledger fragments)
+Case-2)+ BKException - BKReadException, BKBookieHandleNotAvailableException
- quorum lost (thanks a lot, Ivan, for bringing up this scenario, where the
ledger loses the quorum and hangs around waiting for re-replication)
- slow bookies not returning enough responses, etc.
+Case-3)+ Other BKExceptions (if anything requires special attention).
Please see the initial draft proposal below, where I'm trying to address the
cases very specifically by introducing different return codes. One reason for
specific handling is that these exceptions are known to the AutoRecovery module,
which can easily build intelligence on top of them. For example, I can use the
ZK watch notification mechanism, or wait for configured retry intervals to
trigger a recheck, etc. I agree that specific handling should not leave any
loopholes.
I'd like to see feedback, and if everyone agrees, I will explore this approach further.
*Proposal:*
Introduce a return code when releasing the lock, e.g.:
LedgerUnderreplicationManager#releaseUnderreplicatedLedger(ledgerId, rc)
ReturnCode:REPLICA_EXISTS
ReturnCode:READ_FAILURE
ReturnCode:FAILED
ReturnCode:OK
Based on the rc, ZkLedgerUnderreplicationManager can build intelligence to
handle specific cases.
Say, ZkLedgerUnderreplicationManager will maintain a map, say 'visitedLedgers' -
<ReturnCode vs ListOfLedgers>
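To make the shape of the proposal concrete, here is a minimal, self-contained sketch of the return codes and the 'visitedLedgers' map. All names (UnderreplicationSketch, recordRelease, shouldSkip) are illustrative only, not the actual BookKeeper API:

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class UnderreplicationSketch {

    // Proposed return codes passed to releaseUnderreplicatedLedger(ledgerId, rc)
    public enum ReturnCode {
        OK,             // fragment successfully re-replicated
        REPLICA_EXISTS, // this bookie already holds a replica of all fragments (Case-1)
        READ_FAILURE,   // BKReadException / BKBookieHandleNotAvailableException (Case-2)
        FAILED          // other BKExceptions (Case-3)
    }

    // visitedLedgers: ReturnCode -> set of ledger ids released with that code
    private final Map<ReturnCode, Set<Long>> visitedLedgers =
            new EnumMap<>(ReturnCode.class);

    // Record the code the RW reported when it released the ledger lock.
    public void recordRelease(long ledgerId, ReturnCode rc) {
        visitedLedgers
                .computeIfAbsent(rc, k -> ConcurrentHashMap.newKeySet())
                .add(ledgerId);
    }

    // getLedgerToRereplicate() would consult the non-OK buckets and skip
    // ledgers this bookie cannot usefully work on right now.
    public boolean shouldSkip(long ledgerId) {
        return visitedLedgers.getOrDefault(ReturnCode.REPLICA_EXISTS, Set.of()).contains(ledgerId)
            || visitedLedgers.getOrDefault(ReturnCode.READ_FAILURE, Set.of()).contains(ledgerId);
    }
}
```

The cases below describe how the entries in each bucket get cleaned up again.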
# *Case-1)* Already have a replica (the bookie is part of all the ledger fragments)
Add/update a collection representing 'existingLedgers' in
ZkLedgerUnderreplicationManager and put the entry into 'visitedLedgers'.
_RW Thread:_
step-1) On receiving the rc, add the ledger to this list.
step-2) Add a watcher to this ledger for further cleanups.
step-3) In getLedgerToRereplicate(), consult 'existingLedgers'
and skip this ledger for now, so unnecessary looping is avoided for
this ledger.
_ZK Watcher Thread:_
Now, on any NodeDeleted/NodeDataChanged event, it will remove the ledger from
the list, on the grounds that its state may have changed while it still exists
as underreplicated. The RW will then be able to recheck whether any fragments
can be re-replicated to this bookie.
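The Case-1 flow above can be sketched as follows. This is a self-contained illustration (no real ZooKeeper dependency); ExistingLedgersTracker, onZkEvent, and the ZkEvent enum are hypothetical stand-ins for the real ZK watcher wiring:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ExistingLedgersTracker {

    // Stand-in for the ZK event types the watcher would react to.
    public enum ZkEvent { NODE_DELETED, NODE_DATA_CHANGED, OTHER }

    private final Set<Long> existingLedgers = ConcurrentHashMap.newKeySet();

    // RW thread, step-1/2: on REPLICA_EXISTS, remember the ledger; the real
    // code would also register a ZK watch on the ledger's urLedger znode.
    public void onReplicaExists(long ledgerId) {
        existingLedgers.add(ledgerId);
    }

    // RW thread, step-3: getLedgerToRereplicate() skips these ledgers, so the
    // worker does not loop on fragments it already holds.
    public boolean shouldSkip(long ledgerId) {
        return existingLedgers.contains(ledgerId);
    }

    // ZK watcher thread: NodeDeleted/NodeDataChanged means the ledger's
    // replication state changed, so drop it from the skip list and recheck.
    public void onZkEvent(long ledgerId, ZkEvent event) {
        if (event == ZkEvent.NODE_DELETED || event == ZkEvent.NODE_DATA_CHANGED) {
            existingLedgers.remove(ledgerId);
        }
    }
}
```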
# *Case-2)* BKException - BKReadException, BKBookieHandleNotAvailableException
Add/update a collection representing 'errLedgers' in
ZkLedgerUnderreplicationManager and put the entry into 'visitedLedgers'.
_RW Thread:_
step-1) On receiving the rc, add the ledger to this list.
step-2) Add a watcher to this ledger for further cleanups.
step-3) The idea here is to postpone the ledger's replication by some
interval. Define the next time at which this ledger should be reconsidered
for re-replication. Once an errLedger reaches that interval, simply
remove it from 'errLedgers' so that it becomes available for the
re-replication phase again.
step-4) In getLedgerToRereplicate(), consult 'errLedgers' and
skip this ledger for now, so unnecessary looping is avoided for this
ledger.
_ZK Watcher Thread:_
Now, on any NodeDeleted/NodeDataChanged event, it will remove the
ledger from the list, on the grounds that the ledger can be rechecked. This
happens when the ledger has been re-replicated by another bookie, or when the
Auditor has reported a few more bookie failures for this ledger, etc.
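The Case-2 deferral could look roughly like the sketch below: a ledger that hit a read failure is parked with a "retry at" timestamp, getLedgerToRereplicate() skips it until the configured interval elapses, and a ZK event clears the deferral early. ErrLedgersTracker and its method names are illustrative, as is the choice of wall-clock timestamps:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ErrLedgersTracker {

    private final long retryIntervalMillis;
    // errLedgers: ledgerId -> earliest time (epoch millis) to retry it
    private final Map<Long, Long> errLedgers = new ConcurrentHashMap<>();

    public ErrLedgersTracker(long retryIntervalMillis) {
        this.retryIntervalMillis = retryIntervalMillis;
    }

    // RW thread, step-1/2/3: on READ_FAILURE, postpone the ledger.
    public void onReadFailure(long ledgerId) {
        errLedgers.put(ledgerId, System.currentTimeMillis() + retryIntervalMillis);
    }

    // RW thread, step-4: skip while the deferral is in force; once the
    // interval has passed, drop the entry so the ledger is retried.
    public boolean shouldSkip(long ledgerId) {
        Long retryAt = errLedgers.get(ledgerId);
        if (retryAt == null) {
            return false;
        }
        if (System.currentTimeMillis() >= retryAt) {
            errLedgers.remove(ledgerId);
            return false;
        }
        return true;
    }

    // ZK watcher thread: NodeDeleted/NodeDataChanged clears the deferral
    // early (e.g. another bookie re-replicated it, or the Auditor reported
    // more failed bookies for this ledger).
    public void onLedgerChanged(long ledgerId) {
        errLedgers.remove(ledgerId);
    }
}
```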
# *Case-3)* Other BKExceptions (if anything requires special attention).
As of now, I don't see any extra handling needed for this; it can follow the
same flow as Case-2.
Thanks,
Rakesh
> Improve ReplicationWorker to handle the urLedgers which already have the same
> ledger replica in hand
> -----------------------------------------------------------------------------------------------
>
> Key: BOOKKEEPER-733
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-733
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-auto-recovery
> Affects Versions: 4.2.2, 4.3.0
> Reporter: Rakesh R
> Assignee: Rakesh R
>
> +Scenario:+
> Step1 : Have three bookies BK1, BK2, BK3
> Step2 : Write ledgers with quorum 2
> Step3 : Unfortunately, BK2 and BK3 both went down for a few moments.
> The following logs flood BK1's autorecovery logs. The RW is trying to
> replicate the ledgers, but it simply skips the fragment and moves to the next
> cycle when it sees a replica already in its own hand. IMO, we should have a
> mechanism in place to avoid these unnecessary cycles.
> {code}
> 2014-02-18 21:47:55,140 - ERROR - [New I/O client boss
> #2-1:PerChannelBookieClient$1@230] - Could not connect to bookie: [id:
> 0x00ba679e]/10.18.170.130:15002, current state CONNECTING :
> java.net.ConnectException: Connection refused: no further information
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:401)
> at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:370)
> at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:292)
> at
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> at
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> at java.lang.Thread.run(Thread.java:619)
> 2014-02-18 21:47:55,140 - INFO - 2014-02-18 21:59:33,215 - DEBUG -
> [ReplicationWorker:ReplicationWorker@182] - Target
> Bookie[10.18.170.130:15003] found in the fragment ensemble:
> [10.18.170.130:15003, 10.18.170.130:15001, 10.18.170.130:15002]
> [ReplicationWorker:PerChannelBookieClient@194] - Connecting to bookie:
> 10.18.170.130:15002
> 2014-02-18 21:47:56,162 - ERROR - [New I/O client boss
> #2-1:PerChannelBookieClient$1@230] - Could not connect to bookie: [id:
> 0x0003f377]/10.18.170.130:15002, current state CONNECTING :
> java.net.ConnectException: Connection refused: no further information
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:401)
> at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:370)
> at
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:292)
> at
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> at
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> at java.lang.Thread.run(Thread.java:619)
> 2014-02-18 21:59:33,215 - DEBUG - [ReplicationWorker:ReplicationWorker@182]
> - Target Bookie[10.18.170.130:15003] found in the fragment ensemble:
> [10.18.170.130:15003, 10.18.170.130:15001, 10.18.170.130:15002]
> {code}
--
This message was sent by Atlassian JIRA
(v6.2#6252)