Rakesh R created BOOKKEEPER-733:
-----------------------------------

             Summary: Improve ReplicationWorker to handle the urLedgers which 
already have same leder replica in hand
                 Key: BOOKKEEPER-733
                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-733
             Project: Bookkeeper
          Issue Type: Bug
          Components: bookkeeper-auto-recovery
    Affects Versions: 4.2.2, 4.3.0
            Reporter: Rakesh R
            Assignee: Rakesh R


+Scenario:+

Step1 : Have three bookies BK1, BK2, BK3
Step2 : Have written ledgers with quorum 2
Step3 : Unfortunately BK2 and BK3 both went down for few moments.

The following logs are flooded in BK1 autorecovery logs. RW is trying to 
replicate the ledgers, but it simply skip this fragment and moves to next cycle 
when it sees a replica found in his hand. IMO, we should have a mechanism in 
place to avoid unnecessary cycles.

{code}
2014-02-18 21:47:55,140 - ERROR - [New I/O client boss 
#2-1:PerChannelBookieClient$1@230] - Could not connect to bookie: [id: 
0x00ba679e]/10.18.170.130:15002, current state CONNECTING : 
java.net.ConnectException: Connection refused: no further information
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at 
org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:401)
        at 
org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:370)
        at 
org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:292)
        at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)
2014-02-18 21:47:55,140 - INFO  - 2014-02-18 21:59:33,215 - DEBUG  - 
[ReplicationWorker:ReplicationWorker@182] - Target Bookie[10.18.170.130:15003] 
found in the fragment ensemble: [10.18.170.130:15003, 10.18.170.130:15001, 
10.18.170.130:15002]
[ReplicationWorker:PerChannelBookieClient@194] - Connecting to bookie: 
10.18.170.130:15002
2014-02-18 21:47:56,162 - ERROR - [New I/O client boss 
#2-1:PerChannelBookieClient$1@230] - Could not connect to bookie: [id: 
0x0003f377]/10.18.170.130:15002, current state CONNECTING : 
java.net.ConnectException: Connection refused: no further information
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at 
org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:401)
        at 
org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:370)
        at 
org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:292)
        at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)
2014-02-18 21:59:33,215 - DEBUG  - [ReplicationWorker:ReplicationWorker@182] - 
Target Bookie[10.18.170.130:15003] found in the fragment ensemble: 
[10.18.170.130:15003, 10.18.170.130:15001, 10.18.170.130:15002]

{code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to