Rakesh R created BOOKKEEPER-733:
-----------------------------------
Summary: Improve ReplicationWorker to handle the urLedgers which
already have same ledger replica in hand
Key: BOOKKEEPER-733
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-733
Project: Bookkeeper
Issue Type: Bug
Components: bookkeeper-auto-recovery
Affects Versions: 4.2.2, 4.3.0
Reporter: Rakesh R
Assignee: Rakesh R
+Scenario:+
Step1 : Have three bookies BK1, BK2, BK3
Step2 : Have written ledgers with quorum 2
Step3 : Unfortunately, BK2 and BK3 both went down for a few moments.
The following messages flood the BK1 autorecovery logs. The ReplicationWorker (RW)
tries to replicate the ledgers, but it simply skips the fragment and moves on to the
next cycle when it finds that it already holds a replica of that fragment. IMO, we
should have a mechanism in place to avoid these unnecessary cycles.
{code}
2014-02-18 21:47:55,140 - ERROR - [New I/O client boss
#2-1:PerChannelBookieClient$1@230] - Could not connect to bookie: [id:
0x00ba679e]/10.18.170.130:15002, current state CONNECTING :
java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:401)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:370)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:292)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)
2014-02-18 21:47:55,140 - INFO -
[ReplicationWorker:PerChannelBookieClient@194] - Connecting to bookie:
10.18.170.130:15002
2014-02-18 21:59:33,215 - DEBUG -
[ReplicationWorker:ReplicationWorker@182] - Target Bookie[10.18.170.130:15003]
found in the fragment ensemble: [10.18.170.130:15003, 10.18.170.130:15001,
10.18.170.130:15002]
2014-02-18 21:47:56,162 - ERROR - [New I/O client boss
#2-1:PerChannelBookieClient$1@230] - Could not connect to bookie: [id:
0x0003f377]/10.18.170.130:15002, current state CONNECTING :
java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:401)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:370)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:292)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)
2014-02-18 21:59:33,215 - DEBUG - [ReplicationWorker:ReplicationWorker@182] -
Target Bookie[10.18.170.130:15003] found in the fragment ensemble:
[10.18.170.130:15003, 10.18.170.130:15001, 10.18.170.130:15002]
{code}
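One possible shape for the mechanism asked for in the scenario above, as a rough sketch only: the worker remembers each ledger it had to skip because the local bookie is already in the fragment ensemble, and does not pick that ledger up again until a delay has passed. The class DeferredLedgerTracker, its methods, and the delay parameter are hypothetical names for illustration and are not part of the BookKeeper code base. The sketch assumes Java 8 or later for the injectable clock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not a BookKeeper class): tracks ledgers that were
 * skipped and suppresses further attempts until a deferral period elapses.
 */
class DeferredLedgerTracker {
    // ledgerId -> earliest time (in millis) at which the ledger may be retried
    private final Map<Long, Long> retryNotBefore = new ConcurrentHashMap<>();
    private final long deferMillis;
    private final LongSupplier clock;

    DeferredLedgerTracker(long deferMillis, LongSupplier clock) {
        this.deferMillis = deferMillis;
        this.clock = clock;
    }

    /** Record that the ledger was skipped, e.g. because a replica is already held locally. */
    void defer(long ledgerId) {
        retryNotBefore.put(ledgerId, clock.getAsLong() + deferMillis);
    }

    /** Returns true if the ledger was never deferred or its deferral period has elapsed. */
    boolean shouldAttempt(long ledgerId) {
        Long notBefore = retryNotBefore.get(ledgerId);
        if (notBefore == null) {
            return true;
        }
        if (clock.getAsLong() >= notBefore) {
            // Deferral expired: forget the entry so the ledger is treated as fresh.
            retryNotBefore.remove(ledgerId);
            return true;
        }
        return false;
    }
}
```

In such a design the worker would call shouldAttempt(ledgerId) before trying a replication and defer(ledgerId) on the "found in the fragment ensemble" path, passing System::currentTimeMillis as the clock. Whether the deferral state belongs in the worker or in the under-replication metadata is left open here.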