Are you using BookKeeperTools? There is a problem with BookKeeperTools. I filed a JIRA for it: https://issues.apache.org/jira/browse/BOOKKEEPER-980

However, that tool is not the recommended one. The bookkeeper shell (bin/bookkeeper shell) is the recommended tool instead, as it contains all the tools you need for managing a BookKeeper cluster. There is a command in the bookkeeper shell to recover a bookie:

    recover [-deleteCookie] <bookieSrc> [bookieDest]

Can you try that and let me know if it works for you?
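For example, assuming the shell picks up your cluster settings from the usual conf/bk_server.conf, the invocation would look roughly like this (the host:port values are placeholders, not nodes from your cluster):

    # replicate the ledger fragments that lived on the failed bookie to a specific target bookie
    bin/bookkeeper shell recover <failed-bookie-host>:3181 <target-bookie-host>:3181

    # or omit the target and let the placement policy pick target bookies
    bin/bookkeeper shell recover <failed-bookie-host>:3181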
- Sijie

On Tue, Nov 29, 2016 at 5:00 AM, Sebastián Schepens <[email protected]> wrote:

Sijie,

Thank you very much!

We will be trying all of this.

We've also encountered some issues with the manual recovery tool.

First, it's supposed to be able to take 2 arguments (ZK node and BK node) and then choose a random node for each replicated ledger, but running it with just 2 arguments crashes with an IndexOutOfBounds exception while parsing arguments.

Secondly, even when specifying the target bookie to which to replicate ledgers, it still defaults to randomly distributing ledgers across all nodes.

Again, thank you very much for all the help.

Cheers,
Sebastian

On Mon, Nov 28, 2016 at 5:52 PM Sijie Guo <[email protected]> wrote:

Sebastian,

Sorry for the late response (I didn't have network access over the weekend).

Let me try to answer your questions in sequence, and then I will suggest a solution for you.

*Re "2016-11-25 13:55:36,775 - ERROR - [New I/O boss #9:PerChannelBookieClient$1@255] - Could not connect to bookie: [id: 0x3dc4f6b8]/10.3.3.41:3181, current state CONNECTING :"*

The logging here is expected. It is a bit noisy. I created https://issues.apache.org/jira/browse/BOOKKEEPER-978 to turn that logging into debug.

*Re "The strange thing is that 10.3.3.41 is the autorecovery node, it seems to be trying to replicate data to itself, is this ok?"*

It is a problem when running autorecovery on separate nodes, as it will use itself as the target bookie. This is a bug and we've fixed it at https://github.com/twitter/bookkeeper/commit/fc7e171135a58cddb9e91f5f614b3ceb6f9f9fee . I will port this change to master in https://issues.apache.org/jira/browse/BOOKKEEPER-977

*Re "Is it normal for the recovery processes to run out of direct memory?"*

I don't think it is normal. I suspect it might be related to replicating too many ledgers at the same time. Do you mind me asking how many Pulsar topics you have?

*Re "2016-11-25 18:15:44,792 - INFO - [main-EventThread:LedgerFragmentReplicator@139] - Fragment :Fragment(LedgerID: 14401, FirstEntryID: 0[-1], LastKnownEntryID: -1[-1], Host: 10.3.2.158:3181, Closed: true) is split into sub fragments :[Fragment(LedgerID: 14401, FirstEntryID: 0[-1], LastKnownEntryID: -1[-1], Host: 10.3.2.158:3181, Closed: true)]
2016-11-25 18:15:44,792 - ERROR - [main-EventThread:LedgerFragmentReplicator$2@251] - BK error reading ledger entry: -1"*

You are right. The replication worker should skip replicating empty ledgers (https://issues.apache.org/jira/browse/BOOKKEEPER-979). I am going to port the fix today.

I would suggest ignoring the logging or setting the logging level to WARN for now. Then try to run the recover tool to replicate to a specific bookie (specify the target bookie), or try to run autorecovery as a daemon in the bookie servers right now.
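(To make that concrete - the file names below are the usual defaults rather than anything specific to your deployment - the two settings involved would look roughly like this. To quiet the connect errors, in the logging config used by the recovery process, e.g. conf/log4j.properties:

    # assumes log4j is in use; logger name taken from the class shown in the errors above
    log4j.logger.org.apache.bookkeeper.proto.PerChannelBookieClient=WARN

And to run autorecovery inside each bookie, in the bookie server config, e.g. conf/bk_server.conf:

    # run the autorecovery/replication worker inside the bookie process (default is false)
    autoRecoveryDaemonEnabled=true

followed by a bookie restart so the setting takes effect.)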
Once I have finished porting the fixes for the above JIRAs, I will let you know, and you can try the fixes from master to run autorecovery as separate daemons.

Let me know if this helps.

- Sijie

On Fri, Nov 25, 2016 at 6:19 AM, Sebastián Schepens <[email protected]> wrote:

Guys,

Still seeing weird behavior.

Trying autorecovery on separate nodes: they've been running for a long time just spitting out error logs, apparently moving data, but they never seem to finish and eventually throw OutOfDirectMemory errors (we have assigned 20G of direct memory to them...).

I have now started autorecovery processes of version 4.4 (the rest of the cluster is 4.3); they throw the following errors and do not seem to be moving data, and all nodes' IO metrics are flat:

2016-11-25 13:55:36,775 - ERROR - [New I/O boss #9:PerChannelBookieClient$1@255] - Could not connect to bookie: [id: 0x3dc4f6b8]/10.3.3.41:3181, current state CONNECTING :
java.net.ConnectException: Connection refused: /10.3.3.41:3181
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

2016-11-25 13:46:46,458 - ERROR - [BookKeeperClientWorker-orderedsafeexecutor-1-0:BookkeeperInternalCallbacks$MultiCallback@125] - Error in multi callback : -8
2016-11-25 13:46:46,458 - ERROR - [BookKeeperClientWorker-orderedsafeexecutor-1-0:LedgerFragmentReplicator$2$1@274] - BK error writing entry for ledgerId: 9449, entryId: 2536703, bookie: 10.3.3.41:3181
org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException
        at org.apache.bookkeeper.client.BKException.create(BKException.java:62)
        at org.apache.bookkeeper.client.LedgerFragmentReplicator$2$1.writeComplete(LedgerFragmentReplicator.java:279)
        at org.apache.bookkeeper.proto.BookieClient$1$1.safeRun(BookieClient.java:189)
        at org.apache.bookkeeper.util.SafeRunnable.run(SafeRunnable.java:31)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

The strange thing is that 10.3.3.41 is the autorecovery node; it seems to be trying to replicate data to itself. Is this ok?
It's also attempting to connect to the failed node:

2016-11-25 13:55:36,777 - ERROR - [New I/O boss #9:PerChannelBookieClient$1@255] - Could not connect to bookie: [id: 0xb1b27b5b]/10.3.2.158:3181, current state CONNECTING :
java.net.ConnectException: Connection refused: /10.3.2.158:3181
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

If I execute a manual recovery I get the same errors:

2016-11-25 14:16:56,438 - ERROR - [New I/O boss #9:PerChannelBookieClient$1@255] - Could not connect to bookie: [id: 0x3e258d5b]/10.3.2.158:3181, current state CONNECTING :
java.net.ConnectException: Connection refused: /10.3.2.158:3181
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

But it seems to be moving data now.

Could you guys give me some insight? Are these logs normal? If so, why are they logged at ERROR level? Why is the autorecovery process trying to connect to itself and not replicating data to other nodes? Is there a way to know the progress of the recovery process? Is it normal for the recovery processes to run out of direct memory?

Thanks,
Sebastian

On Wed, Nov 23, 2016 at 11:33 AM Sebastián Schepens <[email protected]> wrote:

Sijie,

Yes, that's precisely what I meant: we're running separate autorecovery processes, not daemons on all nodes.
The autorecovery processes run quietly until I stop a node; as soon as I stop a node, they're plagued with logs like the following, where the stopped node is 10.3.2.56:

2016-11-22 17:34:06,085 - ERROR - [bookkeeper-io-1-1:PerChannelBookieClient$2@284] - Could not connect to bookie: [id: 0xd3b0c759, L:/10.3.3.42:45164]/10.3.2.56:3181, current state CONNECTING :
java.net.ConnectException: syscall:getsockopt(...): /10.3.2.56:3181

There seem to be waves of thousands and thousands of these logs while some data movement seems to be occurring, but it's really weird that it keeps trying to connect to the failed node. Couldn't it realize the node is down because it's not shown as available in ZooKeeper?

We also see a couple of these logs, but really few of them compared to the previous ones:

2016-11-23 14:28:01,661 - WARN - [ReplicationWorker:RackawareEnsemblePlacementPolicy@543] - Failed to choose a bookie: excluded [<Bookie:10.3.2.57:3181>, <Bookie:10.3.2.195:3181>, <Bookie:10.3.2.158:3181>], fallback to choose bookie randomly from the cluster.

The cluster currently has 6 nodes, and as I said before we're using ensemble size 3, write quorum 3 and ack quorum 2.

Thanks,
Sebastian

On Tue, Nov 22, 2016 at 2:10 PM Sijie Guo <[email protected]> wrote:

I think what Sebastian said is that manual recovery didn't even work. This seems a bit strange to me. Autorecovery will check whether the bookie is available or not; after that, it should re-replicate the data from the other nodes in the ensemble. This seems to indicate something is broken. Sebastian, can you point us to some of the logs?

Sijie

On Nov 19, 2016 9:46 AM, "Rithin Shetty" <[email protected]> wrote:

A few things to note: make sure 'autoRecoveryDaemonEnabled' is set to true in all the bookie conf files; by default this is false, and otherwise recovery will not work. The autorecovery process tries to make sure that the data doesn't exist on the source node before replicating to the destination. That might be the reason why it is trying to talk to the source node.

--Rithin

On Fri, Nov 18, 2016 at 12:00 PM, Sebastián Schepens <[email protected]> wrote:

Hi guys,

I'm running into some issues while trying to recover a decommissioned node. Both the recovery command and the autorecovery processes fail trying to connect to the failing node, which seems reasonable because the node is down. But I don't get why they're trying to connect to that node at all: according to the documentation, they should pull the ledger data from the other nodes in the ensemble (3) and replicate it. The recovery command also seems to completely ignore the destination node given as the third argument.

Could someone give us some help?
Thanks,
Sebastian
