If a broker crashes and restarts, it will catch up on the missing data from
the leader replicas. Normally, while this broker is catching up, it won't be
serving any client requests. Are you seeing those errors on the
crashed broker? Also, you should not see OffsetOutOfRangeException
after just one broker failure with 3 replicas. Do you see the following in
the controller log?

"No broker in ISR is alive for ... There's potential data loss."

Thanks,

Jun
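
The controller-log check suggested above can be sketched as a small script. This is only a minimal sketch: the function name find_isr_loss_lines is made up for illustration, and the only text taken from this thread is the fixed part of the quoted message. Where the controller log lives depends on your installation.

```python
from typing import Iterable, List

# Fixed part of the message quoted above; the "..." portion varies.
ISR_LOSS_MARKER = "No broker in ISR is alive for"


def find_isr_loss_lines(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the ISR data-loss warning.

    `lines` can be any iterable of strings, for example an open file
    handle on the controller log (the path is installation-specific).
    """
    return [line.rstrip("\n") for line in lines if ISR_LOSS_MARKER in line]
```

To use it, open the controller log as a text file and pass the file object to find_isr_loss_lines; an empty result means the warning was not logged in that file.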

On Fri, Jan 3, 2014 at 1:23 AM, Vincent Rischmann <zecmerqu...@gmail.com> wrote:

> Hi all,
>
> We have a cluster of three 0.8 brokers, and this morning one of the brokers
> crashed.
> It is a test broker, and we stored the logs in /tmp/kafka-logs. All topics
> in use are replicated on the three brokers.
>
> You can guess the problem: when the broker rebooted, it wiped all the data
> in the logs.
>
> The producers and consumers are fine, but the broker with the wiped data
> keeps generating a lot of exceptions, and I don't really know what to do to
> recover.
>
> Example exception:
>
> [2014-01-03 10:09:47,755] ERROR [KafkaApi-1] Error when processing fetch
> request for partition [topic,0] offset 814798 from consumer with
> correlation id 0 (kafka.server.KafkaApis)
> kafka.common.OffsetOutOfRangeException: Request for offset 814798 but we
> only have log segments in the range 0 to 19372.
>
> There are a lot of them, something like 10+ per second. I (maybe wrongly)
> assumed that the broker would catch up. If that's the case, how can I see
> the progress?
>
> In general, what is the recommended way to bring back a broker with wiped
> data in a cluster?
>
> Thanks.
>
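
One rough way to gauge how far behind the wiped broker is would be to parse the exception message quoted above and compare the requested offset with the upper end of the range the broker reports. The sketch below is illustrative only: the function name offset_gap is hypothetical, the pattern is copied from the message text in this thread, and it assumes each message sits on a single line in the real log. Whether the reported range grows over time as the broker catches up is an assumption you would need to confirm on your own cluster.

```python
import re
from typing import Optional, Tuple

# Pattern taken from the exception text quoted in this thread.
OUT_OF_RANGE = re.compile(
    r"Request for offset (\d+) but we\s+only have log segments "
    r"in the range (\d+) to (\d+)"
)


def offset_gap(message: str) -> Optional[Tuple[int, int]]:
    """Return (requested_offset, gap), or None if the message does not match.

    gap is the requested offset minus the highest offset the broker
    reports having for that partition.
    """
    match = OUT_OF_RANGE.search(message)
    if match is None:
        return None
    requested, _range_start, range_end = (int(g) for g in match.groups())
    return requested, requested - range_end
```

For the example exception in the original message, the requested offset is 814798 and the reported range ends at 19372, so the gap is 795426.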
