Bryan,

Did you take down some brokers in your cluster while hitting KAFKA-1028? If
so, you may also be hitting KAFKA-1647.

Guozhang

On Mon, Oct 20, 2014 at 1:18 PM, Bryan Baugher <bjb...@gmail.com> wrote:

> Hi everyone,
>
> We run a 3-broker Kafka cluster on 0.8.1.1 with all topics having a
> replication factor of 3, meaning every broker has a replica of every
> partition.
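>
> (For context, our topics are created along these lines; the ZooKeeper
> host, topic name, and partition count below are placeholders, not our
> actual values:
>
>   bin/kafka-topics.sh --create --zookeeper zkhost:2181 \
>     --topic TOPIC --partitions 8 --replication-factor 3)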
>
> We recently ran into this issue (
> https://issues.apache.org/jira/browse/KAFKA-1028) and saw data loss within
> Kafka. We understand why it happened and have plans to try to ensure it
> doesn't happen again.
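>
> The main part of that plan is the option KAFKA-1028 adds; below is a
> minimal sketch of the broker setting we expect to use once we are on
> 0.8.2, assuming the unclean.leader.election.enable property that
> KAFKA-1028 introduces (we have not deployed this yet):
>
>   # server.properties (0.8.2+): prefer consistency over availability, so
>   # a partition stays offline rather than electing an out-of-sync replica
>   unclean.leader.election.enable=false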
>
> The strange part was that the broker chosen in the unclean leader
> election seemed to drop all of its own data for the partition in the
> process; our monitoring shows the broker's offset was reset to 0 for a
> number of partitions.
>
> Following the broker's server logs in chronological order for a
> particular partition that saw data loss, I see this:
>
> 2014-10-16 10:18:11,104 INFO kafka.log.Log: Completed load of log TOPIC-6
> with log end offset 528026
>
> 2014-10-16 10:20:18,144 WARN
> kafka.controller.OfflinePartitionLeaderSelector:
> [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [TOPIC,6].
> Elect leader 1 from live brokers 1,2. There's potential data loss.
>
> 2014-10-16 10:20:18,277 WARN kafka.cluster.Partition: Partition [TOPIC,6]
> on broker 1: No checkpointed highwatermark is found for partition [TOPIC,6]
>
> 2014-10-16 10:20:18,698 INFO kafka.log.Log: Truncating log TOPIC-6 to
> offset 0.
>
> 2014-10-16 10:21:18,788 INFO kafka.log.OffsetIndex: Deleting index
> /storage/kafka/00/kafka_data/TOPIC-6/00000000000000528024.index.deleted
>
> 2014-10-16 10:21:18,781 INFO kafka.log.Log: Deleting segment 528024 from
> log TOPIC-6.
>
> I'm not too worried about this since I'm hoping to move to Kafka 0.8.2
> ASAP, but I was curious if anyone could explain this behavior.
>
> -Bryan
>



-- 
-- Guozhang
