Thanks, guys. With unclean.leader.election.enable set to false the issue is
fixed.
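For later readers: this is a broker-side setting (in server.properties). A minimal sketch:

```properties
# server.properties (broker configuration)
# When false, a replica that has fallen out of the ISR can never be
# elected leader, so committed messages are not silently truncated.
unclean.leader.election.enable=false
```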

On Tue, Mar 3, 2015 at 2:50 PM, Gwen Shapira <gshap...@cloudera.com> wrote:

> of course :)
> unclean.leader.election.enable
>
> On Mon, Mar 2, 2015 at 9:10 PM, tao xiao <xiaotao...@gmail.com> wrote:
> > How do I achieve point 3? Is there a config that I can set?
> >
> > On Tue, Mar 3, 2015 at 1:02 PM, Jiangjie Qin <j...@linkedin.com.invalid>
> > wrote:
> >
> >> The scenario you mentioned is equivalent to an unclean leader election.
> >> The following settings will make sure there is no data loss:
> >> 1. Set the replication factor to 3 and the minimum ISR size to 2.
> >> 2. When producing, use acks=-1 or acks=all.
> >> 3. Disable unclean leader election.
> >>
> >> 1) and 2) guarantee that committed messages exist on at least two
> >> brokers. 3) means that a broker which is not in the ISR cannot be
> >> elected as leader, so the log truncation mentioned earlier will not
> >> happen.
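Concretely, those three settings might be wired up roughly like this (a sketch only; which file each setting lives in, and the topic-creation command, should be checked against your Kafka version's docs):

```properties
# --- broker / topic side ---
# Replication factor 3 is chosen at topic creation, e.g. with
#   kafka-topics.sh --create ... --replication-factor 3
min.insync.replicas=2                  # server.properties or topic config
unclean.leader.election.enable=false   # forbid out-of-sync leaders

# --- producer side ---
acks=all   # same as acks=-1: wait for all in-sync replicas to ack
```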
> >>
> >> Jiangjie (Becket) Qin
> >>
> >> On 3/2/15, 7:16 PM, "tao xiao" <xiaotao...@gmail.com> wrote:
> >>
> >> >Since I reused the same consumer group to consume the messages after
> >> >step 6, no data loss occurred. But if I create a new consumer group,
> >> >the new consumer will certainly suffer data loss.
> >> >
> >> >I am more concerned about whether it is acceptable behavior for Kafka
> >> >that an out-of-sync broker can be elected as the leader for a
> >> >partition. Is there any mechanism built into Kafka to ensure that only
> >> >an in-sync broker can be chosen as leader? If not, what is the best
> >> >practice for restarting brokers when some of the replicas are out of
> >> >sync?
> >> >
> >> >On Tue, Mar 3, 2015 at 2:35 AM, Jiangjie Qin <j...@linkedin.com.invalid>
> >> >wrote:
> >> >
> >> >> In this case you have data loss. In step 6, when broker 1 comes up,
> >> >> it becomes the leader and has log end offset 1000. When broker 0
> >> >> comes up, it becomes a follower and will truncate its log to 1000,
> >> >> i.e. 1000 messages were lost. Next time the consumer starts, its
> >> >> offset will be reset to either the smallest or the largest depending
> >> >> on the setting.
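A toy sketch of that truncation (illustrative Python only, not Kafka internals; the list-as-log model and names are mine):

```python
# Toy model of follower log truncation after an unclean leader election.
broker1_log = list(range(1000))   # broker 1 missed the second 1000 messages
broker0_log = list(range(2000))   # broker 0 has all 2000 messages

# Broker 1 comes up first and becomes leader (unclean election).
leader_log_end = len(broker1_log)   # 1000

# Broker 0 rejoins as a follower and truncates to the leader's log end.
broker0_log = broker0_log[:leader_log_end]
print(len(broker0_log))  # 1000 -- the 1000 newer messages are gone
```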
> >> >>
> >> >> Jiangjie (Becket) Qin
> >> >>
> >> >> On 3/2/15, 9:32 AM, "Stuart Reynolds" <s...@stureynolds.com> wrote:
> >> >>
> >> >> >Each topic has earliest and latest offsets (per partition).
> >> >> >Each consumer group has a current offset (per topic-partition pair).
> >> >> >
> >> >> >I see -1 for the current offsets of new consumer groups that haven't
> >> >> >yet committed an offset. I think it means that the offsets for that
> >> >> >consumer group are undefined.
> >> >> >
> >> >> >Is it possible you generated new consumer groups when you restarted
> >> >> >your broker?
> >> >> >
> >> >> >On Mon, Mar 2, 2015 at 3:15 AM, tao xiao <xiaotao...@gmail.com> wrote:
> >> >> >> Hi team,
> >> >> >>
> >> >> >> I have 2 brokers (0 and 1) serving a topic mm-benchmark-test. I
> >> >> >> did some tests on the two brokers to verify how the leader got
> >> >> >> elected. Here are the steps:
> >> >> >>
> >> >> >> 1. started 2 brokers
> >> >> >> 2. created a topic with partition=1 and replication-factor=2. Now
> >> >> >> broker 1 was elected as leader
> >> >> >> 3. sent 1000 messages to the topic and consumed them from a
> >> >> >> high-level consumer using zk as the offset storage
> >> >> >> 4. shut down broker 1; now broker 0 was elected as leader
> >> >> >> 5. sent another 1000 messages to the topic and consumed again
> >> >> >> 6. completely shut down broker 0 and then started broker 1. Now
> >> >> >> broker 1 became the leader
> >> >> >> 7. started broker 0 and ran ConsumerOffsetChecker, which showed
> >> >> >> negative lag (-1000 in my case)
> >> >> >>
> >> >> >> I think this is because the consumed offset in zk was 2000, while
> >> >> >> the log size retrieved from the leader (broker 1), which was
> >> >> >> missing the 1000 messages from step 5, was only 1000; hence
> >> >> >> lag = 1000 - 2000 = -1000.
> >> >> >>
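The negative lag is just that subtraction. A tiny sketch of the arithmetic ConsumerOffsetChecker is doing (variable names are mine):

```python
# Illustrative only: lag = leader log-end offset - committed consumer offset.
consumed_offset = 2000   # offset committed to ZooKeeper in steps 3 and 5
log_end_offset = 1000    # broker 1's log, missing the step-5 messages
lag = log_end_offset - consumed_offset
print(lag)  # -1000
```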
> >> >> >> Is this a bug or expected behavior?
> >> >> >>
> >> >> >> --
> >> >> >> Regards,
> >> >> >> Tao
> >> >>
> >> >>
> >> >
> >> >
> >> >--
> >> >Regards,
> >> >Tao
> >>
> >>
> >
> >
> > --
> > Regards,
> > Tao
>



-- 
Regards,
Tao
