Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

Dong Lin Wed, 15 Mar 2017 10:51:19 -0700

Hey Ismael,

Sure, I have updated the "Changes in Operational Procedures" section in KIP-113
to specify the problem and solution with known disk failure. I also updated
the "Test Plan" section to note that we have a test in KIP-113 to verify that
replicas already created on the good log directories will not be affected
by the failure of other log directories.


Please let me know if there is any other improvement I can make. Thanks for
your comment.

Dong


On Wed, Mar 15, 2017 at 3:18 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Dong,
>
> Yes, that sounds good to me. I'd list option 2 first since that is safe
> and, as you said, no worse than what happens today. The file approach is a
> bit hacky as you said, so it may be a bit fragile. Not sure if we really
> want to mention that. :)
>
> About the note in KIP-112 versus adding the test in KIP-113, I think it
> would make sense to add a short sentence stating that this scenario is
> covered in KIP-113. People won't necessarily read both KIPs at the same
> time and it's helpful to cross-reference when it makes sense.
>
> Thanks for your work on this.
>
> Ismael
>
> On Tue, Mar 14, 2017 at 11:00 PM, Dong Lin <lindon...@gmail.com> wrote:
>
> > Hey Ismael,
> >
> > I get your concern that it is more likely for a disk to be slow, or to
> > exhibit other non-fatal symptoms, after some known fatal error. Then it
> > is weird for the user to start the broker with the likely-problematic
> > disk in the broker config. In that case, I think there are two things
> > the user can do:
> >
> > 1) Intentionally change the log directory in the config to point to a
> > file. This is a bit hacky, but it works well until we make a more
> > appropriate long-term change in Kafka to handle this case.
> > 2) Just don't start the broker with bad log directories. Always fix the
> > disk before restarting the broker. This is a safe approach that is no
> > worse than current practice.
> >
> > Would this address your concern if I specify the problem and the two
> > solutions in the KIP?
> >
> > Thanks,
> > Dong
> >
> > On Tue, Mar 14, 2017 at 3:29 PM, Dong Lin <lindon...@gmail.com> wrote:
> >
> > > Hey Ismael,
> > >
> > > Thanks for the comment. Please see my reply below.
> > >
> > > On Tue, Mar 14, 2017 at 10:31 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> > >
> > >> Thanks Dong. Comments inline.
> > >>
> > > >> On Fri, Mar 10, 2017 at 6:25 PM, Dong Lin <lindon...@gmail.com> wrote:
> > >> >
> > > >> > I get your point. But I am not sure we should recommend that users
> > > >> > simply remove the disk from the broker config. If a user does this
> > > >> > without checking the utilization of the good disks, replicas on the
> > > >> > bad disk will be re-created on the good disks and may overload
> > > >> > them, causing cascading failure.
> > >> >
> > >>
> > >> Good point.
> > >>
> > >>
> > >> >
> > > >> > I agree with you and Colin that a slow disk may cause problems.
> > > >> > However, performance degradation due to a slow disk is an existing
> > > >> > problem that is not detected or handled by Kafka or KIP-112.
> > >>
> > >>
> > > >> I think an important difference is that a number of disk errors are
> > > >> currently fatal and won't be after KIP-112. So it introduces new
> > > >> scenarios (for example, bouncing a broker that is working fine
> > > >> although some disks have been marked bad).
> > >>
> > >
> > > > Hmm.. I am still trying to understand why KIP-112 creates new
> > > > scenarios. A slow disk is not considered a fatal error and won't be
> > > > caught by either the existing Kafka design or this KIP. If a disk is
> > > > marked bad, it means the broker encountered an IOException when
> > > > accessing it; most likely the broker will encounter an IOException
> > > > again when accessing this disk and mark it as bad after the bounce. I
> > > > guess you are talking about the case where a disk is marked bad, the
> > > > broker is bounced, and then the disk provides degraded performance
> > > > without being marked bad, right? But this seems to be an existing
> > > > problem we already have today with slow disks.
> > >
> > > Here are the possible scenarios with bad disk after broker bounce:
> > >
> > > 1) bad disk -> broker bounce -> good disk. This would be great.
> > > > 2) bad disk -> broker bounce -> slow disk. A slow disk is an existing
> > > > problem that is not addressed by Kafka today.
> > > > 3) bad disk -> broker bounce -> bad disk. This is handled by this KIP
> > > > such that only replicas on the bad disk become offline.
> > >
> > >
> > >>
> > > >> > Detection and handling of slow disks is a separate problem that
> > > >> > needs to be addressed in a future KIP. It is currently listed in
> > > >> > the future work. Does this sound OK?
> > >> >
> > >>
> > > >> I'm OK with it being handled in the future. In the meantime, I was
> > > >> just hoping that we can make it clear to users about the potential
> > > >> issue of a disk marked as bad becoming good again after a bounce
> > > >> (which can be dangerous).
> > >>
> > > >> > The main benefit of creating the second topic after a log directory
> > > >> > goes offline is that we can make sure the second topic is created
> > > >> > on the good log directory. I am not sure we can simply assume that
> > > >> > the first topic will always be created on the first log directory
> > > >> > in the broker config and the second topic will be created on the
> > > >> > second log directory in the broker config.
> > >>
> > >>
> > >>
> > > >> > However, I can add this test in KIP-113, which allows the user to
> > > >> > re-assign a replica to a specific log directory of a broker. Is
> > > >> > this OK?
> > >> >
> > >>
> > > >> OK. Please add a note to KIP-112 about this as well (so that it's
> > > >> clear why we only do it in KIP-113).
> > >>
> > >
> > > > Sure. Instead of adding a note to KIP-112, I have added a test in
> > > > KIP-113 to verify that bad log directories discovered during runtime
> > > > would not affect replicas on the good log directories. Does this
> > > > address the problem?
> > >
> > >
> > >> Ismael
> > >>
> > >
> > >
> >
>
