Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

Ismael Juma Fri, 10 Mar 2017 09:14:24 -0800

Hi Dong,

Thanks for the updates, they look good. A couple of comments below.

On Tue, Mar 7, 2017 at 7:30 PM, Dong Lin <[email protected]> wrote:
>
> >
> > 3. Another point regarding operational procedures, with a large enough
> > cluster, disk failures may not be that uncommon. It may be worth
> > explaining the recommended procedure if someone needs to do a rolling
> > bounce of a cluster with some bad disks. One option is to simply do the
> > bounce and hope that the bad disks are detected during restart, but we
> > know that this is not guaranteed to happen immediately. A better option
> > may be to remove the bad log dirs from the broker config until the disk
> > is replaced.
> >
>
> I am not sure if I understand your suggestion here. I think user doesn't
> need to differentiate between log directory failure during rolling bounce
> and log directory failure during runtime. All they need to do is to detect
> and handle log directory failure specified above. And they don't have to
> remove the bad log directory immediately from broker config. The only
> drawback of keeping log directory there is that a new replica may not be
> created on the broker. But the chance of that happening is really low,
> since the controller has to fail in a small window after user initiated the
> topic creation but before it sends LeaderAndIsrRequest with
> is_new_replica=true to the broker. In practice this shouldn't matter.
>

Let me try to clarify what I mean. The document states that a broker
assumes that a log directory is good if it can read from it when it starts.
So, bouncing a broker with a bad disk without doing anything is a bit
dangerous because the bad log directory may be considered good again and
cause issues due to slow performance, for example. As Colin pointed out,
this is not uncommon.
So, perhaps we should state that it is safer to remove the bad log dir from
the broker config if a bounce is required before the disk is fixed. Does
that make sense?
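
As an illustration only (this is not part of the KIP), a minimal Python
sketch of dropping a bad directory from the broker's `log.dirs` setting
before a bounce might look like the following. The helper name
`remove_log_dirs` and the example paths are hypothetical; only the
`log.dirs` property name comes from the broker configuration.

```python
def remove_log_dirs(properties_text, bad_dirs):
    """Return the properties text with bad_dirs dropped from log.dirs.

    Hypothetical helper for illustration; it only rewrites the text and
    does not touch the broker or the disks.
    """
    # Normalize trailing slashes so "/data/d2/" and "/data/d2" compare equal.
    bad = {d.rstrip("/") for d in bad_dirs}
    out = []
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        # Commented-out lines such as "#log.dirs=..." do not match here.
        if sep and key.strip() == "log.dirs":
            kept = [
                d.strip()
                for d in value.split(",")
                if d.strip() and d.strip().rstrip("/") not in bad
            ]
            if not kept:
                # A broker needs at least one log directory to start.
                raise ValueError("refusing to remove every log directory")
            line = "log.dirs=" + ",".join(kept)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Example paths are made up for this sketch.
    config = "broker.id=1\nlog.dirs=/data/d1,/data/d2,/data/d3\n"
    print(remove_log_dirs(config, ["/data/d2"]))
```

The sketch refuses to produce an empty `log.dirs` value, on the assumption
that an operator would rather see an error than bounce a broker with no
log directories configured.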

> Sure. I have updated the test description to specify that each broker will
> have two log directories.
>
> The existing test case will actually create 2 topics to validate that
> failed log directory won't affect the good ones. You can find them after
> "Now validate that the previous leader can still serve replicas on the good
> log directories" and "Now validate that the follower can still serve
> replicas on the good log directories".

The current plan suggests creating a second topic after the log directory
has been marked as bad via the permission change. I am suggesting that we
should ideally have more than one topic (or partition) before the log
directory is marked as bad. Both cases are important and should be tested,
in my opinion.

Ismael
