Re: [VOTE] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

Chen Liang Thu, 06 Dec 2018 14:41:49 -0800

Hi Daryn,

This is an interesting and valid point, and the security implications are
worth considering.

The purpose of the alignment context is to allow clients and servers to
sync on their global state, so that when clients switch between ANN/SBN or
between SBNs, the reads are always consistent. One reason for doing this at
the RPC layer is that it keeps it decoupled from client logic. Handlers
reinserting the call into the queue is part of implementing the catch-up
logic in HDFS-13767, where the standby waits until it has received all
transactions needed to catch up with the client's state.
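
To make the catch-up logic above concrete, here is a minimal sketch. It is
not the actual HDFS-13767 code: the class, the method names, and the
single-pass handler loop are all hypothetical, and it assumes only that
each call carries the last transaction id the client has seen.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical model of an Observer that defers reads until it has caught up. */
class ObserverCatchUpSketch {
    /** A read call tagged with the last transaction id the client has seen. */
    static final class Call {
        final String name;
        final long clientStateId;

        Call(String name, long clientStateId) {
            this.name = name;
            this.clientStateId = clientStateId;
        }
    }

    private final Queue<Call> callQueue = new ArrayDeque<>();
    private long lastAppliedTxId;

    ObserverCatchUpSketch(long lastAppliedTxId) {
        this.lastAppliedTxId = lastAppliedTxId;
    }

    void submit(Call call) {
        callQueue.add(call);
    }

    /** Simulates the Observer applying edits up to the given transaction id. */
    void applyEditsUpTo(long txId) {
        lastAppliedTxId = Math.max(lastAppliedTxId, txId);
    }

    /**
     * One handler pass over the calls queued right now: serve a call if the
     * Observer has reached the client's state, otherwise reinsert it at the
     * tail of the queue so it is retried on a later pass.
     */
    List<String> runHandlerPass() {
        List<String> served = new ArrayList<>();
        int pending = callQueue.size();
        for (int i = 0; i < pending; i++) {
            Call call = callQueue.poll();
            if (call.clientStateId <= lastAppliedTxId) {
                served.add(call.name);
            } else {
                callQueue.add(call);
            }
        }
        return served;
    }

    int queuedCalls() {
        return callQueue.size();
    }
}
```

The point of the sketch is that the client issues one RPC and the waiting
happens on the server side, at the cost of the call occupying a queue slot
until the Observer catches up.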

By using RetriableExceptions, I assume you mean letting the client retry if
the server state is not ready? We did consider a similar approach, but it
introduces multiple RPC calls for a single operation, adding overhead to
the RPC queue, which is already often a bottleneck as we've seen. To this
extent, even with RetriableException, it appears to me that a buggy client
can still hurt the NameNode, although in a different way.

I agree that calls can potentially get stuck in the queue for a long time,
which can cause serious issues. We do plan to introduce logic that makes
the Observer reject client requests if it has fallen too far behind the
client's state; please see HDFS-13873. The Observer then simply rejects the
call and lets the client retry with other Observers or go straight to ANN.
This would free the Observer from serving this call and thus limit how much
damage a malicious client can do to it. Secondly, reinserting into the
queue should by design only happen on Observer nodes, never on ANN, so the
damage from potentially bad calls would not affect ANN. This means that
even in the unlikely case that Observers are overloaded because of
buggy/malicious client calls, the rejection logic will make the clients all
end up talking to ANN, which is still no worse than what we have today.
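
As a rough illustration of the planned rejection behavior, the sketch below
shows one way the decision could look. It is not the HDFS-13873
implementation: the `maxLag` threshold, the three-way decision, and the
node naming are assumptions made only for this example.

```java
/** Hypothetical sketch of lag-based rejection on an Observer with client fallback. */
class ObserverRejectionSketch {
    enum Decision { SERVE, REQUEUE, REJECT }

    /**
     * Server side: compare the client's last seen transaction id with the
     * Observer's own. maxLag is an assumed tunable, not a real HDFS setting.
     */
    static Decision decide(long clientStateId, long serverStateId, long maxLag) {
        long lag = clientStateId - serverStateId;
        if (lag <= 0) {
            return Decision.SERVE;    // Observer is already caught up
        }
        if (lag > maxLag) {
            return Decision.REJECT;   // too far behind, let the client go elsewhere
        }
        return Decision.REQUEUE;      // slightly behind, wait for edits to arrive
    }

    /**
     * Client side: pick the first Observer that does not reject the call;
     * if every Observer rejects, fall back to the active NameNode.
     */
    static String chooseNode(long clientStateId, long[] observerStateIds, long maxLag) {
        for (int i = 0; i < observerStateIds.length; i++) {
            if (decide(clientStateId, observerStateIds[i], maxLag) != Decision.REJECT) {
                return "observer-" + i;
            }
        }
        return "active";
    }
}
```

In the worst case every Observer rejects and all reads land on the active
NameNode, which matches the behavior without Observers.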

Thanks,
Chen

On Thu, Dec 6, 2018 at 11:23 AM Konstantin Shvachko <shv.had...@gmail.com> wrote:

> Hi Yongjun,
>
> Automatic failover sure needs to be fixed (see HDFS-14130 and HDFS-13182).
> Along with all other outstanding issues. We plan to continue this on trunk.
> The feature is usable now without these issues (see HDFS-14067).
> And we would like to get it in, so that people could have early access,
> and so that newly developed features were aware of this functionality.
> Let us know if you have other suggestions.
>
> Thanks,
> --Konstantin
>
> On Wed, Dec 5, 2018 at 11:24 PM Yongjun Zhang <yzh...@cloudera.com> wrote:
>
> > Great work guys.
> >
> > Wonder if we can elaborate on the impact of not having #2 fixed, and why
> > #2 is not needed for the feature to be complete?
> > 2. Need to fix automatic failover with ZKFC. Currently it doesn't
> > know about ObserverNodes and tries to convert them to SBNs.
> >
> > Thanks.
> > --Yongjun
> >
> >
> > On Wed, Dec 5, 2018 at 5:27 PM Konstantin Shvachko <shv.had...@gmail.com>
> > wrote:
> >
> >> Hi Hadoop developers,
> >>
> >> I would like to propose to merge to trunk the feature branch HDFS-12943
> >> for Consistent Reads from Standby Node. The feature is intended to scale
> >> read RPC workloads. On large clusters reads comprise 95% of all RPCs to
> >> the NameNode. We should be able to accommodate higher overall RPC
> >> workloads (up to 4x by some estimates) by adding multiple ObserverNodes.
> >>
> >> The main functionality has been implemented; see sub-tasks of HDFS-12943.
> >> We followed up with the test plan. Testing was done on two independent
> >> clusters (see HDFS-14058 and HDFS-14059) with security enabled.
> >> We ran standard HDFS commands, MR jobs, admin commands including manual
> >> failover.
> >> We know of one cluster running this feature in production.
> >>
> >> There are a few outstanding issues:
> >> 1. Need to provide proper documentation - a user guide for the new
> >> feature
> >> 2. Need to fix automatic failover with ZKFC. Currently it doesn't
> >> know about ObserverNodes and tries to convert them to SBNs.
> >> 3. Scale testing and performance fine-tuning
> >> 4. As testing progresses, we continue fixing non-critical bugs like
> >> HDFS-14116.
> >>
> >> I attached a unified patch to the umbrella jira for the review and
> >> Jenkins build.
> >> Please vote on this thread. The vote will run for 7 days until Wed Dec
> >> 12.
> >>
> >> Thanks,
> >> --Konstantin
> >>
> >
>
