Hi Wei-Chiu,

Thanks for starting the thread and summarizing the problem. Sorry for the
slow response.
We've been looking at encrypted RPC performance as well and are interested
in this effort.
We ran some benchmarks locally. They also showed a substantial penalty for
turning on wire encryption for RPC, although less drastic: throughput
dropped by roughly 40%. But we ran a different benchmark,
NNThroughputBenchmark, and we ran it on 2.6 last year.
We could have published the results, but we need to rerun on more recent
versions first.

Three points from me on this discussion:

1. We should settle on the benchmarking tools.
For development, RPCCallBenchmark is good, as it directly measures the
improvement in the RPC layer. But for external consumption it is more
important to know about, e.g., NameNode RPC performance. So we should
probably run both benchmarks.
2. SASL vs SSL.
Since the current implementation is based on SASL, I think it would make
sense to make improvements in this direction. I assume switching to SSL
would require configuration changes. I'm not sure whether it would be
compatible, since we don't have the details. At this point I would go with
HADOOP-10768, provided all of (Daryn's) concerns are addressed.
3. Performance improvement expectations.
Ideally we want a penalty of < 10% for encrypted communication. Anything
over 30% will probably have very limited usability. And there is a gray
area in between, which could be mitigated by allowing mixed encrypted and
unencrypted RPCs on a single NameNode, as in HDFS-13566.
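
To make the thresholds in point 3 concrete, here is a minimal Java sketch.
The class EncryptionPenalty and its methods are hypothetical (not part of
Hadoop), and the throughput numbers in main are made up for illustration;
they are not benchmark results.

```java
// Hypothetical helper, not part of Hadoop: classifies the throughput
// penalty of encrypted RPC against the thresholds from point 3.
final class EncryptionPenalty {
    enum Verdict { ACCEPTABLE, GRAY_AREA, LIMITED_USABILITY }

    /** Fraction of throughput lost with encryption on, e.g. 0.4 for -40%. */
    static double penalty(double plainOpsPerSec, double encryptedOpsPerSec) {
        if (plainOpsPerSec <= 0) {
            throw new IllegalArgumentException("plain throughput must be > 0");
        }
        return 1.0 - encryptedOpsPerSec / plainOpsPerSec;
    }

    /** Below 10% is the goal, above 30% is of limited use, else gray area. */
    static Verdict classify(double penalty) {
        if (penalty < 0.10) {
            return Verdict.ACCEPTABLE;
        }
        if (penalty > 0.30) {
            return Verdict.LIMITED_USABILITY;
        }
        return Verdict.GRAY_AREA;
    }

    public static void main(String[] args) {
        // Made-up ops/sec values, chosen only to exercise the arithmetic.
        double p = penalty(100_000, 60_000);
        System.out.printf("penalty=%.0f%% verdict=%s%n", p * 100, classify(p));
    }
}
```

With these made-up inputs the penalty is 40%, which falls in the
limited-usability band. The same calculation would apply to the output of
either benchmark, as long as both runs report operations per second.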

Thanks,
--Konstantin

On Wed, Oct 31, 2018 at 7:39 AM Daryn Sharp <da...@oath.com.invalid> wrote:

> Various KMS tasks have been delaying my RPC encryption work – which is 2nd
> on TODO list.  It's becoming a top priority for us so I'll try my best to
> get a preliminary netty server patch (sans TLS) up this week if that helps.
>
> The two cited jiras had some critical flaws.  Skimming my comments, both
> use blocking IO (an obvious nonstarter).  HADOOP-10768 is a hand-rolled
> TLS-like encryption scheme which I don't feel is something the community
> can or should maintain from a security standpoint.
>
> Daryn
>
> On Wed, Oct 31, 2018 at 8:43 AM Wei-Chiu Chuang <weic...@apache.org>
> wrote:
>
> > Ping. Anyone? Cloudera is interested in moving forward with the RPC
> > encryption improvements, but I'd just like to get a consensus on which
> > approach to go with.
> >
> > Otherwise I'll pick HADOOP-10768 since it's ready for commit, and I've
> > spent time on testing it.
> >
> > On Thu, Oct 25, 2018 at 11:04 AM Wei-Chiu Chuang <weic...@apache.org>
> > wrote:
> >
> > > Folks,
> > >
> > > I would like to invite all to discuss the various Hadoop RPC encryption
> > > performance improvements. As you probably know, Hadoop RPC encryption
> > > currently relies on Java SASL, and has _really_ bad performance (in
> > > terms of number of RPCs per second, around 15~20% of the one without
> > > SASL).
> > >
> > > There have been some attempts to address this, most notably
> > > HADOOP-10768 <https://issues.apache.org/jira/browse/HADOOP-10768>
> > > (Optimize Hadoop RPC encryption performance) and HADOOP-13836
> > > <https://issues.apache.org/jira/browse/HADOOP-13836> (Securing Hadoop
> > > RPC using SSL). But it looks like neither attempt has been progressing.
> > >
> > > During the recent Hadoop contributor meetup, Daryn Sharp mentioned he's
> > > working on another approach that leverages Netty for its SSL
> > > encryption, and then integrates Netty with Hadoop RPC so that Hadoop RPC
> > > automatically benefits from Netty's SSL encryption performance.
> > >
> > > So there are at least 3 attempts to address this issue as I see it. Do
> > > we have a consensus on:
> > > 1. whether this is an important problem
> > > 2. which approach we want to move forward with
> > >
> > > --
> > > A very happy Hadoop contributor
> > >
> >
> >
> > --
> > A very happy Hadoop contributor
> >
>
>
> --
>
> Daryn
>
