[jira] [Created] (HADOOP-13013) Introduce Apache Kerby into Hadoop

2016-04-11 Thread Jiajia Li (JIRA)
Jiajia Li created HADOOP-13013:
--

 Summary: Introduce Apache Kerby into Hadoop
 Key: HADOOP-13013
 URL: https://issues.apache.org/jira/browse/HADOOP-13013
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Jiajia Li
Assignee: Jiajia Li


As discussed on the mailing list, we'd like to introduce Apache Kerby into 
Hadoop. Apache Kerby is a Kerberos-centric project that aims to provide the 
first Java Kerberos library with both client and server support. The relevant 
features include:
Full Kerberos encryption types, aligned with both MIT KDC and MS AD;
Client APIs that allow login via password, credential cache, keytab file, etc.;
Utilities to generate, operate on, and inspect keytab and credential cache files;
A simple KDC server that borrows some ideas from Hadoop-MiniKDC and can be used 
in tests with minimal overhead in external dependencies;
A brand-new token mechanism (experimental) that allows a JWT token to be 
exchanged for a TGT or service ticket;
Anonymous PKINIT support (experimental), making Kerby the first Java library to 
support this major Kerberos extension.





Re: Introduce Apache Kerby to Hadoop

2016-02-29 Thread Haohui Mai
Handling Kerberos is similar to what we have done for WebHDFS now. Kerby
will be in the picture, but things are much simpler.

If protobuf is a concern, why not shade it into hadoop-common? The
generated binaries might not be compatible, but the wire format is.
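
To make the shading idea concrete, here is a minimal sketch assuming the
standard maven-shade-plugin relocation approach; the relocation prefix and
exact placement are illustrative assumptions, not a settled design:

    <!-- Hypothetical hadoop-common pom.xml fragment. Relocating
         com.google.protobuf under a Hadoop-private package keeps the shaded
         classes from clashing with whatever protobuf version downstream
         projects ship, while leaving the wire format untouched. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals><goal>shade</goal></goals>
          <configuration>
            <relocations>
              <relocation>
                <pattern>com.google.protobuf</pattern>
                <shadedPattern>org.apache.hadoop.shaded.com.google.protobuf</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>

Only the class names move; generated messages still serialize to the same
bytes, which is the wire-format point above.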


Re: Introduce Apache Kerby to Hadoop

2016-02-29 Thread Steve Loughran

> On 27 Feb 2016, at 19:02, Haohui Mai <ricet...@gmail.com> wrote:
> 
> Have we evaluated GRPC? A robust RPC layer requires significant effort.
> Migrating to GRPC could save us a lot of headache.
> 

That's the google protobuf 3-based GRPC? More specifically, protobufVersion = 
'3.0.0-beta-2'? 

That's the successor to the protobuf.jar whose Alejandro-choreographed 
cross-project upgrade caused the "great protobuf upgrade of 2013"? That's the 
protobuf library where some of us have seriously considered forking the library 
so that we could have a version of protobuf which would link across Java 
classes generated with older versions? 

We have enough problems working with released versions of protobuf breaking 
across minor point releases; google's guava JARs are already a recurrent source 
of cross-version compatibility pain.


I would rather stab myself in the leg with a fork —repeatedly— than adopt 
something based on a beta release of a google artifact on the critical path of 
the Hadoop RPC chain.

While google are pretty obsessive about wire-format compatibility across 
languages and versions, we just can't trust google to maintain binary 
compatibility, primarily due to a build process which clean-builds everything 
from scratch. They don't have the same problem of trying to nudge things up 
across a loosely coupled set of projects, including those which still have 
requirements of JAR-sharing compatibility with older hadoop versions. Indeed, 
for those projects, being backwards compatible with Hadoop 1.x (no protobuf) is 
easier than working with Hadoop 2.205, purely due to that protobuf 
difference.


Even when protobuf 3.0 finally ships, we should hold back from adopting it for 
its current role until 3.1 comes out, so we can assess google's compatibility 
policy in the 3.x line.



RE: Introduce Apache Kerby to Hadoop

2016-02-27 Thread Zheng, Kai
Hi Haohui,

I'm glad to learn about GRPC, and it sounds cool. Suggesting that Hadoop 
IPC/RPC upgrade to GRPC is a proposal worth considering.

We haven't evaluated GRPC for the RPC encryption optimization question because 
that's a separate story. The two efforts don't overlap: even if we used GRPC, 
the RPC protocol messages would still need to go through the 
SASL/GSSAPI/Kerberos stack. What's desired here is not to re-implement any RPC 
layer, or the stack, but to optimize the stack, possibly by implementing and 
plugging in a new SASL or GSSAPI mechanism. Hope this clarification helps. 
Thanks.
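
To make "plugging in" concrete: on the Java side a new SASL mechanism is 
registered through a JCA security provider, roughly as sketched below. The 
mechanism name and factory classes are hypothetical illustrations, not part of 
HADOOP-12725:

    import java.security.Provider;
    import java.security.Security;

    // Minimal sketch: registering a hypothetical AES-NI accelerated SASL
    // mechanism so Sasl.createSaslClient()/createSaslServer() can find it.
    public final class AcceleratedSaslProvider extends Provider {
        public AcceleratedSaslProvider() {
            super("AcceleratedSasl", 1.0,
                  "Hypothetical AES-NI accelerated SASL mechanism");
            // Map the mechanism name to (hypothetical) factory classes
            // implementing javax.security.sasl.SaslClientFactory and
            // javax.security.sasl.SaslServerFactory.
            put("SaslClientFactory.GSSAPI-AESNI",
                "com.example.AcceleratedSaslClientFactory");
            put("SaslServerFactory.GSSAPI-AESNI",
                "com.example.AcceleratedSaslServerFactory");
        }

        public static void register() {
            Security.addProvider(new AcceleratedSaslProvider());
        }
    }

With such a provider registered, the RPC layer keeps calling the same 
javax.security.sasl entry points; only the negotiated mechanism changes.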

Regards,
Kai


RE: Introduce Apache Kerby to Hadoop

2016-02-27 Thread Zheng, Kai
Thanks Andrew for the update on the HBase side!

>> Throughput drops 3-4x, or worse.
Hopefully we can avoid much of the encryption overhead. We're prototyping a 
solution for that.
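
For reference, auth-conf is what gets negotiated when a cluster sets RPC 
protection to privacy; this is the standard core-site.xml setting, shown only 
to make clear where the overhead under discussion enters:

    <!-- core-site.xml -->
    <property>
      <name>hadoop.rpc.protection</name>
      <!-- authentication (auth) | integrity (auth-int) | privacy (auth-conf) -->
      <value>privacy</value>
    </property>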

Regards,
Kai


Re: Introduce Apache Kerby to Hadoop

2016-02-27 Thread Haohui Mai
Have we evaluated GRPC? A robust RPC layer requires significant effort.
Migrating to GRPC could save us a lot of headache.

Haohui

Re: Introduce Apache Kerby to Hadoop

2016-02-27 Thread Andrew Purtell
I get excited thinking about the prospect of better performance with 
auth-conf QoP. HBase RPC is an increasingly distant fork but still close enough 
to Hadoop in that respect. Our bulk data transfer protocol isn't a separate 
thing like in HDFS, which avoids a SASL-wrapped implementation, so we really 
suffer when auth-conf is negotiated. You'll see the same impact wherever there 
is a high frequency of NameNode RPC calls or similar. Throughput drops 3-4x, 
or worse.
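
To illustrate where that cost comes from, here is a minimal sketch against the 
JDK's own SASL API (the protocol and server names are placeholders): once 
auth-conf is negotiated, every RPC payload must pass through wrap()/unwrap(), 
and that per-message encryption is what eats the throughput.

    import java.util.HashMap;
    import java.util.Map;
    import javax.security.sasl.Sasl;
    import javax.security.sasl.SaslClient;

    public class QopOverheadSketch {
        public static void main(String[] args) throws Exception {
            Map<String, Object> props = new HashMap<>();
            props.put(Sasl.QOP, "auth-conf"); // request confidentiality

            // Placeholder protocol/server names; GSSAPI ships with the JDK.
            SaslClient client = Sasl.createSaslClient(
                    new String[] {"GSSAPI"}, null, "hdfs", "nn.example.com",
                    props, null);

            // After a successful handshake (an evaluateChallenge() loop run
            // under a Kerberos JAAS login context, elided here), every
            // outgoing payload must pass through
            //     byte[] onTheWire = client.wrap(payload, 0, payload.length);
            // and every incoming one through client.unwrap(...).
            client.dispose();
        }
    }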


RE: Introduce Apache Kerby to Hadoop

2016-02-22 Thread Zheng, Kai
Thanks Larry for your thoughts and inputs.

>> Replacing MiniKDC with kerby certainly makes sense.
Thanks.

>> Kerby-izing Hadoop 3 needs to be defined carefully.
Fully agree. We're still working to bring the relevant Kerberos support to the 
ideal state, either in the Kerby project or outside of it. When the time is 
right, we can think about next steps, come up with a design, and discuss it 
then. Maybe we can discuss these inputs separately after the initial work is 
done?

Regards,
Kai


Re: Introduce Apache Kerby to Hadoop

2016-02-22 Thread larry mccay
Replacing MiniKDC with kerby certainly makes sense.

Kerby-izing Hadoop 3 needs to be defined carefully.
As much of a JWT proponent as I am, I don't know that taking up
non-standard features such as the JWT token would necessarily serve us well.
If we are talking about client-side-only uptake in Hadoop 3, as a more
diagnosable client library, that completely makes sense.

Better algorithms and APIs would require server-side compliance as well -
no?
These decisions would need to align with deployment use cases that want to go
directly to AD/MIT.
Perhaps it just means careful configuration of algorithms to match the
server side in those cases.

+1 on the baby step of replacing MiniKDC - as this is really just alignment
with the directory project roadmap anyway.
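
On the "configuration of algorithms" point: in practice that mostly means 
pinning encryption types in krb5.conf to what the AD/MIT KDC actually offers. 
A minimal illustrative fragment (the enctype choices are an example, not a 
recommendation):

    [libdefaults]
        default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
        default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
        permitted_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96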


Re: Introduce Apache Kerby to Hadoop

2016-02-22 Thread Steve Loughran


I've discussed this offline with Kai, as part of the "let's fix kerberos" 
project. Not only is it a better Kerberos engine; we can also do more 
diagnostics, get better algorithms, and ultimately get better APIs for doing 
Kerberos and SASL —the latter would dramatically reduce the cost of 
wire-encrypting IPC.

For now, I'd like to see basic steps -upgrading MiniKDC to Kerby- and see how 
it works.

Long term, I'd like Hadoop 3 to be Kerby-ized.
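
A sketch of what that MiniKDC upgrade could look like against Kerby's 
SimpleKdcServer (method names follow the Kerby 1.x test-KDC API; the port, 
paths, and principal are illustrative):

    import java.io.File;
    import org.apache.kerby.kerberos.kerb.server.SimpleKdcServer;

    public class KerbyMiniKdcSketch {
        public static void main(String[] args) throws Exception {
            SimpleKdcServer kdc = new SimpleKdcServer();
            kdc.setKdcHost("localhost");
            kdc.setKdcTcpPort(60088);   // illustrative test port
            kdc.setAllowUdp(false);
            kdc.setWorkDir(new File("target/kdc"));
            kdc.init();
            kdc.start();

            // What MiniKDC does today, expressed against Kerby: create a
            // test principal and export it to a keytab for tests to use.
            File keytab = new File("target/kdc/test.keytab");
            kdc.createAndExportPrincipals(keytab, "hdfs/localhost");

            // ... run Kerberos-dependent tests here ...

            kdc.stop();
        }
    }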




Introduce Apache Kerby to Hadoop

2016-02-21 Thread Zheng, Kai
Hi folks,

I'd like to mention Apache Kerby [1], a sub-project of the Apache Directory 
project, here to the community and propose introducing it to Hadoop.

Apache Kerby is a Kerberos-centric project that aims to provide the first Java 
Kerberos library with both client and server support. The relevant features 
include:
Full Kerberos encryption types, aligned with both MIT KDC and MS AD;
Client APIs that allow login via password, credential cache, keytab file, etc. 
(see the sketch below);
Utilities to generate, operate on, and inspect keytab and credential cache files;
A simple KDC server that borrows some ideas from Hadoop-MiniKDC and can be used 
in tests with minimal overhead in external dependencies;
A brand-new token mechanism (experimental) that allows a JWT token to be 
exchanged for a TGT or service ticket;
Anonymous PKINIT support (experimental), making Kerby the first Java library to 
support this major Kerberos extension.
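
A sketch of those client APIs (method names follow Kerby's examples of this 
era; exact signatures and package layout vary between the RC releases, so 
treat them as assumptions):

    import java.io.File;
    import org.apache.kerby.kerberos.kerb.client.KrbClient;
    import org.apache.kerby.kerberos.kerb.type.ticket.SgtTicket;
    import org.apache.kerby.kerberos.kerb.type.ticket.TgtTicket;

    public class KerbyLoginSketch {
        public static void main(String[] args) throws Exception {
            // Reads krb5.conf from the given directory (path illustrative).
            KrbClient krbClient = new KrbClient(new File("/etc/kerby"));
            krbClient.init();

            // Login with a password to obtain a TGT...
            TgtTicket tgt = krbClient.requestTgt("alice@EXAMPLE.COM", "secret");
            // ...then use the TGT to obtain a service ticket.
            SgtTicket sgt = krbClient.requestSgt(tgt,
                    "hdfs/nn.example.com@EXAMPLE.COM");

            // Tickets can be stored into a credential cache file.
            krbClient.storeTicket(tgt, new File("/tmp/krb5cc_alice"));
        }
    }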

The project stands alone and depends only on the JRE, for easier usage. It has 
made its first release (1.0.0-RC1), and a second release (RC2) is upcoming.


As an initial step, this proposal suggests using Apache Kerby to upgrade the 
existing ApacheDS-related code for Kerberos support. The advantages:

1. The kerby-kerb library is all that is needed; it is purely Java, SLF4J is 
its only dependency, and the whole library is rather small;

2. There is a SimpleKDC in the library for test usage, which borrowed the 
MiniKDC idea and implements all the support existing in MiniKDC. We had a POC 
that rewrote MiniKDC using Kerby SimpleKDC, and it works fine;

3. Full Kerberos encryption types (many of which are not available in the JRE 
but are supported by major Kerberos vendors) and more functionality, such as 
credential cache support;

4. Perhaps the biggest concern: Hadoop MiniKDC and friends depend on the old 
Kerberos implementation in the Directory Server project, but that 
implementation is no longer maintained. The Directory project plans to replace 
it with Kerby, and MiniKDC can use Kerby directly to simplify the dependencies;

5. Extensively tested with all kinds of unit tests, and already in use for some 
time (like PSU), even in production environments;

6. Actively developed, and can be fixed and released in a timely manner when 
necessary, separately and independently from other components in the Apache 
Directory project. By actively developing Apache Kerby and now applying it to 
Hadoop, we wish to make Kerberos deployment, troubleshooting, and further 
enhancement much easier.



We hope this is a good beginning, and that eventually Apache Kerby can benefit 
other projects in the ecosystem as well.



This Kerberos-related work is actually a long-term effort led by Weihua Jiang 
at Intel, and it has been kindly encouraged by Andrew Purtell, Steve Loughran, 
Gangumalla Uma, Andrew Wang, and others; thanks a lot for their great 
discussions and input in the past.



Your feedback is very welcome. Thanks in advance.



[1] https://github.com/apache/directory-kerby



Regards,

Kai