Hi Jay,

Thanks for putting together a spec for security.

Joe,

It looks like the "Securing zookeeper.." part has been deleted from the
assumptions section. Communication with ZooKeeper needs to be secured as well
to make the entire Kafka cluster secure. It may or may not require changes to
Kafka, but it would be good to have it in the spec.

I could not find a link to edit the page after logging in to the wiki. Do I
need any special permission to make edits?

Thanks,
Raja.


On Wed, Jun 4, 2014 at 8:57 PM, Joe Stein <joe.st...@stealth.ly> wrote:

> I like the idea of working on the spec and prioritizing. I will update the
> wiki.
>
> - Joestein
>
>
> On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
> > Hey Joe,
> >
> > Thanks for kicking this discussion off! I totally agree that for something
> > that acts as a central message broker, security is a critical feature. I
> > think a number of people have been interested in this topic and several
> > people have put effort into special-purpose security efforts.
> >
> > Since most of the LinkedIn folks are working on the consumer right now, I
> > think this would be a great project for any other interested people to take
> > on. There are some challenges in doing these things in a distributed
> > setting, but it can also be a lot of fun.
> >
> > I think a good first step would be to get a written plan we can all agree
> > on for how things should work. Then we can break things down into chunks
> > that can be done independently while still aiming at a good end state.
> >
> > I had tried to write up some notes that summarized at least the thoughts I
> > had had on security:
> > https://cwiki.apache.org/confluence/display/KAFKA/Security
> >
> > What do you think of that?
> >
> > One assumption I had (which may be incorrect) is that although we want all
> > the things in your list, the two most pressing would be authentication and
> > authorization, and that was all that write-up covered. You have more
> > experience in this domain, so I wonder how you would prioritize?
> >
> > Those notes are really sketchy, so I think the first goal I would have
> > would be to get to a real spec we can all agree on and discuss. A lot of
> > the security stuff has a high human interaction element and needs to work
> > in pretty different domains and different companies, so getting this kind
> > of review is important.
> >
> > -Jay
> >
> >
> > On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein <joe.st...@stealth.ly> wrote:
> >
> > > Hi, I wanted to re-ignite the discussion around Apache Kafka security.
> > > This is a huge bottleneck (a non-starter in some cases) for a lot of
> > > organizations (due to regulatory, compliance and other requirements).
> > > Below are my suggestions for specific changes in Kafka to accommodate
> > > security requirements. This comes from what folks are doing "in the
> > > wild" to work around and implement security with Kafka as it is today,
> > > and also from what I have discovered from organizations about their
> > > blockers. It also picks up from the wiki (which I should have time to
> > > update later in the week based on the below and feedback from the thread).
> > >
> > > 1) Transport Layer Security (i.e. SSL)
> > >
> > > This also includes client authentication in addition to the in-transit
> > > security layer. This work has been picked up here
> > > https://issues.apache.org/jira/browse/KAFKA-1477 and I would appreciate
> > > any thoughts, comments, feedback, tomatoes, whatever for this patch. It
> > > is a pickup from the fork of the work first done here
> > > https://github.com/relango/kafka/tree/kafka_security.
> > >
> > > 2) Data encryption at rest.
> > >
> > > This is very important and something that can be facilitated within the
> > > wire protocol. It requires an additional map data structure for the
> > > "encrypted [data encryption key]". With this map (either in your object
> > > or in the wire protocol) you can store the dynamically generated
> > > symmetric key (one for each message) and then encrypt the data using that
> > > dynamically generated key. You then encrypt the encryption key using the
> > > public key of each party expected to be able to decrypt the encryption
> > > key and then decrypt the message. Each public-key-encrypted symmetric key
> > > (which is now the "encrypted [data encryption key]") is stored along with
> > > the public key it was encrypted with (so a map of [publicKey] =
> > > encryptedDataEncryptionKey) as a chain. Other patterns can be implemented,
> > > but this is a pretty standard digital enveloping [0] pattern with only 1
> > > field added. Other patterns should be able to use that field to do their
> > > implementation too.
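
To make the digital envelope in (2) concrete, here is a minimal sketch using only the JDK's built-in crypto. Everything in it is an assumption for illustration: the class and method names are hypothetical (nothing like this exists in Kafka), the map is keyed by a recipient key id string rather than by the public key itself, and AES-GCM plus RSA-OAEP are just one reasonable choice of algorithms.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of a digital envelope; none of these names exist in Kafka.
class EnvelopeSketch {
    static final String RSA_WRAP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    static final String AES_GCM = "AES/GCM/NoPadding";

    // One message: the ciphertext plus the "encrypted [data encryption key]" map.
    static final class Envelope {
        final byte[] iv;
        final byte[] ciphertext;
        // Assumed layout: recipient key id -> data key encrypted with that recipient's public key.
        final Map<String, byte[]> encryptedDataKeys = new LinkedHashMap<>();

        Envelope(byte[] iv, byte[] ciphertext) {
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the message with a fresh symmetric key, then wraps that key once per recipient.
    static Envelope seal(byte[] plaintext, Map<String, PublicKey> recipients) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey dataKey = keyGen.generateKey();

            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance(AES_GCM);
            aes.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(128, iv));
            Envelope envelope = new Envelope(iv, aes.doFinal(plaintext));

            for (Map.Entry<String, PublicKey> recipient : recipients.entrySet()) {
                Cipher rsa = Cipher.getInstance(RSA_WRAP);
                rsa.init(Cipher.ENCRYPT_MODE, recipient.getValue());
                envelope.encryptedDataKeys.put(recipient.getKey(), rsa.doFinal(dataKey.getEncoded()));
            }
            return envelope;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Unwraps the data key with the recipient's private key, then decrypts the message.
    static byte[] open(Envelope envelope, String keyId, PrivateKey privateKey) {
        byte[] wrapped = envelope.encryptedDataKeys.get(keyId);
        if (wrapped == null) {
            throw new IllegalArgumentException("no encrypted data key for " + keyId);
        }
        try {
            Cipher rsa = Cipher.getInstance(RSA_WRAP);
            rsa.init(Cipher.DECRYPT_MODE, privateKey);
            SecretKey dataKey = new SecretKeySpec(rsa.doFinal(wrapped), "AES");

            Cipher aes = Cipher.getInstance(AES_GCM);
            aes.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(128, envelope.iv));
            return aes.doFinal(envelope.ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair alice = newRsaKeyPair();
        Map<String, PublicKey> recipients = new LinkedHashMap<>();
        recipients.put("alice", alice.getPublic());
        Envelope envelope = seal("hello".getBytes(StandardCharsets.UTF_8), recipients);
        byte[] opened = open(envelope, "alice", alice.getPrivate());
        System.out.println(new String(opened, StandardCharsets.UTF_8));
    }
}
```

The broker would never need the private keys in this arrangement: it only stores and forwards the ciphertext and the map, which is why a single extra field is enough.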
> > >
> > > 3) Non-repudiation and long term non-repudiation.
> > >
> > > Non-repudiation is proving who produced the data and that it hasn't
> > > changed since. This is often (if not always) done with x509 public
> > > certificates (chained to a certificate authority).
> > >
> > > Long term non-repudiation addresses what happens when the certificates
> > > of the certificate authority are expired (or revoked) and everything ever
> > > signed (ever) with that certificate's key pair then becomes "no longer
> > > provable as ever being authentic". That is where RFC3126 [1] and RFC3161
> > > [2] come in (or worm drives [hardware], etc).
> > >
> > > For either (or both) of these it is an operation of the encryptor to
> > > sign/hash the data (with or without a third-party trusted timestamp of
> > > the signing event), encrypt that with their own private key, and
> > > distribute the results (before and after encrypting if required) along
> > > with their public key. This structure is a bit more complex but feasible:
> > > it is a map of digital signature formats and the chain of dig sig
> > > attestations. The map's key is the method (e.g. CRC32, PKCS7 [3],
> > > XmlDigSig [4]) and the value is a list of maps where the key is the
> > > "purpose" of the signature (what you're attesting to). As a sibling field
> > > to the list there is another field for "the attester" as bytes (e.g.
> > > their PKCS12 [5] for the map of PKCS7 signatures).
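
One possible shape for the signature structure in (3), as a rough sketch only. The names are hypothetical, and since the JDK has no public PKCS7 API, a raw "SHA256withRSA" signature stands in for the PKCS7 method here; the attester bytes are assumed to be the signer's X.509-encoded public key rather than a PKCS12 bundle.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.X509EncodedKeySpec;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical sketch: method -> (list of purpose -> signature maps, plus attester bytes).
class AttestationSketch {
    static final String RSA_METHOD = "SHA256withRSA"; // stand-in for PKCS7 (assumption)
    static final String CRC_METHOD = "CRC32";

    static final class MethodEntry {
        final List<Map<String, byte[]>> attestations = new ArrayList<>();
        byte[] attester; // sibling field: who attested, as bytes
    }

    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // A checksum only detects accidental change; it proves nothing about who wrote the data.
    static void attestCrc32(Map<String, MethodEntry> signatures, String purpose, byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        byte[] value = ByteBuffer.allocate(Long.BYTES).putLong(crc.getValue()).array();
        signatures.computeIfAbsent(CRC_METHOD, k -> new MethodEntry())
                .attestations.add(Collections.singletonMap(purpose, value));
    }

    static void attestRsa(Map<String, MethodEntry> signatures, String purpose,
                          byte[] data, KeyPair signer) {
        try {
            Signature signature = Signature.getInstance(RSA_METHOD);
            signature.initSign(signer.getPrivate());
            signature.update(data);
            MethodEntry entry = signatures.computeIfAbsent(RSA_METHOD, k -> new MethodEntry());
            entry.attester = signer.getPublic().getEncoded();
            entry.attestations.add(Collections.singletonMap(purpose, signature.sign()));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Checks the signature against the embedded attester key. A real verifier would also
    // have to chain that key to a trusted certificate authority; this sketch does not.
    static boolean verifyRsa(Map<String, MethodEntry> signatures, String purpose, byte[] data) {
        MethodEntry entry = signatures.get(RSA_METHOD);
        if (entry == null || entry.attester == null) {
            return false;
        }
        try {
            PublicKey key = KeyFactory.getInstance("RSA")
                    .generatePublic(new X509EncodedKeySpec(entry.attester));
            for (Map<String, byte[]> attestation : entry.attestations) {
                byte[] sig = attestation.get(purpose);
                if (sig == null) {
                    continue;
                }
                Signature signature = Signature.getInstance(RSA_METHOD);
                signature.initVerify(key);
                signature.update(data);
                if (signature.verify(sig)) {
                    return true;
                }
            }
            return false;
        } catch (GeneralSecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, MethodEntry> signatures = new LinkedHashMap<>();
        byte[] data = "payload".getBytes(StandardCharsets.UTF_8);
        attestCrc32(signatures, "integrity", data);
        attestRsa(signatures, "origin", data, newRsaKeyPair());
        System.out.println(signatures.keySet() + " verified=" + verifyRsa(signatures, "origin", data));
    }
}
```

A trusted timestamp (RFC3161) would presumably become one more attestation in the list, but that needs a timestamp authority and is left out here.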
> > >
> > > 4) Authorization
> > >
> > > We should have a policy of "404" for data, topics, partitions (etc) if
> > > authenticated connections do not have access. In "secure mode" any
> > > non-authenticated connections should get a "404" type message on
> > > everything. Knowing "something is there" is a security risk in many use
> > > cases. So if you don't have access you don't even see it. Baking "that"
> > > into Kafka along with some interface for entitlement (access management)
> > > systems (pretty standard) is all that I think needs to be done to the
> > > core project. I want to tackle this item later in the year, after summer,
> > > after the other three are complete.
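
The "404" policy in (4) could look roughly like the sketch below. The `Entitlements` interface and every other name are hypothetical stand-ins for whatever access management hook ends up in the spec; a null principal is assumed to mean an unauthenticated connection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of "if you don't have access you don't even see it".
class HiddenTopicsSketch {
    // Assumed plug-in point for an external entitlement (access management) system.
    interface Entitlements {
        boolean canRead(String principal, String topic);
    }

    enum Result { OK, NOT_FOUND }

    private final Set<String> topics;
    private final Entitlements entitlements;

    HiddenTopicsSketch(Set<String> topics, Entitlements entitlements) {
        this.topics = new TreeSet<>(topics);
        this.entitlements = entitlements;
    }

    // Missing, unauthorized and unauthenticated all produce the same answer,
    // so a caller cannot tell "exists but forbidden" from "does not exist".
    Result lookup(String principal, String topic) {
        boolean visible = principal != null
                && topics.contains(topic)
                && entitlements.canRead(principal, topic);
        return visible ? Result.OK : Result.NOT_FOUND;
    }

    // Listings are filtered the same way; unauthenticated callers see nothing.
    List<String> listTopics(String principal) {
        List<String> visible = new ArrayList<>();
        if (principal == null) {
            return visible;
        }
        for (String topic : topics) {
            if (entitlements.canRead(principal, topic)) {
                visible.add(topic);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Entitlements acl = (principal, topic) ->
                "alice".equals(principal) && "payments".equals(topic);
        HiddenTopicsSketch broker =
                new HiddenTopicsSketch(new HashSet<>(Arrays.asList("payments", "logs")), acl);
        System.out.println(broker.lookup("alice", "payments"));
        System.out.println(broker.lookup("alice", "logs"));
        System.out.println(broker.lookup(null, "payments"));
    }
}
```

Returning one indistinguishable result for all three cases is the point of the design; a separate "forbidden" error would leak that the topic exists.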
> > >
> > > I look forward to thoughts on this and anyone else interested in working
> > > with us on these items.
> > >
> > > [0] http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/what-is-a-digital-envelope.htm
> > > [1] http://tools.ietf.org/html/rfc3126
> > > [2] http://tools.ietf.org/html/rfc3161
> > > [3] http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/pkcs-7-cryptographic-message-syntax-standar.htm
> > > [4] http://en.wikipedia.org/wiki/XML_Signature
> > > [5] http://en.wikipedia.org/wiki/PKCS_12
> > >
> > > /*******************************************
> > >  Joe Stein
> > >  Founder, Principal Consultant
> > >  Big Data Open Source Security LLC
> > >  http://www.stealth.ly
> > >  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > ********************************************/
> > >
> >
>



-- 
Thanks,
Raja.
