Re: [DISCUSS] Kafka Security Specific Features

Rob Withers Fri, 06 Jun 2014 10:50:03 -0700

On consideration, if we have 3 different access groups (1 for
production WRITE and 2 consumers) they all need to decode the same
encryption and so all need the same public/private key. Certs won't
work, unless you write a CertAuthority to build multiple certs with
the same keys. It seems better not to use certs and instead wrap the
encryption specification with ACL capabilities for each access group.


On Jun 6, 2014, at 11:43 AM, Rob Withers wrote:

This is quite interesting to me and it is an excellent opportunity to
promote a slightly different security scheme. Object-capabilities are
perfect for online security and would use ACL-style authentication to
gain capabilities filtered to the allowed resources for the allowed
actions (READ/WRITE/DELETE/LIST/SCAN). Erights.org has the
quintessential object capabilities model and capnproto is implementing
this for C++. I have a Java implementation at
http://github.com/pauwau/pauwau but the master is broken. 0.2 works,
basically. Basically a TLS connection with no certificate server, it
is peer to peer. It has some advanced features, but the linking of
capabilities with authorization so that you can only invoke correct
services is extended to the secure user.

Regarding non-repudiation, on disk, why not prepend a CRC?

Regarding on-disk encryption, multiple users/groups may need to
access, with different capabilities. Sounds like zookeeper needs to
store a cert for each class of access so that a group member can
access the decrypted data from disk. Use cert-based async decryption.
The only issue is storing the private key in zookeeper. Perhaps some
hash magic could be used.

Thanks for kafka,
Rob

On Jun 5, 2014, at 3:01 PM, Jay Kreps wrote:
Hey Joe,

I don't really understand the sections you added to the wiki. Can you
clarify them?

Is non-repudiation what SASL would call integrity checks? If so, don't
SSL and many of the SASL schemes already support this as well as
on-the-wire encryption?

Or are you proposing an on-disk encryption scheme? Is this actually
needed? Isn't on-the-wire encryption, when combined with mutual
authentication and permissions, sufficient for most uses?

On-disk encryption seems unnecessary because if an attacker can get
root on the kafka boxes it can potentially modify Kafka to do anything
he or she wants with data. So this seems to break any security model.

I understand the problem of a large organization not really having a
trusted network and wanting to secure data transfer and limit and
audit data access. The uses for these other things I don't totally
understand.

Also it would be worth understanding the state of other messaging and
storage systems (Hadoop, dbs, etc). What features do they support? I
think there is a sense in which you don't have to run faster than the
bear, but only faster than your friends. :-)

-Jay
On Wed, Jun 4, 2014 at 5:57 PM, Joe Stein <[email protected]> wrote:
I like the idea of working on the spec and prioritizing. I will update
the wiki.

- Joestein
On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps <[email protected]> wrote:
Hey Joe,

Thanks for kicking this discussion off! I totally agree that for
something that acts as a central message broker security is a critical
feature. I think a number of people have been interested in this topic
and several people have put effort into special purpose security
efforts.

Since most of the LinkedIn folks are working on the consumer right now
I think this would be a great project for any other interested people
to take on. There are some challenges in doing these things
distributed but it can also be a lot of fun.

I think a good first step would be to get a written plan we can all
agree on for how things should work. Then we can break things down
into chunks that can be done independently while still aiming at a
good end state.

I had tried to write up some notes that summarized at least the
thoughts I had had on security:
https://cwiki.apache.org/confluence/display/KAFKA/Security

What do you think of that?

One assumption I had (which may be incorrect) is that although we want
all the things in your list, the two most pressing would be
authentication and authorization, and that was all that write-up
covered. You have more experience in this domain, so I wonder how you
would prioritize?

Those notes are really sketchy, so I think the first goal I would have
would be to get to a real spec we can all agree on and discuss. A lot
of the security stuff has a high human interaction element and needs
to work in pretty different domains and different companies so getting
this kind of review is important.

-Jay
On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein <[email protected]> wrote:
Hi, I wanted to re-ignite the discussion around Apache Kafka Security.
This is a huge bottleneck (non-starter in some cases) for a lot of
organizations (due to regulatory, compliance and other requirements).
Below are my suggestions for specific changes in Kafka to accommodate
security requirements. This comes from what folks are doing "in the
wild" to work around and implement security with Kafka as it is today
and also what I have discovered from organizations about their
blockers. It also picks up from the wiki (which I should have time to
update later in the week based on the below and feedback from the
thread).

1) Transport Layer Security (i.e. SSL)

This also includes client authentication in addition to the in-transit
security layer. This work has been picked up here
https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate
any thoughts, comments, feedback, tomatoes, whatever for this patch.
It is a pickup from the fork of the work first done here
https://github.com/relango/kafka/tree/kafka_security.

2) Data encryption at rest.

This is very important and something that can be facilitated within
the wire protocol. It requires an additional map data structure for
the "encrypted [data encryption key]". With this map (either in your
object or in the wire protocol) you can store the dynamically
generated symmetric key (for each message) and then encrypt the data
using that dynamically generated key. You then encrypt the encryption
key using the public key of each party that is expected to be able to
decrypt the encryption key and then decrypt the message. Each
public-key-encrypted symmetric key (which is now the "encrypted [data
encryption key]") is stored along with the public key it was encrypted
with (so a map of [publicKey] = encryptedDataEncryptionKey), as a
chain. Other patterns can be implemented but this is a pretty standard
digital enveloping [0] pattern with only 1 field added. Other patterns
should be able to use that field to do their implementation too.
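
A minimal sketch of the digital-envelope map described above, using only the JDK's javax.crypto. The class name EnvelopeSketch, the Envelope container, the keyId helper and the cipher choices (AES-GCM for the data, RSA-OAEP for wrapping the key) are assumptions for illustration; nothing here is part of Kafka or of the proposed wire format.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EnvelopeSketch {
    private static final String WRAP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Hypothetical container: ciphertext plus the [publicKey] = encryptedDataEncryptionKey map. */
    static final class Envelope {
        final byte[] iv;
        final byte[] ciphertext;
        final Map<String, byte[]> encryptedDataKeys;

        Envelope(byte[] iv, byte[] ciphertext, Map<String, byte[]> encryptedDataKeys) {
            this.iv = iv;
            this.ciphertext = ciphertext;
            this.encryptedDataKeys = encryptedDataKeys;
        }
    }

    /** Map key for a recipient: the Base64 of the encoded public key (an assumption). */
    static String keyId(PublicKey key) {
        return Base64.getEncoder().encodeToString(key.getEncoded());
    }

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts the message with a fresh per-message key, then wraps that key once per recipient. */
    static Envelope seal(byte[] message, List<PublicKey> recipients) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey dataKey = keyGen.generateKey();

            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(128, iv));
            byte[] ciphertext = aes.doFinal(message);

            Map<String, byte[]> wrapped = new LinkedHashMap<>();
            for (PublicKey recipient : recipients) {
                Cipher rsa = Cipher.getInstance(WRAP);
                rsa.init(Cipher.ENCRYPT_MODE, recipient);
                wrapped.put(keyId(recipient), rsa.doFinal(dataKey.getEncoded()));
            }
            return new Envelope(iv, ciphertext, wrapped);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Looks up the recipient's wrapped key, unwraps it, and decrypts the message. */
    static byte[] open(Envelope envelope, KeyPair recipient) {
        byte[] wrappedKey = envelope.encryptedDataKeys.get(keyId(recipient.getPublic()));
        if (wrappedKey == null) {
            throw new IllegalArgumentException("no wrapped key for this recipient");
        }
        try {
            Cipher rsa = Cipher.getInstance(WRAP);
            rsa.init(Cipher.DECRYPT_MODE, recipient.getPrivate());
            SecretKey dataKey = new SecretKeySpec(rsa.doFinal(wrappedKey), "AES");

            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(128, envelope.iv));
            return aes.doFinal(envelope.ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair producerGroup = newKeyPair();
        KeyPair consumerGroup = newKeyPair();
        byte[] message = "payload".getBytes(StandardCharsets.UTF_8);
        Envelope envelope = seal(message, Arrays.asList(producerGroup.getPublic(), consumerGroup.getPublic()));
        System.out.println(new String(open(envelope, consumerGroup), StandardCharsets.UTF_8));
    }
}
```

Each access group keeps its own key pair in this sketch, so adding or removing a reader only changes the entries of the map, not the encrypted payload.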

3) Non-repudiation and long term non-repudiation.

Non-repudiation is proving data hasn't changed. This is often (if not
always) done with x509 public certificates (chained to a certificate
authority).

Long term non-repudiation is what happens when the certificates of the
certificate authority are expired (or revoked) and everything ever
signed (ever) with that certificate's public key then becomes "no
longer provable as ever being authentic". That is where RFC3126 [1]
and RFC3161 [2] come in (or worm drives [hardware], etc).

For either (or both) of these it is an operation of the encryptor to
sign/hash the data (with or without a third party trusted timestamp of
the signing event) and encrypt that with their own private key and
distribute the results (before and after encrypting if required) along
with their public key. This structure is a bit more complex but
feasible: it is a map of digital signature formats and the chain of
dig sig attestations. The map's key is the method (i.e. CRC32, PKCS7
[3], XmlDigSig [4]) and the value is a list of maps where the key is
the "purpose" of the signature (what you're attesting to). As a
sibling field to the list there is another field for "the attester" as
bytes (e.g. their PKCS12 [5] for the map of PKCS7 signatures).
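
A rough sketch of that attestation structure, under stated assumptions: the JDK has no public PKCS7 API, so a plain SHA256withRSA signature from java.security.Signature stands in for the PKCS7 entry, and the attester bytes are the signer's encoded public key rather than a PKCS12 bundle. AttestationSketch, MethodEntry and the purpose string are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.X509EncodedKeySpec;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.CRC32;

final class AttestationSketch {
    /** Value stored per method: a list of purpose -> signature maps, plus the attester bytes. */
    static final class MethodEntry {
        final List<Map<String, byte[]>> signatures = new ArrayList<>();
        final byte[] attester;

        MethodEntry(byte[] attester) {
            this.attester = attester;
        }
    }

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** CRC32 only detects accidental corruption; it carries no proof of origin. */
    static byte[] crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return ByteBuffer.allocate(Long.BYTES).putLong(crc.getValue()).array();
    }

    static byte[] sign(byte[] data, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Rebuilds the public key from the attester bytes and checks the signature. */
    static boolean verify(byte[] data, byte[] signature, byte[] attester) {
        try {
            PublicKey key = KeyFactory.getInstance("RSA").generatePublic(new X509EncodedKeySpec(attester));
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds the method -> entry map for one message and one purpose. */
    static Map<String, MethodEntry> attest(byte[] data, KeyPair signer, String purpose) {
        Map<String, MethodEntry> byMethod = new LinkedHashMap<>();

        MethodEntry checksum = new MethodEntry(new byte[0]); // a CRC has no attester
        checksum.signatures.add(Collections.singletonMap(purpose, crc32(data)));
        byMethod.put("CRC32", checksum);

        MethodEntry signed = new MethodEntry(signer.getPublic().getEncoded());
        signed.signatures.add(Collections.singletonMap(purpose, sign(data, signer.getPrivate())));
        byMethod.put("SHA256withRSA", signed);
        return byMethod;
    }
}
```

A consumer would pick the entry for a method it trusts, read the signature for the purpose it cares about, and verify it against the attester bytes. Trusted timestamps (RFC3161) are left out of this sketch.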

4) Authorization

We should have a policy of "404" for data, topics, partitions (etc) if
authenticated connections do not have access. In "secure mode" any
non-authenticated connections should get a "404" type message on
everything. Knowing "something is there" is a security risk in many
use cases. So if you don't have access you don't even see it. Baking
"that" into Kafka along with some interface for entitlement (access
management) systems (pretty standard) is all that I think needs to be
done to the core project. I want to tackle this item later in the
year, after summer, once the other three are complete.
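
The "404" policy can be sketched as follows, assuming a hypothetical Entitlements interface as the hook for an external access management system. HiddenTopicSketch, Status and the method names are illustrative only and are not Kafka APIs. The point is that a missing topic, a forbidden topic and an unauthenticated caller all produce the same answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class HiddenTopicSketch {
    /** Hypothetical hook for an external entitlement (access management) system. */
    interface Entitlements {
        boolean canRead(String principal, String topic);
    }

    enum Status { OK, NOT_FOUND }

    private final Set<String> topics;
    private final Entitlements entitlements;

    HiddenTopicSketch(Set<String> topics, Entitlements entitlements) {
        this.topics = topics;
        this.entitlements = entitlements;
    }

    /**
     * Returns NOT_FOUND both when the topic does not exist and when the caller
     * may not see it, so the response never reveals that "something is there".
     * A null principal models a non-authenticated connection in "secure mode".
     */
    Status describe(String principal, String topic) {
        boolean visible = principal != null
                && topics.contains(topic)
                && entitlements.canRead(principal, topic);
        return visible ? Status.OK : Status.NOT_FOUND;
    }

    /** Listing is filtered the same way: callers only ever see what they can access. */
    List<String> visibleTopics(String principal) {
        List<String> visible = new ArrayList<>();
        for (String topic : topics) {
            if (describe(principal, topic) == Status.OK) {
                visible.add(topic);
            }
        }
        Collections.sort(visible);
        return visible;
    }

    public static void main(String[] args) {
        Set<String> topics = new HashSet<>(Arrays.asList("payments", "clicks"));
        HiddenTopicSketch broker = new HiddenTopicSketch(
                topics, (principal, topic) -> principal.equals("alice") || topic.equals("clicks"));
        System.out.println(broker.describe("bob", "payments"));
        System.out.println(broker.visibleTopics("bob"));
    }
}
```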

I look forward to thoughts on this and anyone else interested in
working with us on these items.

[0] http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/what-is-a-digital-envelope.htm
[1] http://tools.ietf.org/html/rfc3126
[2] http://tools.ietf.org/html/rfc3161
[3] http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/pkcs-7-cryptographic-message-syntax-standar.htm
[4] http://en.wikipedia.org/wiki/XML_Signature
[5] http://en.wikipedia.org/wiki/PKCS_12

/*******************************************
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/
