Re: [DISCUSS] Kafka Security Specific Features

2014-07-22 Thread Pramod Deshmukh
Is anyone else getting this issue? Is it related to the environment, or is it
the code? The producer works fine when run in secure=false (no security) mode.


pdeshmukh$ bin/kafka-console-producer.sh --broker-list localhost:9092:true
--topic secureTopic

[2014-07-18 13:12:29,817] WARN Property topic is not valid
(kafka.utils.VerifiableProperties)

Hare Krishna

[2014-07-18 13:12:45,256] WARN Fetching topic metadata with correlation id
0 for topics [Set(secureTopic)] from broker
[id:0,host:localhost,port:9092,secure:true] failed
(kafka.client.ClientUtils$)

java.io.EOFException: Received -1 when reading from channel, socket has
likely been closed.

at kafka.utils.Utils$.read(Utils.scala:381)

at
kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:67)

at kafka.network.Receive$class.readCompletely(Transmission.scala:56)

at
kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)

at kafka.network.BlockingChannel.receive(BlockingChannel.scala:102)

at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:79)

at
kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:76)

at kafka.producer.SyncProducer.send(SyncProducer.scala:117)

at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:58)

at
kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)

at
kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:67)

at kafka.utils.Utils$.swallow(Utils.scala:172)

at kafka.utils.Logging$class.swallowError(Logging.scala:106)

at kafka.utils.Utils$.swallowError(Utils.scala:45)

at
kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:67)

at
kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:104)

at
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:87)

at
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:67)

at scala.collection.immutable.Stream.foreach(Stream.scala:526)

at
kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:66)

at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:44)

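For what it's worth, the EOFException above just means the broker closed the socket while the client was blocked in a read; a security-mode mismatch between producer and broker is one plausible cause. A minimal sketch of the same symptom with plain sockets (no Kafka involved, purely illustrative):

```python
import socket

# Sketch: "Received -1 when reading from channel" is Kafka's way of reporting
# end-of-stream, i.e. the peer closed the connection mid-request.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.socket()
client.connect(server.getsockname())

conn, _ = server.accept()
conn.close()             # the "broker" drops the connection

data = client.recv(4)    # EOF: recv returns b"", the Java analogue of -1
print(data == b"")       # True
client.close()
server.close()
```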


Re: [DISCUSS] Kafka Security Specific Features

2014-07-18 Thread Pramod Deshmukh
Thanks Joe, I no longer see the out-of-memory error. Now I get an exception
when the producer fetches metadata for a topic.

Here is how I created the topic and ran the producer:

pdeshmukh$ bin/kafka-topics.sh --create --zookeeper localhost:2181
--replication-factor 1 --partitions 1 --topic secureTopic
Created topic "secureTopic".

pdeshmukh$ bin/kafka-topics.sh --list --zookeeper localhost:2181

secure.test

secureTopic

>> Run the producer; tried both localhost:9092:true and localhost:9092

pdeshmukh$ bin/kafka-console-producer.sh --broker-list localhost:9092:true
--topic secureTopic

[2014-07-18 13:12:29,817] WARN Property topic is not valid
(kafka.utils.VerifiableProperties)

Hare Krishna

[2014-07-18 13:12:45,256] WARN Fetching topic metadata with correlation id
0 for topics [Set(secureTopic)] from broker
[id:0,host:localhost,port:9092,secure:true] failed
(kafka.client.ClientUtils$)


[2014-07-18 13:12:45,258] ERROR fetching topic metadata for topics
[Set(secureTopic)] from broker
[ArrayBuffer(id:0,host:localhost,port:9092,secure:true)] failed
(kafka.utils.Utils$)

kafka.common.KafkaException: fetching topic metadata for topics
[Set(secureTopic)] from broker
[ArrayBuffer(id:0,host:localhost,port:9092,secure:true)] failed

at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:72)

at
kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)

at
kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:67)

at kafka.utils.Utils$.swallow(Utils.scala:172)

at kafka.utils.Logging$class.swallowError(Logging.scala:106)

at kafka.utils.Utils$.swallowError(Utils.scala:45)

at
kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:67)

at
kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:104)

at
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:87)

at
kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:67)

at scala.collection.immutable.Stream.foreach(Stream.scala:526)

at
kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:66)

at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:44)

Caused by: java.io.EOFException: Received -1 when reading from channel,
socket has likely been closed.

at kafka.utils.Utils$.read(Utils.scala:381)

at
kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:67)

at kafka.network.Receive$class.readCompletely(Transmission.scala:56)

at
kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)

at kafka.network.BlockingChannel.receive(BlockingChannel.scala:102)

at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:79)

at
kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:76)

at kafka.producer.SyncProducer.send(SyncProducer.scala:117)

at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:58)

... 12 more
[2014-07-18 13:12:45,337] WARN Fetching topic metadata with correlation id
1 for topics [Set(secureTopic)] from broker
[id:0,host:localhost,port:9092,secure:true] failed
(kafka.client.ClientUtils$)

[2014-07-18 13:12:46,282] ERROR Failed to send requests for topics
secureTopic 

Re: [DISCUSS] Kafka Security Specific Features

2014-07-18 Thread Joe Stein
Hi Pramod,

Can you increase KAFKA_HEAP_OPTS to, say, -Xmx1G in
kafka-console-producer.sh to see if that gets you further along in your
testing?

Thanks!
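For reference, the bin/ scripts read the heap size from the KAFKA_HEAP_OPTS environment variable (via kafka-run-class.sh), so depending on the script version you may be able to override it from the environment instead of editing the script; broker list and topic below are copied from the earlier commands:

```shell
# Override the console producer's heap via the environment instead of
# editing the script; KAFKA_HEAP_OPTS is picked up by kafka-run-class.sh.
KAFKA_HEAP_OPTS="-Xmx1G" bin/kafka-console-producer.sh \
  --broker-list localhost:9092:true \
  --topic secureTopic
```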

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/



Re: [DISCUSS] Kafka Security Specific Features

2014-07-18 Thread Pramod Deshmukh
Hello Raja/Joe,
When I turn on security, I still get an out-of-memory error on the producer.
Is this something to do with the keys? Is there any other way I can connect
to the broker?

*producer log*
[2014-07-17 15:38:14,186] ERROR OOME with size 352518400 (kafka.network.
BoundedByteBufferReceive)
java.lang.OutOfMemoryError: Java heap space

*broker log*

INFO begin ssl handshake for localhost/127.0.0.1:50199//127.0.0.1:9092
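One hedged observation on the OOME: an allocation request of a few hundred MB inside BoundedByteBufferReceive usually means the 4-byte size prefix was decoded from bytes that are not a plaintext Kafka response, e.g. TLS record bytes, so the "size" is garbage. Illustrative only; the exact bytes on the wire here are unknown:

```python
import struct

# If a plaintext reader interprets the first 4 bytes of a TLS record
# (a ClientHello starts 0x16 0x03 ...) as a big-endian message-size prefix,
# the resulting "size" is enormous -- the same order of magnitude as the
# 352518400-byte allocation in the producer log above.
tls_like = bytes([0x16, 0x03, 0x01, 0x02, 0x00])
(bogus_size,) = struct.unpack(">i", tls_like[:4])
print(bogus_size)  # 369295618, i.e. ~369 MB
```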






Re: [DISCUSS] Kafka Security Specific Features

2014-07-17 Thread Pramod Deshmukh
Correct, I don't see any exceptions when I turn off security. The consumer is
able to consume the message.

I still see the warning for the topic property.

[2014-07-17 18:04:38,360] WARN Property topic is not valid
(kafka.utils.VerifiableProperties)






Re: [DISCUSS] Kafka Security Specific Features

2014-07-17 Thread Rajasekar Elango
Can you try turning off security, to check whether this error happens only in
secure mode?

Thanks,
Raja.





Re: [DISCUSS] Kafka Security Specific Features

2014-07-17 Thread Pramod Deshmukh
Thanks Raja, that was helpful.

Now I am able to start ZooKeeper and the broker in secure mode, ready for the
SSL handshake. I get *java.lang.OutOfMemoryError: Java heap space* on the
producer.

I am using the default configuration and keystore. Is there anything missing?

*Start broker:*

*bin/kafka-server-start.sh config/server.properties*



*broker.log:*

[2014-07-17 15:34:46,281] INFO zookeeper state changed (SyncConnected)
(org.I0Itec.zkclient.ZkClient)

[2014-07-17 15:34:46,523] INFO Loading log 'secure.test-0'
(kafka.log.LogManager)

[2014-07-17 15:34:46,558] INFO Recovering unflushed segment 0 in log
secure.test-0. (kafka.log.Log)

[2014-07-17 15:34:46,571] INFO Completed load of log secure.test-0 with log
end offset 0 (kafka.log.Log)

[2014-07-17 15:34:46,582] INFO Starting log cleanup with a period of 6
ms. (kafka.log.LogManager)

[2014-07-17 15:34:46,587] INFO Starting log flusher with a default period
of 9223372036854775807 ms. (kafka.log.LogManager)

[2014-07-17 15:34:46,614] INFO Initializing secure authentication
(kafka.network.security.SecureAuth$)

[2014-07-17 15:34:46,678] INFO Secure authentication initialization has
been successfully completed (kafka.network.security.SecureAuth$)

[2014-07-17 15:34:46,691] INFO Awaiting socket connections on 0.0.0.0:9092.
(kafka.network.Acceptor)

[2014-07-17 15:34:46,692] INFO [Socket Server on Broker 0], Started
(kafka.network.SocketServer)

[2014-07-17 15:34:46,794] INFO Will not load MX4J, mx4j-tools.jar is not in
the classpath (kafka.utils.Mx4jLoader$)

[2014-07-17 15:34:46,837] INFO 0 successfully elected as leader
(kafka.server.ZookeeperLeaderElector)

[2014-07-17 15:34:47,057] INFO Registered broker 0 at path /brokers/ids/0
with address 10.1.100.130:9092. (kafka.utils.ZkUtils$)

[2014-07-17 15:34:47,059] INFO New leader is 0
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)

*[2014-07-17 15:34:47,068] INFO [Kafka Server 0], started
(kafka.server.KafkaServer)*

*[2014-07-17 15:34:47,383] INFO begin ssl handshake for
/10.1.100.130:9092//10.1.100.130:51685 (kafka.network.security.SSLSocketChannel)*

*[2014-07-17 15:34:47,392] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51685//10.1.100.130:9092 (kafka.network.security.SSLSocketChannel)*

*[2014-07-17 15:34:47,465] INFO finished ssl handshake for
10.1.100.130/10.1.100.130:51685//10.1.100.130:9092 (kafka.network.security.SSLSocketChannel)*

*[2014-07-17 15:34:47,465] INFO finished ssl handshake for
/10.1.100.130:9092//10.1.100.130:51685 (kafka.network.security.SSLSocketChannel)*

*[2014-07-17 15:34:47,617] INFO [ReplicaFetcherManager on broker 0] Removed
fetcher for partitions  (kafka.server.ReplicaFetcherManager)*

*[2014-07-17 15:34:47,627] INFO [ReplicaFetcherManager on broker 0] Added
fetcher for partitions List() (kafka.server.ReplicaFetcherManager)*

*[2014-07-17 15:34:47,656] INFO [ReplicaFetcherManager on broker 0] Removed
fetcher for partitions [secure.test,0] (kafka.server.ReplicaFetcherManager)*

[2014-07-17 15:37:15,970] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51689//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)

[2014-07-17 15:37:16,075] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51690//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)

[2014-07-17 15:37:16,434] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51691//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)

[2014-07-17 15:37:16,530] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51692//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)

[2014-07-17 15:37:16,743] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51693//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)

[2014-07-17 15:37:16,834] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51694//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)

[2014-07-17 15:37:17,043] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51695//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)

[2014-07-17 15:37:17,137] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51696//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)

[2014-07-17 15:37:17,342] INFO begin ssl handshake for
10.1.100.130/10.1.100.130:51697//10.1.100.130:9092
(kafka.network.security.SSLSocketChannel)


*Start producer*

*bin/kafka-console-producer.sh --broker-list 10.1.100.130:9092:true --topic
secure.test*


*producer.log:*

bin/kafka-console-producer.sh --broker-list 10.1.100.130:9092:true --topic
secure.test

[2014-07-17 15:37:46,889] WARN Property topic is not valid
(kafka.utils.VerifiableProperties)

Hello Secure Kafka

*[2014-07-17 15:38:14,186] ERROR OOME with size 352518400
(kafka.network.BoundedByteBufferReceive)*

*java.lang.OutOfMemoryError: Java heap space*
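One plausible reading of that exact allocation size (a guess from the numbers, not something confirmed in the thread): the producer reads the first four bytes off the socket as a big-endian message-size prefix, and if the peer answers with a TLS record instead of a Kafka response, the record header parses as an enormous "size". The bytes 0x15 0x03 0x01 are the header of a TLS 1.0 alert record, and they decode to precisely the number in the log:

```python
# BoundedByteBufferReceive-style framing: read a 4-byte big-endian size
# prefix, then allocate a buffer of that size. If the bytes on the wire
# are actually a TLS record header (content type 0x15 = alert,
# version 0x03 0x01 = TLS 1.0), the "size" is huge and allocation OOMEs.
tls_alert_prefix = bytes([0x15, 0x03, 0x01, 0x00])
size = int.from_bytes(tls_alert_prefix, "big")
print(size)  # 352518400 -- the exact OOME size in the log above
```

This would mean one side of the connection is speaking TLS while the other expects the plaintext Kafka protocol.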

Re: [DISCUSS] Kafka Security Specific Features

2014-07-16 Thread Pramod Deshmukh
Thanks Joe for this,

I cloned this branch and tried to run zookeeper but I get

Error: Could not find or load main class
org.apache.zookeeper.server.quorum.QuorumPeerMain


I see scala version is still set to 2.8.0

if [ -z "$SCALA_VERSION" ]; then

SCALA_VERSION=2.8.0

fi



Then I installed sbt and scala and followed your instructions for different
scala versions. I was able to bring zookeeper up but brokers fail to start
with error

Error: Could not find or load main class kafka.Kafka

I think I am doing something wrong. Can you please help me?

Our current production setup is on 2.8.0 and we want to stick to it.

Thanks,

Pramod


On Tue, Jun 3, 2014 at 3:57 PM, Joe Stein  wrote:

> Hi, I wanted to re-ignite the discussion around Apache Kafka Security.  This
> is a huge bottleneck (non-starter in some cases) for a lot of organizations
> (due to regulatory, compliance and other requirements). Below are my
> suggestions for specific changes in Kafka to accommodate security
> requirements.  This comes from what folks are doing "in the wild" to
> work around and implement security with Kafka as it is today and also what I
> have discovered from organizations about their blockers. It also picks up
> from the wiki (which I should have time to update later in the week based
> on the below and feedback from the thread).
>
> 1) Transport Layer Security (i.e. SSL)
>
> This also includes client authentication in addition to the in-transit
> security layer.  This work has been picked up here
> https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate any
> thoughts, comments, feedback, tomatoes, whatever for this patch.  It is a
> pickup from the fork of the work first done here
> https://github.com/relango/kafka/tree/kafka_security.
>
> 2) Data encryption at rest.
>
> This is very important and something that can be facilitated within the
> wire protocol. It requires an additional map data structure for the
> "encrypted [data encryption key]". With this map (either in your object or
> in the wire protocol) you can store the dynamically generated symmetric key
> (for each message) and then encrypt the data using that dynamically
> generated key.  You then encrypt that key with the public key of each
> party expected to be able to decrypt it and, in turn, the message.  Each
> public-key-encrypted symmetric key (now the "encrypted [data encryption
> key]") is stored along with the public key it was encrypted with (so a map
> of [publicKey] = encryptedDataEncryptionKey), as a chain.  Other patterns
> can be implemented, but this is a pretty standard digital enveloping [0]
> pattern with only one field added. Other patterns should be able to use
> that field to do their implementation too.
>
> 3) Non-repudiation and long term non-repudiation.
>
> Non-repudiation is proving data hasn't changed.  This is often (if not
> always) done with x509 public certificates (chained to a certificate
> authority).
>
> Long term non-repudiation is what happens when the certificates of the
> certificate authority are expired (or revoked) and everything ever signed
> (ever) with that certificate's public key then becomes "no longer provable
> as ever being authentic".  That is where RFC3126 [1] and RFC3161 [2] come
> in (or worm drives [hardware], etc).
>
> For either (or both) of these it is an operation of the encryptor to
> sign/hash the data (with or without a third-party trusted timestamp of the
> signing event), encrypt that with their own private key, and distribute
> the results (before and after encrypting if required) along with their
> public key. This structure is a bit more complex but feasible: it is a map
> of digital signature formats and the chain of dig sig attestations.  The
> map's key is the method (e.g. CRC32, PKCS7 [3], XmlDigSig [4]), and the
> value is a list of maps whose key is the "purpose" of the signature (what
> you're attesting to).  As a sibling field to the list, another field holds
> "the attester" as bytes (e.g. their PKCS12 [5] for the map of PKCS7
> signatures).
>
> 4) Authorization
>
> We should have a policy of "404" for data, topics, partitions (etc) if
> authenticated connections do not have access.  In "secure mode" any non
> authenticated connections should get a "404" type message on everything.
> Knowing "something is there" is a security risk in many use cases.  So if
> you don't have access you don't even see it.  Baking "that" into Kafka
> along with some interface for entitlement (access management) systems
> (pretty standard) is all that I think needs to be done to the core project.
> I want to tackle this item later in the year, after summer, once the other
> three are complete.
>
> I look forward to thoughts on this and anyone else interested in working
> with us on these items.
>
> [0]
>
> http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/what-is-a-digital-envelope.htm
> [1] http://tools.ietf.org/html/rfc3126
> [2] http://tools.ietf.org/html/rfc3161
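Item 2's digital-envelope pattern can be sketched in a few lines. This is a toy illustration only: the XOR keystream below stands in for real primitives (AES-GCM for the payload, RSA-OAEP or ECIES for wrapping the key per recipient), and every name is hypothetical rather than an actual Kafka API.

```python
import hashlib
import os

def _toy_cipher(key: bytes, data: bytes) -> bytes:
    # Stand-in cipher: XOR against a SHA-256 counter-mode keystream.
    # A real implementation would use AES-GCM (payload) and
    # RSA-OAEP/ECIES (per-recipient key wrapping).
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def envelope_encrypt(message: bytes, recipient_keys: dict) -> dict:
    dek = os.urandom(32)  # dynamically generated per-message symmetric key
    return {
        "payload": _toy_cipher(dek, message),
        # the one extra field the proposal adds:
        # a map of [publicKey] = encryptedDataEncryptionKey
        "encrypted_dek": {
            key_id: _toy_cipher(pub, dek)
            for key_id, pub in recipient_keys.items()
        },
    }

def envelope_decrypt(envelope: dict, key_id: str, key: bytes) -> bytes:
    # Unwrap this recipient's copy of the DEK, then decrypt the payload.
    dek = _toy_cipher(key, envelope["encrypted_dek"][key_id])
    return _toy_cipher(dek, envelope["payload"])
```

Any recipient whose key appears in the map can recover the message; the broker only ever handles the opaque payload and the wrapped keys.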
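Item 3's signature structure (a map of method to a list of purpose-keyed signatures, with a sibling "attester" field) might look like the sketch below. HMAC stands in for real detached signatures such as PKCS7 or XmlDigSig, and all field names are illustrative, not a wire-format proposal.

```python
import hashlib
import hmac

def attest(data: bytes, signer_key: bytes, attester_cert: bytes) -> dict:
    # Map key: the signature method (CRC32, PKCS7, XmlDigSig, ...).
    # Map value: a list of maps keyed by the *purpose* of each signature.
    # Sibling field: "the attester" as bytes (e.g. a PKCS12 blob).
    sig = hmac.new(signer_key, data, hashlib.sha256).hexdigest()
    return {
        "signatures": {
            "HMAC-SHA256": [
                {"content-integrity": sig},
            ],
        },
        "attester": attester_cert,
    }

def verify(data: bytes, signer_key: bytes, attestation: dict) -> bool:
    expected = hmac.new(signer_key, data, hashlib.sha256).hexdigest()
    sigs = attestation["signatures"]["HMAC-SHA256"]
    return any(s.get("content-integrity") == expected for s in sigs)
```

The list-of-maps shape lets one message carry a chain of attestations for different purposes without changing the outer structure.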
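Item 4's "404" policy, where an unauthorized caller cannot be distinguished from one asking about a nonexistent resource, is easy to state in code. This is a hypothetical authorizer for illustration, not Kafka's actual interface:

```python
class NotFoundError(Exception):
    """Raised for missing AND unauthorized resources alike."""

def describe_topic(principal: str, topic: str,
                   topics: dict, acls: dict) -> dict:
    # Unauthorized callers get exactly the same error as callers asking
    # about a topic that does not exist, so they cannot probe for
    # "something is there".
    if topic not in topics or principal not in acls.get(topic, set()):
        raise NotFoundError(topic)
    return topics[topic]
```

The key property is that both failure branches raise the identical error with no distinguishing detail.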

Re: [DISCUSS] Kafka Security Specific Features

2014-07-16 Thread Pramod Deshmukh
Hello Joe,

Is there a configuration or example to test Kafka security piece?

Thanks,

Pramod


On Wed, Jul 16, 2014 at 5:20 PM, Pramod Deshmukh  wrote:

> Thanks Joe,
>
> This branch works. I was able to proceed. I still had to set scala version
> to 2.9.2 in kafka-run-class.sh.
>
>
>
> On Wed, Jul 16, 2014 at 3:57 PM, Joe Stein  wrote:
>
>> That is a very old branch.
>>
>> Here is a more up to date one
>> https://github.com/stealthly/kafka/tree/v0.8.2_KAFKA-1477 (needs to be
>> updated to latest trunk might have a chance to-do that next week).
>>
>> You should be using gradle now as per the README.
>>
>> /***
>>  Joe Stein
>>  Founder, Principal Consultant
>>  Big Data Open Source Security LLC
>>  http://www.stealth.ly
>>  Twitter: @allthingshadoop 
>> /
>>
>>
>> On Wed, Jul 16, 2014 at 3:49 PM, Pramod Deshmukh 
>> wrote:
>>
>> > Thanks Joe for this,
>> >
>> > I cloned this branch and tried to run zookeeper but I get
>> >
>> > Error: Could not find or load main class
>> > org.apache.zookeeper.server.quorum.QuorumPeerMain
>> >
>> >
>> > I see scala version is still set to 2.8.0
>> >
>> > if [ -z "$SCALA_VERSION" ]; then
>> >
>> > SCALA_VERSION=2.8.0
>> >
>> > fi
>> >
>> >
>> >
>> > Then I installed sbt and scala and followed your instructions for
>> different
>> > scala versions. I was able to bring zookeeper up but brokers fail to
>> start
>> > with error
>> >
>> > Error: Could not find or load main class kafka.Kafka
>> >
>> > I think I am doing something wrong. Can you please help me?
>> >
>> > Our current production setup is with 2.8.0 and want to stick to it.
>> >
>> > Thanks,
>> >
>> > Pramod
>> >
>> >
>> > On Tue, Jun 3, 2014 at 3:57 PM, Joe Stein  wrote:
>> >
>> > > Hi,I wanted to re-ignite the discussion around Apache Kafka Security.
>> >  This
>> > > is a huge bottleneck (non-starter in some cases) for a lot of
>> > organizations
>> > > (due to regulatory, compliance and other requirements). Below are my
>> > > suggestions for specific changes in Kafka to accommodate security
>> > > requirements.  This comes from what folks are doing "in the wild" to
>> > > workaround and implement security with Kafka as it is today and also
>> > what I
>> > > have discovered from organizations about their blockers. It also
>> picks up
>> > > from the wiki (which I should have time to update later in the week
>> based
>> > > on the below and feedback from the thread).
>> > >
>> > > 1) Transport Layer Security (i.e. SSL)
>> > >
>> > > This also includes client authentication in addition to in-transit
>> > security
>> > > layer.  This work has been picked up here
>> > > https://issues.apache.org/jira/browse/KAFKA-1477 and do appreciate
>> any
>> > > thoughts, comments, feedback, tomatoes, whatever for this patch.  It
>> is a
>> > > pickup from the fork of the work first done here
>> > > https://github.com/relango/kafka/tree/kafka_security.
>> > >
>> > > 2) Data encryption at rest.
>> > >
>> > > This is very important and something that can be facilitated within
>> the
>> > > wire protocol. It requires an additional map data structure for the
>> > > "encrypted [data encryption key]". With this map (either in your
>> object
>> > or
>> > > in the wire protocol) you can store the dynamically generated
>> symmetric
>> > key
>> > > (for each message) and then encrypt the data using that dynamically
>> > > generated key.  You then encrypt the encryption key using each public
>> key
>> > > for whom is expected to be able to decrypt the encryption key to then
>> > > decrypt the message.  For each public key encrypted symmetric key
>> (which
>> > is
>> > > now the "encrypted [data encryption key]" along with which public key
>> it
>> > > was encrypted with for (so a map of [publicKey] =
>> > > encryptedDataEncryptionKey) as a chain.   Other patterns can be
>> > implemented
>> > > but this is a pretty standard digital enveloping [0] pattern with
>> only 1
>> > > field added. Other patterns should be able to use that field to-do
>> their
>> > > implementation too.
>> > >
>> > > 3) Non-repudiation and long term non-repudiation.
>> > >
>> > > Non-repudiation is proving data hasn't changed.  This is often (if not
>> > > always) done with x509 public certificates (chained to a certificate
>> > > authority).
>> > >
>> > > Long term non-repudiation is what happens when the certificates of the
>> > > certificate authority are expired (or revoked) and everything ever
>> signed
>> > > (ever) with that certificate's public key then becomes "no longer
>> > provable
>> > > as ever being authentic".  That is where RFC3126 [1] and RFC3161 [2]
>> come
>> > > in (or worm drives [hardware], etc).
>> > >
>> > > For either (or both) of these it is an operation of the encryptor to
>> > > sign/hash the data (with or without third party trusted timestap of
>> the
>> > > signing event) and encrypt that with their own p

Re: [DISCUSS] Kafka Security Specific Features

2014-07-16 Thread Rajasekar Elango
Pramod,


I presented secure Kafka configuration and usage at the last meetup, so I
hope this video recording will help. You can skip to about 59 minutes in to
jump to the security talk.

Thanks,
Raja.


On Wed, Jul 16, 2014 at 5:57 PM, Pramod Deshmukh  wrote:

> Hello Joe,
>
> Is there a configuration or example to test Kafka security piece?
>
> Thanks,
>
> Pramod
>
>
> On Wed, Jul 16, 2014 at 5:20 PM, Pramod Deshmukh 
> wrote:
>
> > Thanks Joe,
> >
> > This branch works. I was able to proceed. I still had to set scala
> version
> > to 2.9.2 in kafka-run-class.sh.
> >
> >
> >
> > On Wed, Jul 16, 2014 at 3:57 PM, Joe Stein  wrote:
> >
> >> That is a very old branch.
> >>
> >> Here is a more up to date one
> >> https://github.com/stealthly/kafka/tree/v0.8.2_KAFKA-1477 (needs to be
> >> updated to latest trunk might have a chance to-do that next week).
> >>
> >> You should be using gradle now as per the README.
> >>
> >> /***
> >>  Joe Stein
> >>  Founder, Principal Consultant
> >>  Big Data Open Source Security LLC
> >>  http://www.stealth.ly
> >>  Twitter: @allthingshadoop 
> >> /
> >>
> >>
> >> On Wed, Jul 16, 2014 at 3:49 PM, Pramod Deshmukh 
> >> wrote:
> >>
> >> > Thanks Joe for this,
> >> >
> >> > I cloned this branch and tried to run zookeeper but I get
> >> >
> >> > Error: Could not find or load main class
> >> > org.apache.zookeeper.server.quorum.QuorumPeerMain
> >> >
> >> >
> >> > I see scala version is still set to 2.8.0
> >> >
> >> > if [ -z "$SCALA_VERSION" ]; then
> >> >
> >> > SCALA_VERSION=2.8.0
> >> >
> >> > fi
> >> >
> >> >
> >> >
> >> > Then I installed sbt and scala and followed your instructions for
> >> different
> >> > scala versions. I was able to bring zookeeper up but brokers fail to
> >> start
> >> > with error
> >> >
> >> > Error: Could not find or load main class kafka.Kafka
> >> >
> >> > I think I am doing something wrong. Can you please help me?
> >> >
> >> > Our current production setup is with 2.8.0 and want to stick to it.
> >> >
> >> > Thanks,
> >> >
> >> > Pramod
> >> >
> >> >
> >> > On Tue, Jun 3, 2014 at 3:57 PM, Joe Stein 
> wrote:
> >> >
> >> > > [Joe Stein's original Jun 3 proposal, quoted in full earlier in this
> >> > > thread, elided]

Re: [DISCUSS] Kafka Security Specific Features

2014-07-16 Thread Pramod Deshmukh
Thanks Joe,

This branch works. I was able to proceed. I still had to set scala version
to 2.9.2 in kafka-run-class.sh.



On Wed, Jul 16, 2014 at 3:57 PM, Joe Stein  wrote:

> That is a very old branch.
>
> Here is a more up to date one
> https://github.com/stealthly/kafka/tree/v0.8.2_KAFKA-1477 (needs to be
> updated to latest trunk might have a chance to-do that next week).
>
> You should be using gradle now as per the README.
>
> /***
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop 
> /
>
>
> On Wed, Jul 16, 2014 at 3:49 PM, Pramod Deshmukh 
> wrote:
>
> > Thanks Joe for this,
> >
> > I cloned this branch and tried to run zookeeper but I get
> >
> > Error: Could not find or load main class
> > org.apache.zookeeper.server.quorum.QuorumPeerMain
> >
> >
> > I see scala version is still set to 2.8.0
> >
> > if [ -z "$SCALA_VERSION" ]; then
> >
> > SCALA_VERSION=2.8.0
> >
> > fi
> >
> >
> >
> > Then I installed sbt and scala and followed your instructions for
> different
> > scala versions. I was able to bring zookeeper up but brokers fail to
> start
> > with error
> >
> > Error: Could not find or load main class kafka.Kafka
> >
> > I think I am doing something wrong. Can you please help me?
> >
> > Our current production setup is with 2.8.0 and want to stick to it.
> >
> > Thanks,
> >
> > Pramod
> >
> >
> > On Tue, Jun 3, 2014 at 3:57 PM, Joe Stein  wrote:
> >
> > > [Joe Stein's original Jun 3 proposal, quoted in full earlier in this
> > > thread, elided]

Re: [DISCUSS] Kafka Security Specific Features

2014-07-16 Thread Joe Stein
That is a very old branch.

Here is a more up to date one
https://github.com/stealthly/kafka/tree/v0.8.2_KAFKA-1477 (needs to be
updated to latest trunk might have a chance to-do that next week).

You should be using gradle now as per the README.

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


On Wed, Jul 16, 2014 at 3:49 PM, Pramod Deshmukh  wrote:

> Thanks Joe for this,
>
> I cloned this branch and tried to run zookeeper but I get
>
> Error: Could not find or load main class
> org.apache.zookeeper.server.quorum.QuorumPeerMain
>
>
> I see scala version is still set to 2.8.0
>
> if [ -z "$SCALA_VERSION" ]; then
>
> SCALA_VERSION=2.8.0
>
> fi
>
>
>
> Then I installed sbt and scala and followed your instructions for different
> scala versions. I was able to bring zookeeper up but brokers fail to start
> with error
>
> Error: Could not find or load main class kafka.Kafka
>
> I think I am doing something wrong. Can you please help me?
>
> Our current production setup is with 2.8.0 and want to stick to it.
>
> Thanks,
>
> Pramod
>
>
> On Tue, Jun 3, 2014 at 3:57 PM, Joe Stein  wrote:
>
> > [Joe Stein's original Jun 3 proposal, quoted in full earlier in this
> > thread, elided]

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Todd Palino
My concern is specifically around the rules for SOX compliance, or rules
around PII, PCI, or HIPAA compliance. The audits get very complicated,
but my understanding is that the general rule is that sensitive data
should be encrypted at rest and only decrypted when needed. And we don't
just need to be concerned about a malicious user. Consider a "typical"
technology environment where many people have administrative access to
systems. This is the situation where the data must not be visible to
anyone unless they have a specific use for it, which means having it
encrypted. In almost any audit situation, you need to be able to show a
trail of exactly who modified the data, and exactly who viewed the data.

Now, I do agree that not everything has to be done within Kafka, and the
producers and consumers can coordinate their own encryption. But I think
it's useful to have the concept of an envelope for a message within Kafka.
This can be used to hold all sorts of useful information, such as hashes
of the encryption keys that were used to encrypt a message, or the
signature of the message itself (so that you can have both confidentiality
and integrity). It can also be used to hold things like the time a message
was received into your infrastructure, or the specific Kafka cluster it
was stored in. A special consumer and producer, such as the mirror maker,
would be able to preserve this envelope across clusters.

-Todd
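The envelope Todd describes could carry fields like the ones below. The field set is invented for illustration from the examples in his mail (key hashes, signatures, receive time, origin cluster); it is not a proposed format.

```python
from dataclasses import dataclass

@dataclass
class MessageEnvelope:
    payload: bytes            # the (possibly encrypted) message body
    key_hash: str = ""        # hash of the encryption key used on payload
    signature: bytes = b""    # detached signature over the payload
    received_at_ms: int = 0   # when the message entered the infrastructure
    origin_cluster: str = ""  # which Kafka cluster first stored it

# A mirror-maker-style copier preserves the envelope verbatim across
# clusters, as the mail suggests.
def mirror(envelope: MessageEnvelope) -> MessageEnvelope:
    return MessageEnvelope(**vars(envelope))
```

Because the envelope travels with the message, consumers can check integrity and provenance without any broker-side decryption.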


On 6/5/14, 2:18 PM, "Jay Kreps"  wrote:

>Hey Todd,
>
>Can you elaborate on this? Certainly restricting access to and
>modification
>of data is important. But this doesn't imply storing the data encrypted.
>Are we assuming the attacker can (1) get on the network, (2) get on the
>kafka server as a non-root and non-kafka user or (3) get root on the Kafka
>server? If we assume (3) then it seems we are in a pretty bad state as
>almost any facility Kafka provides can be subverted by the root user just
>changing the Kafka code to not enforce that facility. Which of these
>levels
>of access are we assuming?
>
>Also which things actually need to be done inside Kafka and which can be
>done externally? Nothing prevents users from encrypting data they put into
>Kafka today, it is just that Kafka doesn't do this for you. But is there a
>reason you want Kafka to do this?
>
>The reason I am pushing on these things a bit is because I want to make
>sure we don't end up with a set of requirements so broad we can never
>really get them implemented...
>
>-Jay
>
>
>
>
>On Thu, Jun 5, 2014 at 2:05 PM, Todd Palino 
>wrote:
>
>> No, at-rest encryption is definitely important. When you start talking
>> about data that is used for financial reporting, restricting access to
>>it
>> (both modification and visibility) is a critical component.
>>
>> -Todd
>>
>>
>> On 6/5/14, 2:01 PM, "Jay Kreps"  wrote:
>>
>> >Hey Joe,
>> >
>> >I don't really understand the sections you added to the wiki. Can you
>> >clarify them?
>> >
>> >Is non-repudiation what SASL would call integrity checks? If so, don't
>> >SSL and many of the SASL schemes already support this as well as
>> >on-the-wire encryption?
>> >
>> >Or are you proposing an on-disk encryption scheme? Is this actually
>> >needed? Isn't on-the-wire encryption, when combined with mutual
>> >authentication and permissions, sufficient for most uses?
>> >
>> >On-disk encryption seems unnecessary because if an attacker can get
>>root
>> >on
>> >the kafka boxes it can potentially modify Kafka to do anything he or
>>she
>> >wants with data. So this seems to break any security model.
>> >
>> >I understand the problem of a large organization not really having a
>> >trusted network and wanting to secure data transfer and limit and audit
>> >data access. The uses for these other things I don't totally
>>understand.
>> >
>> >Also it would be worth understanding the state of other messaging and
>> >storage systems (Hadoop, dbs, etc). What features do they support. I
>>think
>> >there is a sense in which you don't have to run faster than the bear,
>>but
>> >only faster then your friends. :-)
>> >
>> >-Jay
>> >
>> >
>> >On Wed, Jun 4, 2014 at 5:57 PM, Joe Stein  wrote:
>> >
>> >> I like the idea of working on the spec and prioritizing. I will
>>update
>> >>the
>> >> wiki.
>> >>
>> >> - Joestein
>> >>
>> >>
>> >> On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps 
>>wrote:
>> >>
>> >> > Hey Joe,
>> >> >
>> >> > Thanks for kicking this discussion off! I totally agree that for
>> >> something
>> >> > that acts as a central message broker security is critical
>>feature. I
>> >> think
>> >> > a number of people have been interested in this topic and several
>> >>people
>> >> > have put effort into special purpose security efforts.
>> >> >
>> >> > Since most the LinkedIn folks are working on the consumer right
>>now I
>> >> think
>> >> > this would be a great project for any other interested people to
>>take
>> >>on.
>> >> > There are some challenges in doing these things distributed but it
>>can

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Jay Kreps
Hey Todd,

Can you elaborate on this? Certainly restricting access to and modification
of data is important. But this doesn't imply storing the data encrypted.
Are we assuming the attacker can (1) get on the network, (2) get on the
Kafka server as a non-root, non-kafka user, or (3) get root on the Kafka
server? If we assume (3) then it seems we are in a pretty bad state, as
almost any facility Kafka provides can be subverted by the root user simply
changing the Kafka code to not enforce that facility. Which of these levels
of access are we assuming?

Also which things actually need to be done inside Kafka and which can be
done externally? Nothing prevents users from encrypting data they put into
Kafka today; it is just that Kafka doesn't do this for you. But is there a
reason you want Kafka to do this?

The reason I am pushing on these things a bit is because I want to make
sure we don't end up with a set of requirements so broad we can never
really get them implemented...

-Jay




On Thu, Jun 5, 2014 at 2:05 PM, Todd Palino 
wrote:

> No, at-rest encryption is definitely important. When you start talking
> about data that is used for financial reporting, restricting access to it
> (both modification and visibility) is a critical component.
>
> -Todd
>
>
> On 6/5/14, 2:01 PM, "Jay Kreps"  wrote:
>
> >Hey Joe,
> >
> >I don't really understand the sections you added to the wiki. Can you
> >clarify them?
> >
> >Is non-repudiation what SASL would call integrity checks? If so don't SSL
> >and many of the SASL schemes already support this as well as
> >on-the-wire encryption?
> >
> >Or are you proposing an on-disk encryption scheme? Is this actually
> >needed?
> >Isn't on-the-wire encryption when combined with mutual authentication
> >and
> >permissions sufficient for most uses?
> >
> >On-disk encryption seems unnecessary because if an attacker can get root
> >on
> >the Kafka boxes, they can potentially modify Kafka to do anything they
> >want with the data. So this seems to break any security model.
> >
> >I understand the problem of a large organization not really having a
> >trusted network and wanting to secure data transfer and limit and audit
> >data access. The uses for these other things I don't totally understand.
> >
> >Also it would be worth understanding the state of other messaging and
> >storage systems (Hadoop, dbs, etc.). What features do they support? I think
> >there is a sense in which you don't have to run faster than the bear, but
> >only faster than your friends. :-)
> >
> >-Jay
> >
> >
> >On Wed, Jun 4, 2014 at 5:57 PM, Joe Stein  wrote:
> >
> >> I like the idea of working on the spec and prioritizing. I will update
> >>the
> >> wiki.
> >>
> >> - Joestein
> >>
> >>
> >> On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps  wrote:
> >>
> >> > Hey Joe,
> >> >
> >> > Thanks for kicking this discussion off! I totally agree that for
> >> something
> >> > that acts as a central message broker, security is a critical feature. I
> >> think
> >> > a number of people have been interested in this topic and several
> >>people
> >> > have put effort into special purpose security efforts.
> >> >
> >> > Since most of the LinkedIn folks are working on the consumer right now I
> >> think
> >> > this would be a great project for any other interested people to take
> >>on.
> >> > There are some challenges in doing these things distributed but it can
> >> also
> >> > be a lot of fun.
> >> >
> >> > I think a good first step would be to get a written plan we can all
> >>agree
> >> > on for how things should work. Then we can break things down into
> >>chunks
> >> > that can be done independently while still aiming at a good end state.
> >> >
> >> > I had tried to write up some notes that summarized at least the
> >>thoughts
> >> I
> >> > had had on security:
> >> > https://cwiki.apache.org/confluence/display/KAFKA/Security
> >> >
> >> > What do you think of that?
> >> >
> >> > One assumption I had (which may be incorrect) is that although we want
> >> all
> >> > the things in your list, the two most pressing would be authentication
> >> and
> >> > authorization, and that was all that write up covered. You have more
> >> > experience in this domain, so I wonder how you would prioritize?
> >> >
> >> > Those notes are really sketchy, so I think the first goal I would have
> >> > would be to get to a real spec we can all agree on and discuss. A lot
> >>of
> >> > the security stuff has a high human interaction element and needs to
> >>work
> >> > in pretty different domains and different companies so getting this
> >>kind
> >> of
> >> > review is important.
> >> >
> >> > -Jay
> >> >
> >> >
> >> > On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein 
> >>wrote:
> >> >
> >> > > Hi, I wanted to re-ignite the discussion around Apache Kafka
> >> > > Security. This
> >> > > is a huge bottleneck (non-starter in some cases) for a lot of
> >> > organizations
> >> > > (due to regulatory, compliance and other requirements). Below are my
> >> > > s

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Todd Palino
No, at-rest encryption is definitely important. When you start talking
about data that is used for financial reporting, restricting access to it
(both modification and visibility) is a critical component.

-Todd


On 6/5/14, 2:01 PM, "Jay Kreps"  wrote:

>Hey Joe,
>
>I don't really understand the sections you added to the wiki. Can you
>clarify them?
>
>Is non-repudiation what SASL would call integrity checks? If so don't SSL
>and many of the SASL schemes already support this as well as
>on-the-wire encryption?
>
>Or are you proposing an on-disk encryption scheme? Is this actually
>needed?
>Isn't on-the-wire encryption when combined with mutual authentication
>and
>permissions sufficient for most uses?
>
>On-disk encryption seems unnecessary because if an attacker can get root
>on
>the Kafka boxes, they can potentially modify Kafka to do anything they
>want with the data. So this seems to break any security model.
>
>I understand the problem of a large organization not really having a
>trusted network and wanting to secure data transfer and limit and audit
>data access. The uses for these other things I don't totally understand.
>
>Also it would be worth understanding the state of other messaging and
>storage systems (Hadoop, dbs, etc.). What features do they support? I think
>there is a sense in which you don't have to run faster than the bear, but
>only faster than your friends. :-)
>
>-Jay
>
>
>On Wed, Jun 4, 2014 at 5:57 PM, Joe Stein  wrote:
>
>> I like the idea of working on the spec and prioritizing. I will update
>>the
>> wiki.
>>
>> - Joestein
>>
>>
>> On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps  wrote:
>>
>> > Hey Joe,
>> >
>> > Thanks for kicking this discussion off! I totally agree that for
>> something
>> > that acts as a central message broker, security is a critical feature. I
>> think
>> > a number of people have been interested in this topic and several
>>people
>> > have put effort into special purpose security efforts.
>> >
>> > Since most of the LinkedIn folks are working on the consumer right now I
>> think
>> > this would be a great project for any other interested people to take
>>on.
>> > There are some challenges in doing these things distributed but it can
>> also
>> > be a lot of fun.
>> >
>> > I think a good first step would be to get a written plan we can all
>>agree
>> > on for how things should work. Then we can break things down into
>>chunks
>> > that can be done independently while still aiming at a good end state.
>> >
>> > I had tried to write up some notes that summarized at least the
>>thoughts
>> I
>> > had had on security:
>> > https://cwiki.apache.org/confluence/display/KAFKA/Security
>> >
>> > What do you think of that?
>> >
>> > One assumption I had (which may be incorrect) is that although we want
>> all
>> > the things in your list, the two most pressing would be authentication
>> and
>> > authorization, and that was all that write up covered. You have more
>> > experience in this domain, so I wonder how you would prioritize?
>> >
>> > Those notes are really sketchy, so I think the first goal I would have
>> > would be to get to a real spec we can all agree on and discuss. A lot
>>of
>> > the security stuff has a high human interaction element and needs to
>>work
>> > in pretty different domains and different companies so getting this
>>kind
>> of
>> > review is important.
>> >
>> > -Jay
>> >
>> >
>> > On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein 
>>wrote:
>> >
>> > > Hi, I wanted to re-ignite the discussion around Apache Kafka
>> > > Security. This
>> > > is a huge bottleneck (non-starter in some cases) for a lot of
>> > organizations
>> > > (due to regulatory, compliance and other requirements). Below are my
>> > > suggestions for specific changes in Kafka to accommodate security
>> > > requirements.  This comes from what folks are doing "in the wild" to
>> > > workaround and implement security with Kafka as it is today and also
>> > what I
>> > > have discovered from organizations about their blockers. It also
>>picks
>> up
>> > > from the wiki (which I should have time to update later in the week
>> based
>> > > on the below and feedback from the thread).
>> > >
>> > > 1) Transport Layer Security (i.e. SSL)
>> > >
>> > > This also includes client authentication in addition to in-transit
>> > security
>> > > layer.  This work has been picked up here
>> > > https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate
>> > > any
>> > > thoughts, comments, feedback, tomatoes, whatever for this patch.  It
>> is a
>> > > pickup from the fork of the work first done here
>> > > https://github.com/relango/kafka/tree/kafka_security.
>> > >
>> > > 2) Data encryption at rest.
>> > >
>> > > This is very important and something that can be facilitated within
>>the
>> > > wire protocol. It requires an additional map data structure for the
>> > > "encrypted [data encryption key]". With this map (either in your
>>object
>> > or
>> > > in the wire protocol) you can store the dynam

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Jay Kreps
Hey Joe,

I don't really understand the sections you added to the wiki. Can you
clarify them?

Is non-repudiation what SASL would call integrity checks? If so don't SSL
and many of the SASL schemes already support this as well as
on-the-wire encryption?

Or are you proposing an on-disk encryption scheme? Is this actually needed?
Isn't on-the-wire encryption when combined with mutual authentication and
permissions sufficient for most uses?

On-disk encryption seems unnecessary because if an attacker can get root on
the Kafka boxes, they can potentially modify Kafka to do anything they want
with the data. So this seems to break any security model.

I understand the problem of a large organization not really having a
trusted network and wanting to secure data transfer and limit and audit
data access. The uses for these other things I don't totally understand.

Also it would be worth understanding the state of other messaging and
storage systems (Hadoop, dbs, etc.). What features do they support? I think
there is a sense in which you don't have to run faster than the bear, but
only faster than your friends. :-)

-Jay


On Wed, Jun 4, 2014 at 5:57 PM, Joe Stein  wrote:

> I like the idea of working on the spec and prioritizing. I will update the
> wiki.
>
> - Joestein
>
>
> On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps  wrote:
>
> > Hey Joe,
> >
> > Thanks for kicking this discussion off! I totally agree that for
> something
> > that acts as a central message broker, security is a critical feature. I
> think
> > a number of people have been interested in this topic and several people
> > have put effort into special purpose security efforts.
> >
> > Since most of the LinkedIn folks are working on the consumer right now I
> think
> > this would be a great project for any other interested people to take on.
> > There are some challenges in doing these things distributed but it can
> also
> > be a lot of fun.
> >
> > I think a good first step would be to get a written plan we can all agree
> > on for how things should work. Then we can break things down into chunks
> > that can be done independently while still aiming at a good end state.
> >
> > I had tried to write up some notes that summarized at least the thoughts
> I
> > had had on security:
> > https://cwiki.apache.org/confluence/display/KAFKA/Security
> >
> > What do you think of that?
> >
> > One assumption I had (which may be incorrect) is that although we want
> all
> > the things in your list, the two most pressing would be authentication
> and
> > authorization, and that was all that write up covered. You have more
> > experience in this domain, so I wonder how you would prioritize?
> >
> > Those notes are really sketchy, so I think the first goal I would have
> > would be to get to a real spec we can all agree on and discuss. A lot of
> > the security stuff has a high human interaction element and needs to work
> > in pretty different domains and different companies so getting this kind
> of
> > review is important.
> >
> > -Jay
> >
> >
> > On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein  wrote:
> >
> > > Hi, I wanted to re-ignite the discussion around Apache Kafka Security.
> > > This
> > > is a huge bottleneck (non-starter in some cases) for a lot of
> > organizations
> > > (due to regulatory, compliance and other requirements). Below are my
> > > suggestions for specific changes in Kafka to accommodate security
> > > requirements.  This comes from what folks are doing "in the wild" to
> > > workaround and implement security with Kafka as it is today and also
> > what I
> > > have discovered from organizations about their blockers. It also picks
> up
> > > from the wiki (which I should have time to update later in the week
> based
> > > on the below and feedback from the thread).
> > >
> > > 1) Transport Layer Security (i.e. SSL)
> > >
> > > This also includes client authentication in addition to in-transit
> > security
> > > layer.  This work has been picked up here
> > > https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate any
> > > thoughts, comments, feedback, tomatoes, whatever for this patch.  It
> is a
> > > pickup from the fork of the work first done here
> > > https://github.com/relango/kafka/tree/kafka_security.
> > >
> > > 2) Data encryption at rest.
> > >
> > > This is very important and something that can be facilitated within the
> > > wire protocol. It requires an additional map data structure for the
> > > "encrypted [data encryption key]". With this map (either in your object
> > or
> > > in the wire protocol) you can store the dynamically generated symmetric
> > key
> > > (for each message) and then encrypt the data using that dynamically
> > > generated key.  You then encrypt that encryption key with the public
> > > key of each party expected to be able to decrypt it, to then decrypt
> > > the message.  For each public key encrypted symmetric key
> (which
> > is
> > > now the "encrypted [data encryption key]" alo

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Rajasekar Elango
Hi Jay,

Thanks for putting together a spec for security.

Joe,

It looks like the "Securing zookeeper.." part has been deleted from the
assumptions section. Communication with ZooKeeper needs to be secured as
well to make the entire Kafka cluster secure. It may or may not require
changes to Kafka, but it's good to have it in the spec.

I could not find a link to edit the page after logging in to the wiki. Do I
need any special permission to make edits?

Thanks,
Raja.


On Wed, Jun 4, 2014 at 8:57 PM, Joe Stein  wrote:

> I like the idea of working on the spec and prioritizing. I will update the
> wiki.
>
> - Joestein
>
>
> On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps  wrote:
>
> > Hey Joe,
> >
> > Thanks for kicking this discussion off! I totally agree that for
> something
> > that acts as a central message broker, security is a critical feature. I
> think
> > a number of people have been interested in this topic and several people
> > have put effort into special purpose security efforts.
> >
> > Since most of the LinkedIn folks are working on the consumer right now I
> think
> > this would be a great project for any other interested people to take on.
> > There are some challenges in doing these things distributed but it can
> also
> > be a lot of fun.
> >
> > I think a good first step would be to get a written plan we can all agree
> > on for how things should work. Then we can break things down into chunks
> > that can be done independently while still aiming at a good end state.
> >
> > I had tried to write up some notes that summarized at least the thoughts
> I
> > had had on security:
> > https://cwiki.apache.org/confluence/display/KAFKA/Security
> >
> > What do you think of that?
> >
> > One assumption I had (which may be incorrect) is that although we want
> all
> > the things in your list, the two most pressing would be authentication
> and
> > authorization, and that was all that write up covered. You have more
> > experience in this domain, so I wonder how you would prioritize?
> >
> > Those notes are really sketchy, so I think the first goal I would have
> > would be to get to a real spec we can all agree on and discuss. A lot of
> > the security stuff has a high human interaction element and needs to work
> > in pretty different domains and different companies so getting this kind
> of
> > review is important.
> >
> > -Jay
> >
> >
> > On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein  wrote:
> >
> > > Hi, I wanted to re-ignite the discussion around Apache Kafka Security.
> > > This
> > > is a huge bottleneck (non-starter in some cases) for a lot of
> > organizations
> > > (due to regulatory, compliance and other requirements). Below are my
> > > suggestions for specific changes in Kafka to accommodate security
> > > requirements.  This comes from what folks are doing "in the wild" to
> > > workaround and implement security with Kafka as it is today and also
> > what I
> > > have discovered from organizations about their blockers. It also picks
> up
> > > from the wiki (which I should have time to update later in the week
> based
> > > on the below and feedback from the thread).
> > >
> > > 1) Transport Layer Security (i.e. SSL)
> > >
> > > This also includes client authentication in addition to in-transit
> > security
> > > layer.  This work has been picked up here
> > > https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate any
> > > thoughts, comments, feedback, tomatoes, whatever for this patch.  It
> is a
> > > pickup from the fork of the work first done here
> > > https://github.com/relango/kafka/tree/kafka_security.
> > >
> > > 2) Data encryption at rest.
> > >
> > > This is very important and something that can be facilitated within the
> > > wire protocol. It requires an additional map data structure for the
> > > "encrypted [data encryption key]". With this map (either in your object
> > or
> > > in the wire protocol) you can store the dynamically generated symmetric
> > key
> > > (for each message) and then encrypt the data using that dynamically
> > > generated key.  You then encrypt that encryption key with the public
> > > key of each party expected to be able to decrypt it, to then decrypt
> > > the message.  For each public key encrypted symmetric key
> (which
> > is
> > > now the "encrypted [data encryption key]"), you store which public key
> > > it was encrypted with (so a map of [publicKey] =
> > > encryptedDataEncryptionKey) as a chain.   Other patterns can be
> > implemented
> > > but this is a pretty standard digital enveloping [0] pattern with only
> 1
> > > field added. Other patterns should be able to use that field for their
> > > implementations too.
> > >
> > > 3) Non-repudiation and long term non-repudiation.
> > >
> > > Non-repudiation is proving data hasn't changed.  This is often (if not
> > > always) done with x509 public certificates (chained to a certificate
> > > authority).
> > >
> > > Long term non-repudiation is what happens when the certificates of the
> > > certifi

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Joe Stein
Raja, you need to sign an ICLA (http://www.apache.org/licenses/icla.txt);
once that is on file, your user can be granted permission to contribute.

I think securing communication to the "offset & broker management source"
(which can be a ZooKeeper implementation) is important.  I will elaborate
more on that with the other edits I have for the page.

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


On Thu, Jun 5, 2014 at 11:56 AM, Rajasekar Elango 
wrote:

> Hi Jay,
>
> Thanks for putting together a spec for security.
>
> Joe,
>
> It looks like the "Securing zookeeper.." part has been deleted from the
> assumptions section. Communication with ZooKeeper needs to be secured as
> well to make the entire Kafka cluster secure. It may or may not require
> changes to Kafka, but it's good to have it in the spec.
>
> I could not find a link to edit the page after logging in to the wiki. Do
> I need any special permission to make edits?
>
> Thanks,
> Raja.
>
>
> On Wed, Jun 4, 2014 at 8:57 PM, Joe Stein  wrote:
>
> > I like the idea of working on the spec and prioritizing. I will update
> the
> > wiki.
> >
> > - Joestein
> >
> >
> > On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps  wrote:
> >
> > > Hey Joe,
> > >
> > > Thanks for kicking this discussion off! I totally agree that for
> > something
> > > that acts as a central message broker, security is a critical feature. I
> > think
> > > a number of people have been interested in this topic and several
> people
> > > have put effort into special purpose security efforts.
> > >
> > > Since most of the LinkedIn folks are working on the consumer right now I
> > think
> > > this would be a great project for any other interested people to take
> on.
> > > There are some challenges in doing these things distributed but it can
> > also
> > > be a lot of fun.
> > >
> > > I think a good first step would be to get a written plan we can all
> agree
> > > on for how things should work. Then we can break things down into
> chunks
> > > that can be done independently while still aiming at a good end state.
> > >
> > > I had tried to write up some notes that summarized at least the
> thoughts
> > I
> > > had had on security:
> > > https://cwiki.apache.org/confluence/display/KAFKA/Security
> > >
> > > What do you think of that?
> > >
> > > One assumption I had (which may be incorrect) is that although we want
> > all
> > > the things in your list, the two most pressing would be authentication
> > and
> > > authorization, and that was all that write up covered. You have more
> > > experience in this domain, so I wonder how you would prioritize?
> > >
> > > Those notes are really sketchy, so I think the first goal I would have
> > > would be to get to a real spec we can all agree on and discuss. A lot
> of
> > > the security stuff has a high human interaction element and needs to
> work
> > > in pretty different domains and different companies so getting this
> kind
> > of
> > > review is important.
> > >
> > > -Jay
> > >
> > >
> > > On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein 
> wrote:
> > >
> > > > Hi, I wanted to re-ignite the discussion around Apache Kafka Security.
> > > > This
> > > > is a huge bottleneck (non-starter in some cases) for a lot of
> > > organizations
> > > > (due to regulatory, compliance and other requirements). Below are my
> > > > suggestions for specific changes in Kafka to accommodate security
> > > > requirements.  This comes from what folks are doing "in the wild" to
> > > > workaround and implement security with Kafka as it is today and also
> > > what I
> > > > have discovered from organizations about their blockers. It also
> picks
> > up
> > > > from the wiki (which I should have time to update later in the week
> > based
> > > > on the below and feedback from the thread).
> > > >
> > > > 1) Transport Layer Security (i.e. SSL)
> > > >
> > > > This also includes client authentication in addition to in-transit
> > > security
> > > > layer.  This work has been picked up here
> > > > https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate
> > > > any
> > > > thoughts, comments, feedback, tomatoes, whatever for this patch.  It
> > is a
> > > > pickup from the fork of the work first done here
> > > > https://github.com/relango/kafka/tree/kafka_security.
> > > >
> > > > 2) Data encryption at rest.
> > > >
> > > > This is very important and something that can be facilitated within
> the
> > > > wire protocol. It requires an additional map data structure for the
> > > > "encrypted [data encryption key]". With this map (either in your
> object
> > > or
> > > > in the wire protocol) you can store the dynamically generated
> symmetric
> > > key
> > > > (for each message) and then encrypt the data using that dynamically
> > > > generated key.  You then encrypt the encryption key using each public
> > key
> > > > for wh

Re: [DISCUSS] Kafka Security Specific Features

2014-06-04 Thread Joe Stein
I like the idea of working on the spec and prioritizing. I will update the
wiki.

- Joestein


On Wed, Jun 4, 2014 at 1:11 PM, Jay Kreps  wrote:

> Hey Joe,
>
> Thanks for kicking this discussion off! I totally agree that for something
> that acts as a central message broker, security is a critical feature. I think
> a number of people have been interested in this topic and several people
> have put effort into special purpose security efforts.
>
> Since most of the LinkedIn folks are working on the consumer right now I think
> this would be a great project for any other interested people to take on.
> There are some challenges in doing these things distributed but it can also
> be a lot of fun.
>
> I think a good first step would be to get a written plan we can all agree
> on for how things should work. Then we can break things down into chunks
> that can be done independently while still aiming at a good end state.
>
> I had tried to write up some notes that summarized at least the thoughts I
> had had on security:
> https://cwiki.apache.org/confluence/display/KAFKA/Security
>
> What do you think of that?
>
> One assumption I had (which may be incorrect) is that although we want all
> the things in your list, the two most pressing would be authentication and
> authorization, and that was all that write up covered. You have more
> experience in this domain, so I wonder how you would prioritize?
>
> Those notes are really sketchy, so I think the first goal I would have
> would be to get to a real spec we can all agree on and discuss. A lot of
> the security stuff has a high human interaction element and needs to work
> in pretty different domains and different companies so getting this kind of
> review is important.
>
> -Jay
>
>
> On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein  wrote:
>
> > Hi, I wanted to re-ignite the discussion around Apache Kafka Security.
> > This
> > is a huge bottleneck (non-starter in some cases) for a lot of
> organizations
> > (due to regulatory, compliance and other requirements). Below are my
> > suggestions for specific changes in Kafka to accommodate security
> > requirements.  This comes from what folks are doing "in the wild" to
> > workaround and implement security with Kafka as it is today and also
> what I
> > have discovered from organizations about their blockers. It also picks up
> > from the wiki (which I should have time to update later in the week based
> > on the below and feedback from the thread).
> >
> > 1) Transport Layer Security (i.e. SSL)
> >
> > This also includes client authentication in addition to in-transit
> security
> > layer.  This work has been picked up here
> > https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate any
> > thoughts, comments, feedback, tomatoes, whatever for this patch.  It is a
> > pickup from the fork of the work first done here
> > https://github.com/relango/kafka/tree/kafka_security.
> >
> > 2) Data encryption at rest.
> >
> > This is very important and something that can be facilitated within the
> > wire protocol. It requires an additional map data structure for the
> > "encrypted [data encryption key]". With this map (either in your object
> or
> > in the wire protocol) you can store the dynamically generated symmetric
> key
> > (for each message) and then encrypt the data using that dynamically
> > generated key.  You then encrypt that encryption key with the public key
> > of each party expected to be able to decrypt it, to then decrypt the
> > message.  For each public key encrypted symmetric key (which
> is
> > now the "encrypted [data encryption key]"), you store which public key
> > it was encrypted with (so a map of [publicKey] =
> > encryptedDataEncryptionKey) as a chain.   Other patterns can be
> implemented
> > but this is a pretty standard digital enveloping [0] pattern with only 1
> > field added. Other patterns should be able to use that field for their
> > implementations too.
> >
> > 3) Non-repudiation and long term non-repudiation.
> >
> > Non-repudiation is proving data hasn't changed.  This is often (if not
> > always) done with x509 public certificates (chained to a certificate
> > authority).
> >
> > Long term non-repudiation is what happens when the certificates of the
> > certificate authority are expired (or revoked) and everything ever signed
> > (ever) with that certificate's public key then becomes "no longer
> provable
> > as ever being authentic".  That is where RFC3126 [1] and RFC3161 [2] come
> > in (or worm drives [hardware], etc).
> >
> > For either (or both) of these it is an operation of the encryptor to
> > sign/hash the data (with or without a third-party trusted timestamp of the
> > signing event), encrypt that with their own private key, and distribute
> > the results (before and after encrypting if required) along with their
> > public key. This structure is a bit more complex but feasible; it is a
> > map of digital signature formats and the chain of dig sig attest
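
The digital-envelope pattern Joe describes in point 2 above can be sketched
structurally as follows. This is only an illustrative sketch of the data
layout, not the proposed wire format: the `_keystream` toy cipher stands in
for real primitives (e.g. AES for the payload and RSA-OAEP for wrapping the
per-message key), and the `seal`/`open_envelope` names are invented for the
example.

```python
import hashlib
import os


def _keystream(key: bytes, data: bytes) -> bytes:
    """Toy XOR keystream cipher -- a stand-in for real AES/RSA primitives."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, out))


def seal(message: bytes, recipients: dict[str, bytes]) -> dict:
    """Envelope a message: one fresh DEK per message, wrapped per recipient."""
    dek = os.urandom(32)  # dynamically generated per-message symmetric key
    return {
        "payload": _keystream(dek, message),
        # the single extra field from the proposal:
        # a map of [publicKey id] -> encrypted [data encryption key]
        "encrypted_dek": {kid: _keystream(k, dek) for kid, k in recipients.items()},
    }


def open_envelope(envelope: dict, key_id: str, key: bytes) -> bytes:
    """Unwrap the DEK with this recipient's key, then decrypt the payload."""
    dek = _keystream(key, envelope["encrypted_dek"][key_id])
    return _keystream(dek, envelope["payload"])
```

A consumer whose key is not in the `encrypted_dek` map can still read the raw
bytes off the topic but cannot recover the plaintext, and granting a new
reader only adds one map entry, which is why the scheme costs a single extra
field in the protocol.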

Re: [DISCUSS] Kafka Security Specific Features

2014-06-04 Thread Jay Kreps
Hey Joe,

Thanks for kicking this discussion off! I totally agree that for something
that acts as a central message broker, security is a critical feature. I think
a number of people have been interested in this topic and several people
have put effort into special purpose security efforts.

Since most of the LinkedIn folks are working on the consumer right now I think
this would be a great project for any other interested people to take on.
There are some challenges in doing these things distributed but it can also
be a lot of fun.

I think a good first step would be to get a written plan we can all agree
on for how things should work. Then we can break things down into chunks
that can be done independently while still aiming at a good end state.

I had tried to write up some notes that summarized at least the thoughts I
had had on security:
https://cwiki.apache.org/confluence/display/KAFKA/Security

What do you think of that?

One assumption I had (which may be incorrect) is that although we want all
the things in your list, the two most pressing would be authentication and
authorization, and that was all that write up covered. You have more
experience in this domain, so I wonder how you would prioritize?

Those notes are really sketchy, so I think the first goal I would have
would be to get to a real spec we can all agree on and discuss. A lot of
the security stuff has a high human interaction element and needs to work
in pretty different domains and different companies so getting this kind of
review is important.

-Jay


On Tue, Jun 3, 2014 at 12:57 PM, Joe Stein  wrote:

> Hi, I wanted to re-ignite the discussion around Apache Kafka Security.  This
> is a huge bottleneck (non-starter in some cases) for a lot of organizations
> (due to regulatory, compliance and other requirements). Below are my
> suggestions for specific changes in Kafka to accommodate security
> requirements.  This comes from what folks are doing "in the wild" to
> workaround and implement security with Kafka as it is today and also what I
> have discovered from organizations about their blockers. It also picks up
> from the wiki (which I should have time to update later in the week based
> on the below and feedback from the thread).
>
> 1) Transport Layer Security (i.e. SSL)
>
> This also includes client authentication in addition to the in-transit
> security layer.  This work has been picked up here
> https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate any
> thoughts, comments, feedback, tomatoes, whatever for this patch.  It is a
> pickup from the fork of the work first done here
> https://github.com/relango/kafka/tree/kafka_security.
>
> 2) Data encryption at rest.
>
> This is very important and something that can be facilitated within the
> wire protocol. It requires an additional map data structure for the
> "encrypted [data encryption key]". With this map (either in your object or
> in the wire protocol) you can store the dynamically generated symmetric key
> (for each message) and then encrypt the data using that dynamically
> generated key.  You then encrypt the encryption key using each public key
> for whom is expected to be able to decrypt the encryption key to then
> decrypt the message.  For each public key encrypted symmetric key (which is
> now the "encrypted [data encryption key]" along with which public key it
> was encrypted with for (so a map of [publicKey] =
> encryptedDataEncryptionKey) as a chain.   Other patterns can be implemented
> but this is a pretty standard digital enveloping [0] pattern with only 1
> field added. Other patterns should be able to use that field to-do their
> implementation too.
>
> 3) Non-repudiation and long term non-repudiation.
>
> Non-repudiation is proving data hasn't changed.  This is often (if not
> always) done with x509 public certificates (chained to a certificate
> authority).
>
> Long term non-repudiation is what happens when the certificates of the
> certificate authority are expired (or revoked) and everything ever signed
> (ever) with that certificate's public key then becomes "no longer provable
> as ever being authentic".  That is where RFC3126 [1] and RFC3161 [2] come
> in (or worm drives [hardware], etc).
>
> For either (or both) of these it is an operation of the encryptor to
> sign/hash the data (with or without a third-party trusted timestamp of the
> signing event) and encrypt that with their own private key and distribute
> the results (before and after encrypting if required) along with their
> public key. This structure is a bit more complex but feasible, it is a map
> of digital signature formats and the chain of dig sig attestations.  The
> map's key being the method (i.e. CRC32, PKCS7 [3], XmlDigSig [4]) and then
> a list of maps where the key is the "purpose" of the signature (what you're
> attesting to).  As a sibling field to the list, another field for "the attester" as
> bytes (e.g. their PKCS12 [5] for the map of PKCS7 signatures).
>
> 4) Authorization
>
> We sh

Re: [DISCUSS] Kafka Security Specific Features

2014-06-04 Thread Joe Stein
Hey Todd, I think you are right on both points.

Maybe instead of modularizing authorization we could instead support some
feature like being able to associate "labels" for the application specific
items (topic name, reads/writes, delete topic, change config, rate
limiting, etc)  and then accept a file or something with a list of
entitlements (where name is the "label" and value is some class that gets
run with the ability to extend your own or integrate with some other
system).   e.g. you associate with a topic named "hasCreditCardDataInIt" a
label == "PCIDSS".  Your config would be "PCIDSS" = "CheckForPCIDSS" and
in CheckForPCIDSS code you could do functions like "verify the topic is
going over an encrypted channel return true else false", etc, whatever.
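As a rough illustration of the label/entitlement idea above, here is a minimal sketch in Java. All names (EntitlementCheck, LabelEntitlements, the channelEncrypted flag) are hypothetical, not real Kafka APIs; the PCIDSS/CheckForPCIDSS pairing follows the example in the message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: topics carry "labels", and a config maps each label to a
// pluggable check class. Nothing here is an actual Kafka interface.
public class LabelEntitlements {

    // A check is a predicate evaluated against some request context.
    interface EntitlementCheck {
        boolean verify(String topic, boolean channelEncrypted);
    }

    // e.g. config "PCIDSS" = "CheckForPCIDSS": verify the topic is going over
    // an encrypted channel, return true, else false.
    static class CheckForPCIDSS implements EntitlementCheck {
        public boolean verify(String topic, boolean channelEncrypted) {
            return channelEncrypted;
        }
    }

    private final Map<String, String> topicLabels = new HashMap<>();            // topic -> label
    private final Map<String, EntitlementCheck> entitlements = new HashMap<>(); // label -> check

    void labelTopic(String topic, String label) { topicLabels.put(topic, label); }
    void registerCheck(String label, EntitlementCheck check) { entitlements.put(label, check); }

    boolean allowed(String topic, boolean channelEncrypted) {
        String label = topicLabels.get(topic);
        if (label == null) return true; // unlabeled topics pass through
        EntitlementCheck check = entitlements.get(label);
        return check == null || check.verify(topic, channelEncrypted);
    }

    public static void main(String[] args) {
        LabelEntitlements e = new LabelEntitlements();
        e.labelTopic("hasCreditCardDataInIt", "PCIDSS");
        e.registerCheck("PCIDSS", new CheckForPCIDSS());
        System.out.println(e.allowed("hasCreditCardDataInIt", false)); // false
        System.out.println(e.allowed("hasCreditCardDataInIt", true));  // true
    }
}
```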

- Joe Stein


On Tue, Jun 3, 2014 at 6:12 PM, Todd Palino 
wrote:

> I think that's one option. What I would offer here is that we need to
> separate out the concepts of authorization and authentication.
> Authentication should definitely be modular, so that we can plug in
> appropriate schemes depending on the organization. For example, you may
> want client certificates, I may want radius, and someone else is going to
> want LDAP.
>
> Authorization is the other piece that's needed, and that could be
> internal. Since what you're authorizing (topic name, read or write, maybe
> rate limiting) is specific to the application, it may not make sense to
> modularize it.
>
> -Todd
>
> On 6/3/14, 1:03 PM, "Robert Rodgers"  wrote:
>
> >... client specific presented information, signed in some way, listing
> >topic permissions.  read, write, list.
> >
> >TLS lends itself to client certificates.
> >
> >
> >On Jun 3, 2014, at 12:57 PM, Joe Stein  wrote:
> >
> >> 4) Authorization
> >>
> >> We should have a policy of "404" for data, topics, partitions (etc) if
> >> authenticated connections do not have access.  In "secure mode" any non
> >> authenticated connections should get a "404" type message on everything.
> >> Knowing "something is there" is a security risk in many use cases.  So if
> >> you don't have access you don't even see it.  Baking "that" into Kafka
> >> along with some interface for entitlement (access management) systems
> >> (pretty standard) is all that I think needs to be done to the core
> >>project.
> >> I want to tackle this item later in the year, after summer, once the other
> >> three are complete.
> >
>
>


Re: [DISCUSS] Kafka Security Specific Features

2014-06-03 Thread Todd Palino
I think that's one option. What I would offer here is that we need to
separate out the concepts of authorization and authentication.
Authentication should definitely be modular, so that we can plug in
appropriate schemes depending on the organization. For example, you may
want client certificates, I may want radius, and someone else is going to
want LDAP.

Authorization is the other piece that's needed, and that could be
internal. Since what you're authorizing (topic name, read or write, maybe
rate limiting) is specific to the application, it may not make sense to
modularize it.
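To make the modular-authentication point above concrete, here is a minimal sketch of what a pluggable authenticator could look like. The Authenticator interface and StaticTokenAuthenticator are purely illustrative assumptions, not proposed Kafka APIs; a real plugin would wrap client certificates, RADIUS, LDAP, etc.

```java
import java.util.Optional;

// Hypothetical sketch: authentication is a pluggable interface that yields a
// principal name, which the broker's internal authorization layer then consumes.
public class PluggableAuth {

    interface Authenticator {
        // Returns the authenticated principal, or empty if credentials fail.
        Optional<String> authenticate(byte[] credentials);
    }

    // One possible plugin: a pre-shared token (a stand-in for a real scheme
    // such as client certs, RADIUS, or LDAP bind).
    static class StaticTokenAuthenticator implements Authenticator {
        private final String token;
        StaticTokenAuthenticator(String token) { this.token = token; }
        public Optional<String> authenticate(byte[] credentials) {
            return token.equals(new String(credentials))
                    ? Optional.of("client-1") : Optional.empty();
        }
    }

    public static void main(String[] args) {
        Authenticator auth = new StaticTokenAuthenticator("s3cret");
        System.out.println(auth.authenticate("s3cret".getBytes()).orElse("DENIED"));
        System.out.println(auth.authenticate("wrong".getBytes()).orElse("DENIED"));
    }
}
```

The design point is the split: swapping the Authenticator implementation never touches the authorization code, which stays internal to the broker.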

-Todd

On 6/3/14, 1:03 PM, "Robert Rodgers"  wrote:

>... client specific presented information, signed in some way, listing
>topic permissions.  read, write, list.
>
>TLS lends itself to client certificates.
>
>
>On Jun 3, 2014, at 12:57 PM, Joe Stein  wrote:
>
>> 4) Authorization
>> 
>> We should have a policy of "404" for data, topics, partitions (etc) if
>> authenticated connections do not have access.  In "secure mode" any non
>> authenticated connections should get a "404" type message on everything.
> >> Knowing "something is there" is a security risk in many use cases.  So if
>> you don't have access you don't even see it.  Baking "that" into Kafka
>> along with some interface for entitlement (access management) systems
>> (pretty standard) is all that I think needs to be done to the core
>>project.
> >> I want to tackle this item later in the year, after summer, once the other
> >> three are complete.
>



Re: [DISCUSS] Kafka Security Specific Features

2014-06-03 Thread Robert Rodgers
... client specific presented information, signed in some way, listing topic 
permissions.  read, write, list.

TLS lends itself to client certificates.


On Jun 3, 2014, at 12:57 PM, Joe Stein  wrote:

> 4) Authorization
> 
> We should have a policy of "404" for data, topics, partitions (etc) if
> authenticated connections do not have access.  In "secure mode" any non
> authenticated connections should get a "404" type message on everything.
> Knowing "something is there" is a security risk in many use cases.  So if
> you don't have access you don't even see it.  Baking "that" into Kafka
> along with some interface for entitlement (access management) systems
> (pretty standard) is all that I think needs to be done to the core project.
> I want to tackle this item later in the year, after summer, once the other
> three are complete.



[DISCUSS] Kafka Security Specific Features

2014-06-03 Thread Joe Stein
Hi, I wanted to re-ignite the discussion around Apache Kafka Security.  This
is a huge bottleneck (non-starter in some cases) for a lot of organizations
(due to regulatory, compliance and other requirements). Below are my
suggestions for specific changes in Kafka to accommodate security
requirements.  This comes from what folks are doing "in the wild" to
workaround and implement security with Kafka as it is today and also what I
have discovered from organizations about their blockers. It also picks up
from the wiki (which I should have time to update later in the week based
on the below and feedback from the thread).

1) Transport Layer Security (i.e. SSL)

This also includes client authentication in addition to the in-transit
security layer.  This work has been picked up here
https://issues.apache.org/jira/browse/KAFKA-1477 and I do appreciate any
thoughts, comments, feedback, tomatoes, whatever for this patch.  It is a
pickup from the fork of the work first done here
https://github.com/relango/kafka/tree/kafka_security.

2) Data encryption at rest.

This is very important and something that can be facilitated within the
wire protocol. It requires an additional map data structure for the
"encrypted [data encryption key]". With this map (either in your object or
in the wire protocol) you can store the dynamically generated symmetric key
(for each message) and then encrypt the data using that dynamically
generated key.  You then encrypt that data encryption key with the public key
of each party expected to be able to decrypt the message.  You store each
public-key-encrypted symmetric key (now the "encrypted [data encryption key]")
along with the public key it was encrypted with, as a map of [publicKey] =
encryptedDataEncryptionKey, forming a chain.  Other patterns can be implemented
but this is a pretty standard digital enveloping [0] pattern with only one
field added. Other patterns should be able to use that field to do their
implementations too.
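A small sketch of that digital-enveloping pattern, using standard JCA classes. Key sizes, algorithm strings, and the two recipients are illustrative assumptions (and ECB is used only for brevity; a real implementation would pick an authenticated mode like GCM and OAEP padding).

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import java.security.*;
import java.util.HashMap;
import java.util.Map;

// Sketch only: per-message symmetric key, encrypted once per recipient public key.
public class DigitalEnvelope {
    public static void main(String[] args) throws Exception {
        byte[] message = "credit card data".getBytes();

        // Two recipients, each with an RSA key pair.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair alice = kpg.generateKeyPair();
        KeyPair bob = kpg.generateKeyPair();

        // 1. Dynamically generate a symmetric data encryption key for this message.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey dek = kg.generateKey();

        // 2. Encrypt the data with that key.
        Cipher aes = Cipher.getInstance("AES");
        aes.init(Cipher.ENCRYPT_MODE, dek);
        byte[] ciphertext = aes.doFinal(message);

        // 3. The "encrypted [data encryption key]" map: publicKey -> encrypted DEK.
        Map<PublicKey, byte[]> encryptedDeks = new HashMap<>();
        Cipher rsa = Cipher.getInstance("RSA");
        for (PublicKey pk : new PublicKey[]{alice.getPublic(), bob.getPublic()}) {
            rsa.init(Cipher.ENCRYPT_MODE, pk);
            encryptedDeks.put(pk, rsa.doFinal(dek.getEncoded()));
        }

        // 4. A recipient unwraps the DEK with their private key and decrypts.
        rsa.init(Cipher.DECRYPT_MODE, bob.getPrivate());
        byte[] dekBytes = rsa.doFinal(encryptedDeks.get(bob.getPublic()));
        aes.init(Cipher.DECRYPT_MODE, new SecretKeySpec(dekBytes, "AES"));
        System.out.println(new String(aes.doFinal(ciphertext)));
    }
}
```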

3) Non-repudiation and long term non-repudiation.

Non-repudiation is proving data hasn't changed.  This is often (if not
always) done with x509 public certificates (chained to a certificate
authority).

Long term non-repudiation is what happens when the certificates of the
certificate authority are expired (or revoked) and everything ever signed
(ever) with that certificate's public key then becomes "no longer provable
as ever being authentic".  That is where RFC3126 [1] and RFC3161 [2] come
in (or worm drives [hardware], etc).

For either (or both) of these it is an operation of the encryptor to
sign/hash the data (with or without a third-party trusted timestamp of the
signing event) and encrypt that with their own private key and distribute
the results (before and after encrypting if required) along with their
public key. This structure is a bit more complex but feasible, it is a map
of digital signature formats and the chain of dig sig attestations.  The
map's key being the method (i.e. CRC32, PKCS7 [3], XmlDigSig [4]) and then
a list of maps where the key is the "purpose" of the signature (what you're
attesting to).  As a sibling field to the list, another field for "the attester" as
bytes (e.g. their PKCS12 [5] for the map of PKCS7 signatures).
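One way to read that structure is as nested maps. The sketch below is a rough Java rendering of it, not a wire-format proposal: the Attestation class, the "SHA256withRSA" method key, and the "content-unchanged" purpose string are all illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.*;
import java.util.*;

// Sketch: method -> list of {purpose -> signature}, with the attester's
// identity bytes as a sibling field, as described in the message above.
public class AttestationChain {

    static class Attestation {
        final Map<String, List<Map<String, byte[]>>> signatures = new HashMap<>();
        byte[] attester; // e.g. the signer's certificate or PKCS12 bytes
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "message payload".getBytes(StandardCharsets.UTF_8);

        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair signer = kpg.generateKeyPair();

        // Sign the data; the purpose records what is being attested to.
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(signer.getPrivate());
        sig.update(data);
        byte[] signature = sig.sign();

        Attestation att = new Attestation();
        att.signatures
           .computeIfAbsent("SHA256withRSA", k -> new ArrayList<>())
           .add(Map.of("content-unchanged", signature));
        att.attester = signer.getPublic().getEncoded();

        // A verifier walks the chain and checks each signature for its purpose.
        Signature verify = Signature.getInstance("SHA256withRSA");
        verify.initVerify(signer.getPublic());
        verify.update(data);
        byte[] stored = att.signatures.get("SHA256withRSA").get(0).get("content-unchanged");
        System.out.println(verify.verify(stored));
    }
}
```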

4) Authorization

We should have a policy of "404" for data, topics, partitions (etc) if
authenticated connections do not have access.  In "secure mode" any non
authenticated connections should get a "404" type message on everything.
Knowing "something is there" is a security risk in many use cases.  So if
you don't have access you don't even see it.  Baking "that" into Kafka
along with some interface for entitlement (access management) systems
(pretty standard) is all that I think needs to be done to the core project.
 I want to tackle this item later in the year, after summer, once the other
three are complete.
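To sketch the "404" policy in code: the key property is that a topic the caller cannot access and a topic that does not exist produce identical responses, so existence cannot be probed. Everything here (NotFoundPolicy, the principal strings, the topic ACL map) is illustrative, not a Kafka API.

```java
import java.util.Map;
import java.util.Set;

// Sketch of the "404" policy: unauthorized and nonexistent look the same.
public class NotFoundPolicy {

    enum Response { OK, NOT_FOUND }

    private final Map<String, Set<String>> topicReaders; // topic -> allowed principals

    NotFoundPolicy(Map<String, Set<String>> topicReaders) {
        this.topicReaders = topicReaders;
    }

    Response describeTopic(String principal, String topic) {
        Set<String> readers = topicReaders.get(topic);
        // Missing topic, unauthenticated caller, and unauthorized caller are
        // deliberately indistinguishable to the client.
        if (readers == null || principal == null || !readers.contains(principal)) {
            return Response.NOT_FOUND;
        }
        return Response.OK;
    }

    public static void main(String[] args) {
        NotFoundPolicy p = new NotFoundPolicy(Map.of("secureTopic", Set.of("alice")));
        System.out.println(p.describeTopic("alice", "secureTopic"));   // authorized
        System.out.println(p.describeTopic("mallory", "secureTopic")); // no access
        System.out.println(p.describeTopic("alice", "noSuchTopic"));   // no such topic
    }
}
```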

I look forward to thoughts on this and anyone else interested in working
with us on these items.

[0]
http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/what-is-a-digital-envelope.htm
[1] http://tools.ietf.org/html/rfc3126
[2] http://tools.ietf.org/html/rfc3161
[3]
http://www.emc.com/emc-plus/rsa-labs/standards-initiatives/pkcs-7-cryptographic-message-syntax-standar.htm
[4] http://en.wikipedia.org/wiki/XML_Signature
[5] http://en.wikipedia.org/wiki/PKCS_12

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/