Returning an error code to the producer when it tries writing to a topic being deleted

2016-03-19 Thread eugene miretsky
Hi,

When a topic is marked for deletion it gets put into a queue and is then
processed by the controller's delete topic thread. It may take a while for
the topic to actually get deleted (if a broker hosting one of its
replicas is down), but during that time the producer can continue writing
messages to the topic. Moreover, if one deletes a topic manually (and
forgets to clear /admin/delete_topics in Zookeeper), it's possible to get
into a state where the topic doesn't exist but is still marked for
deletion. A producer writing to the same topic will auto-create the topic,
but the Controller will not create any partitions for it because it's
marked for deletion (PartitionStateMachine.scala:513). This results in
a topic with no partitions and a lot of confusing errors
(LeaderNotAvailableException, for example).

Generally, when would it make sense to continue writing to a topic that is
marked for deletion?

I propose adding two options:

1) Notify the producer when a topic it's writing to is marked for deletion.
2) Prevent a topic from being re-created for some time after getting
deleted.

I suppose both would require adding new error codes, which is a fairly big
change.
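
To make the two options concrete, here is a minimal sketch in Python of how
a broker-side check might behave. Everything in it is hypothetical: the
error code names, the TopicRegistry class and the cool-down value are made
up for illustration and are not part of Kafka's protocol or code base.

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class ProduceError(Enum):
    # Hypothetical codes for illustration; not taken from the Kafka protocol.
    NONE = 0
    TOPIC_MARKED_FOR_DELETION = 1
    TOPIC_RECENTLY_DELETED = 2


@dataclass
class TopicRegistry:
    """Toy stand-in for the broker/controller view of topic state."""
    recreate_cooldown_s: float = 60.0
    live: set = field(default_factory=set)
    marked_for_deletion: set = field(default_factory=set)
    deleted_at: dict = field(default_factory=dict)

    def mark_for_deletion(self, topic):
        # The marker is kept even if the topic itself is already gone,
        # which mirrors the "deleted manually" situation described above.
        self.marked_for_deletion.add(topic)

    def complete_deletion(self, topic, now=None):
        now = time.monotonic() if now is None else now
        self.live.discard(topic)
        self.marked_for_deletion.discard(topic)
        self.deleted_at[topic] = now

    def check_produce(self, topic, now=None):
        """Return the error code a produce request to `topic` would get."""
        now = time.monotonic() if now is None else now
        # Option 1: tell the producer the topic is on its way out.
        if topic in self.marked_for_deletion:
            return ProduceError.TOPIC_MARKED_FOR_DELETION
        # Option 2: refuse to re-create the topic during the cool-down.
        deleted = self.deleted_at.get(topic)
        if deleted is not None and now - deleted < self.recreate_cooldown_s:
            return ProduceError.TOPIC_RECENTLY_DELETED
        # Otherwise auto-creation proceeds as it does today.
        self.deleted_at.pop(topic, None)
        self.live.add(topic)
        return ProduceError.NONE
```

With this model, a producer that keeps writing gets an explicit error while
the topic is marked for deletion and during the cool-down, rather than
silently re-creating a topic with no partitions.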

Thoughts?

Cheers,
Eugene


Gauging Interest in adding Encryption to Kafka

2015-07-30 Thread eugene miretsky
Hi,

Based on the security wiki page
<https://cwiki.apache.org/confluence/display/KAFKA/Security>, encryption of
data at rest is out of scope for the time being. However, we are
implementing encryption in Kafka and would like to see if there is
interest in submitting a patch for it.

I suppose that one way to implement encryption would be to add an
'encrypted key' field to the Message/MessageSet structures in the
wire protocol - however, this is a very big and fundamental change.

A simpler way to add encryption support would be one of:
1) A custom Serializer, but it wouldn't be compatible with other custom
serializers (Avro, etc.)
2) Add a step in KafkaProducer after serialization to encrypt the data
before it is submitted to the accumulator (encryption is done in the
submitting thread, not in the producer I/O thread)
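
A rough sketch of option 2, written in Python rather than against the real
Java producer. SketchProducer, ToyEncryptor and json_serializer are
hypothetical names, and the cipher is a placeholder (SHA-256 keystream plus
an HMAC tag) used only to keep the example self-contained. It is not
secure; a real patch would presumably plug in an authenticated cipher such
as AES-GCM from a vetted library.

```python
import hashlib
import hmac
import json
import os


def json_serializer(value):
    """Stand-in for any existing serializer (Avro, JSON, ...)."""
    return json.dumps(value, sort_keys=True).encode("utf-8")


class ToyEncryptor:
    """Placeholder cipher for illustration only. NOT for production use."""

    def __init__(self, key: bytes):
        # Derive separate sub-keys for the keystream and the tag.
        self._enc_key = hashlib.sha256(b"enc" + key).digest()
        self._mac_key = hashlib.sha256(b"mac" + key).digest()

    def _keystream(self, nonce, length):
        out = b""
        counter = 0
        while len(out) < length:
            block = self._enc_key + nonce + counter.to_bytes(8, "big")
            out += hashlib.sha256(block).digest()
            counter += 1
        return out[:length]

    def encrypt(self, plaintext):
        nonce = os.urandom(16)
        stream = self._keystream(nonce, len(plaintext))
        body = bytes(a ^ b for a, b in zip(plaintext, stream))
        tag = hmac.new(self._mac_key, nonce + body, hashlib.sha256).digest()
        return nonce + tag + body

    def decrypt(self, payload):
        nonce, tag, body = payload[:16], payload[16:48], payload[48:]
        expected = hmac.new(self._mac_key, nonce + body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("authentication tag mismatch")
        stream = self._keystream(nonce, len(body))
        return bytes(a ^ b for a, b in zip(body, stream))


class SketchProducer:
    """Hypothetical producer: serialize, then encrypt, then enqueue."""

    def __init__(self, serializer, encryptor=None):
        self._serializer = serializer
        self._encryptor = encryptor
        self.accumulator = []  # stands in for the record accumulator

    def send(self, topic, value):
        payload = self._serializer(value)
        if self._encryptor is not None:
            # Runs in the caller's (submitting) thread, before batching.
            payload = self._encryptor.encrypt(payload)
        self.accumulator.append((topic, payload))
```

Because the encryptor only ever sees bytes, it composes with whatever
serializer is configured, which is the compatibility problem with option 1.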

Is there interest in adding #2 to Kafka?

Cheers,
Eugene


Re: Gauging Interest in adding Encryption to Kafka

2015-07-31 Thread eugene miretsky
I think that Hadoop and Cassandra do [1] (Transparent Encryption).

We're doing [2] (on a side note, for [2] you still need authentication on
the producer side - you don't want an unauthorized user writing garbage).
Right now we have the 'user' doing the encryption and submitting raw bytes
to the producer. I was suggesting implementing an encryptor in the
producer itself - I think it's cleaner and can be reused by other users
(instead of each having to do their own encryption).

Cheers,
Eugene

On Fri, Jul 31, 2015 at 4:04 PM, Jiangjie Qin 
wrote:

> I think the goal here is to make the actual message stored on the broker
> encrypted, because once we have SSL, the transmission will be encrypted.
>
> In general there might be two approaches:
> 1. The broker does the encryption/decryption
> 2. The client does the encryption/decryption
>
> From a performance point of view, I would prefer [2]. It is just that in
> that case, maybe the user does not necessarily need to use SSL anymore
> because the data would be encrypted anyway.
>
> If we let the client do the encryption, there are also two ways to do so -
> either we let the producer take an encryptor or users can do
> serialization/encryption outside the producer and send raw bytes. The only
> difference between the two might be flexibility. For example, if someone
> wants to know the actual bytes of a message that got sent over the wire,
> doing it outside the producer would probably be preferable.
>
> Jiangjie (Becket) Qin
>
> On Thu, Jul 30, 2015 at 12:16 PM, eugene miretsky
> <eugene.miret...@gmail.com> wrote:
>
> > Hi,
> >
> > Based on the security wiki page
> > <https://cwiki.apache.org/confluence/display/KAFKA/Security>, encryption of
> > data at rest is out of scope for the time being. However, we are
> > implementing encryption in Kafka and would like to see if there is
> > interest in submitting a patch for it.
> >
> > I suppose that one way to implement encryption would be to add an
> > 'encrypted key' field to the Message/MessageSet structures in the
> > wire protocol - however, this is a very big and fundamental change.
> >
> > A simpler way to add encryption support would be one of:
> > 1) A custom Serializer, but it wouldn't be compatible with other custom
> > serializers (Avro, etc.)
> > 2) Add a step in KafkaProducer after serialization to encrypt the data
> > before it is submitted to the accumulator (encryption is done in the
> > submitting thread, not in the producer I/O thread)
> >
> > Is there interest in adding #2 to Kafka?
> >
> > Cheers,
> > Eugene
> >
>


When to use advertised.host.name?

2016-02-18 Thread eugene miretsky
The FAQ says:

"When a broker starts up, it registers its ip/port in ZK. You need to make
sure the registered ip is consistent with what's listed in
metadata.broker.list in the producer config. By default, the registered ip
is given by InetAddress.getLocalHost.getHostAddress. Typically, this should
return the real ip of the host. However, sometimes (e.g., in EC2), the
returned ip is an internal one and can't be connected to from outside. The
solution is to explicitly set the host ip to be registered in ZK by setting
the "hostname" property in server.properties. In another rare case where
the binding host/port is different from the host/port for client
connection, you can set advertised.host.name and advertised.port for client
connection."

Can somebody give an example for that "rare case" where the binding
host/port is different from the host/port for client connection?
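
For reference, my reading of the fallback order in the FAQ excerpt can be
sketched in Python as below. registered_endpoint is a hypothetical helper,
not broker code, and the addresses in the usage note are made up. The
property names (host.name, port, advertised.host.name, advertised.port) are
the ones from server.properties, and 9092 is the default broker port.

```python
import socket


def registered_endpoint(props, local_address=None):
    """Return the (host, port) that clients would be told to connect to.

    Order of preference, as I understand the FAQ:
    advertised.* settings, then host.name/port, then the local address.
    """
    if local_address is None:
        # Rough analogue of InetAddress.getLocalHost.getHostAddress.
        local_address = socket.gethostbyname(socket.gethostname())
    host = (props.get("advertised.host.name")
            or props.get("host.name")
            or local_address)
    port = int(props.get("advertised.port") or props.get("port") or 9092)
    return host, port
```

One situation that would seem to fit the "rare case" is a broker behind NAT
or a port mapping: it binds to host.name=10.0.0.5 and port=9092, while
clients have to connect through kafka.example.com:19092, so those two
values go into advertised.host.name and advertised.port.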

Cheers,
Eugene


How to move a broker to a new host/ip

2016-02-18 Thread eugene miretsky
Hi!

If I change the host.name or port properties of a broker, what's the
easiest way to get Zookeeper to pick up the changes? Note that no data
is being moved; only the broker's configuration changes, to use a
different port, network interface or DNS address.

Cheers,
Eugene


[jira] [Commented] (KAFKA-1683) Implement a "session" concept in the socket server

2015-08-10 Thread Eugene Miretsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681149#comment-14681149
 ] 

Eugene Miretsky commented on KAFKA-1683:


Would this patch include the ability to authorize as different users? Or will
it be handled in another JIRA?

> Implement a "session" concept in the socket server
> --------------------------------------------------
>
> Key: KAFKA-1683
> URL: https://issues.apache.org/jira/browse/KAFKA-1683
> Project: Kafka
> Issue Type: Sub-task
> Components: security
> Affects Versions: 0.9.0
> Reporter: Jay Kreps
> Assignee: Gwen Shapira
> Fix For: 0.8.3
>
> Attachments: KAFKA-1683.patch, KAFKA-1683.patch
>
>
> To implement authentication we need a way to keep track of some things
> between requests. The initial use for this would be remembering the
> authenticated user/principal info, but likely more uses would come up (for
> example we will also need to remember whether and which encryption or
> integrity measures are in place on the socket so we can wrap and unwrap
> writes and reads).
> I was thinking we could just add a Session object that might have a user
> field. The session object would need to get added to RequestChannel.Request
> so it is passed down to the API layer with each request.
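
A minimal sketch of the idea in the description above, in Python rather
than the broker's Scala. Session and Request follow the names used in the
issue. Everything else is an assumption made for illustration: the
`encrypted` field, the "ANONYMOUS" default, SocketServerSketch and the
handle function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Per-connection state remembered between requests."""
    user: str = "ANONYMOUS"   # assumed default for unauthenticated sockets
    encrypted: bool = False   # assumed example of a later "more uses" field


@dataclass(frozen=True)
class Request:
    """Toy counterpart of RequestChannel.Request carrying its session."""
    connection_id: str
    payload: bytes
    session: Session


class SocketServerSketch:
    """Keeps one Session per connection and attaches it to each request."""

    def __init__(self):
        self._sessions = {}

    def authenticate(self, connection_id, user, encrypted=False):
        # Called once per connection, after whatever handshake is in use.
        self._sessions[connection_id] = Session(user=user, encrypted=encrypted)

    def receive(self, connection_id, payload):
        session = self._sessions.get(connection_id, Session())
        return Request(connection_id, payload, session)


def handle(request, allowed_users):
    """API layer: authorize using the identity that came with the request."""
    return request.session.user in allowed_users
```

The API layer never looks at the socket; it only sees the identity that the
network layer attached to the request.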



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1683) Implement a "session" concept in the socket server

2015-08-12 Thread Eugene Miretsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693676#comment-14693676
 ] 

Eugene Miretsky commented on KAFKA-1683:


My apologies, I didn't word the question properly. I think that KAFKA-1686
solves it - Kerberos support will allow authenticating as a specific user, and
storing the user identity in a session for later authorization.



[jira] [Comment Edited] (KAFKA-1683) Implement a "session" concept in the socket server

2015-08-12 Thread Eugene Miretsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693676#comment-14693676
 ] 

Eugene Miretsky edited comment on KAFKA-1683 at 8/12/15 3:36 PM:
-

My apologies, poorly worded question. I think that KAFKA-1686 solves it - 
Kerberos support will allow authenticating as a specific user, and storing the 
user identity in a session for later authorization.




[jira] [Comment Edited] (KAFKA-1683) Implement a "session" concept in the socket server

2015-08-12 Thread Eugene Miretsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693676#comment-14693676
 ] 

Eugene Miretsky edited comment on KAFKA-1683 at 8/12/15 3:39 PM:
-

My apologies, poorly worded question. Basically was asking where the user 
identity in the session will come from - 1-way SSL doesn't authenticate the 
client.  I think that KAFKA-1686 will solve it - Kerberos support will allow 
authenticating as a specific user, and storing the user identity in a session 
for later authorization.




[jira] [Comment Edited] (KAFKA-1683) Implement a "session" concept in the socket server

2015-08-12 Thread Eugene Miretsky (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693676#comment-14693676
 ] 

Eugene Miretsky edited comment on KAFKA-1683 at 8/12/15 3:40 PM:
-

My apologies, poorly worded question. Basically was asking where the 
user/client identity in the session will come from - 1-way SSL (KAFKA-1690) 
doesn't authenticate the client.  I think that KAFKA-1686 will solve it - 
Kerberos support will allow authenticating as a specific user, and storing the 
user identity in a session for later authorization.

