Re: [DISCUSS] KIP-12 - Kafka Sasl/Kerberos implementation

Gari Singh Fri, 24 Apr 2015 08:38:17 -0700

Sorry for jumping in late, but I have been trying to follow this chain as
well as the updates to the KIP.  I don't mean to seem critical and I may be
misunderstanding the proposed implementation, but there seems to be some
confusion around terminology (at least from my perspective) and I am not
sure I actually understand what is going to be implemented and where the
plugin point(s) will be.


The KIP does not really mention SASL interfaces in any detail.  The way I
read the KIP, it seems to be more about providing a Kerberos mechanism
via GSSAPI than about providing pluggable SASL support.  Perhaps it
is the naming convention ("GSS" is used where I would have thought SASL
would have been used).

Maybe I am missing something?

SASL leverages GSSAPI for the Kerberos mechanism, but SASL and the GSSAPI
are not the same thing.  Also, SSL/TLS is independent of both SASL and
GSSAPI although you can use either SASL or GSSAPI over TLS.

I would expect something more along the lines of having a SASLChannel and
SASL providers (along with pluggable Authentication providers which
enumerate which SASL mechanisms they support).

I have only ever attempted to really implement SASL support once, but I
have played with the SASL APIs and am familiar with how LDAP, SMTP and AMQP
use SASL.

This is my understanding of how SASL is typically implemented:

1) Client decides whether or not to use TLS or plain TCP  (of course this
depends on what the server provides).

My current understanding is that Kafka will support three types of server
sockets:

- current socket for backwards compatibility (i.e. no TLS and no SASL)
- TLS socket
- SASL socket

I would also have thought that SASL mechanisms would be supported on the TLS
socket as well, but that does not seem to be the case (or at least it is not
clear either way).  I know the decision was made to have separate TLS and
SASL sockets, but I think that we need to support SASL over TLS as well.
You can do this over a single socket if you use the "startTLS" metaphor.

2) There is typically some type of application protocol specific handshake

This is usually used to negotiate whether or not to use SASL and/or
negotiate which SASL mechanisms are supported by the server.  This is not
strictly required, although the SASL spec does mention that the client
should be able to get a list of SASL mechanisms supported by the server.

For example, SMTP does this with the client sending an EHLO and the server
responding with an AUTH line that lists the mechanisms it supports.

Personally I like the AMQP model (which by the way might also help with
backwards compatibility using a single socket).  For AMQP, the initial
frame is basically

AMQP.%d0.1.0.0  (AMQP, TCP, AMQP protocol 1.0.0)
AMQP.%d3.1.0.0 (AMQP, SASL)

I think you get the idea.  So we could do something similar for Kafka

KAFKA.[protocol type].[protocol version major].[protocol version
minor].[protocol version revision]

So for example, we could have protocol types of

0 - open
1 - SASL

and do this over either a TCP or TLS socket.

Of course, if you stick with having a dedicated SASL socket, you could just
start out with the option of the client sending something like "AUTH" as
its first message (with the option of appending the initial SASL payload as
well).
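
As a rough illustration of the preamble idea above, here is a minimal Java
sketch.  Everything in it is an assumption made for the example: the class
name KafkaPreamble, the 9-byte layout ("KAFKA" followed by one byte each for
type, major, minor and revision), and the type codes 0 and 1 taken from the
list above.  Nothing like this exists in Kafka or in the KIP.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical connection preamble: "KAFKA" + type + major + minor + revision. */
final class KafkaPreamble {
    static final byte[] MAGIC = "KAFKA".getBytes(StandardCharsets.US_ASCII);
    static final int TYPE_OPEN = 0; // no SASL exchange follows
    static final int TYPE_SASL = 1; // a SASL exchange follows
    static final int LENGTH = MAGIC.length + 4;

    final int type;
    final int major;
    final int minor;
    final int revision;

    KafkaPreamble(int type, int major, int minor, int revision) {
        this.type = type;
        this.major = major;
        this.minor = minor;
        this.revision = revision;
    }

    /** Serializes the preamble into the bytes the client would send first. */
    byte[] encode() {
        ByteBuffer buf = ByteBuffer.allocate(LENGTH);
        buf.put(MAGIC);
        buf.put((byte) type).put((byte) major).put((byte) minor).put((byte) revision);
        return buf.array();
    }

    /** Parses the first bytes read from a new connection. */
    static KafkaPreamble decode(byte[] bytes) {
        if (bytes.length != LENGTH
                || !Arrays.equals(Arrays.copyOf(bytes, MAGIC.length), MAGIC)) {
            throw new IllegalArgumentException("not a KAFKA preamble");
        }
        int o = MAGIC.length;
        return new KafkaPreamble(bytes[o] & 0xff, bytes[o + 1] & 0xff,
                                 bytes[o + 2] & 0xff, bytes[o + 3] & 0xff);
    }
}
```

With a layout like this, a server could tell an old client from a new one by
checking whether the first bytes match the magic string, which is where the
single-socket backwards compatibility would come from.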

3) After the protocol handshake, there is an application specific wrapper
for carrying SASL frames for the challenges and responses.

If the mechanism selected is Kerberos, it is at this point that SASL uses
the GSSAPI for the exchange (of course wrapped in the app specific SASL
frames).  If you are using PLAIN, there is a defined format to be used
(RFC 4616).
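
For reference, the PLAIN message defined by RFC 4616 is a single client
message of the form [authzid] NUL authcid NUL passwd, encoded as UTF-8.  A
small Java sketch of the client side follows; the class and method names
are made up for the example.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative encoder for the RFC 4616 PLAIN message (names are hypothetical). */
final class PlainMessage {
    /**
     * Builds "[authzid] NUL authcid NUL passwd".
     * A null authzid is sent as an empty string, meaning "act as authcid".
     */
    static byte[] encode(String authzid, String authcid, String password) {
        String message = (authzid == null ? "" : authzid)
                + '\0' + authcid
                + '\0' + password;
        return message.getBytes(StandardCharsets.UTF_8);
    }
}
```

The resulting bytes are what would be carried inside the application
specific SASL frame described in step 3.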

Java of course provides support for various mechanisms in its default SASL
client and server providers.  For example, the client supports PLAIN, but
we would need to implement a "PlainSaslServer"  (which we could also tie to
a username/password based authentication provider as well).
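
To give an idea of the size of such a "PlainSaslServer", here is a sketch
against the standard javax.security.sasl.SaslServer interface.  It is an
assumption-laden example rather than a proposal for the actual class: the
BiPredicate hook standing in for a username/password authentication provider
is hypothetical, and registering the class through a SaslServerFactory and a
security provider is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.BiPredicate;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslException;
import javax.security.sasl.SaslServer;

/** Sketch of a server side for the PLAIN mechanism (RFC 4616). */
final class PlainSaslServer implements SaslServer {
    // Hypothetical hook: (username, password) -> true if the credentials are valid.
    private final BiPredicate<String, String> verifier;
    private String authorizationId;
    private boolean complete;

    PlainSaslServer(BiPredicate<String, String> verifier) {
        this.verifier = verifier;
    }

    @Override public String getMechanismName() { return "PLAIN"; }

    @Override public byte[] evaluateResponse(byte[] response) throws SaslException {
        // Message layout: [authzid] NUL authcid NUL passwd
        String[] parts = new String(response, StandardCharsets.UTF_8).split("\0", -1);
        if (parts.length != 3 || parts[1].isEmpty()) {
            throw new SaslException("malformed PLAIN response");
        }
        // Simplification: acting as a different identity is not supported here.
        if (!parts[0].isEmpty() && !parts[0].equals(parts[1])) {
            throw new SaslException("authzid must match authcid in this sketch");
        }
        if (!verifier.test(parts[1], parts[2])) {
            throw new SaslException("authentication failed");
        }
        authorizationId = parts[1];
        complete = true;
        return new byte[0]; // PLAIN is a single round: no further challenge
    }

    @Override public boolean isComplete() { return complete; }

    @Override public String getAuthorizationID() {
        if (!complete) throw new IllegalStateException("authentication not complete");
        return authorizationId;
    }

    // PLAIN negotiates no security layer, so wrap/unwrap are never valid.
    @Override public byte[] unwrap(byte[] incoming, int offset, int len) {
        throw new IllegalStateException("PLAIN has no security layer");
    }

    @Override public byte[] wrap(byte[] outgoing, int offset, int len) {
        throw new IllegalStateException("PLAIN has no security layer");
    }

    @Override public Object getNegotiatedProperty(String propName) {
        if (!complete) throw new IllegalStateException("authentication not complete");
        return Sasl.QOP.equals(propName) ? "auth" : null;
    }

    @Override public void dispose() { }
}
```

Because PLAIN sends the password in the clear, a server like this would
only make sense on top of TLS.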

In terms of message level integrity and confidentiality (not to be confused
with transport level security like TLS), SASL also provides for this
(assuming the mechanism supports it).  The SASL library supports this via
the "props" parameter in the "createSaslClient/Server" methods.  So it is
easily possible to support Kerberos with integrity (MIC) or confidentiality
(encryption) over TCP and without either over TLS.
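
As a sketch of how that "props" parameter could be used, the helper below
builds the map with the standard Sasl.QOP key.  The class name, the method
name and the policy of keying the choice on whether TLS is in use are
assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

/** Illustrative helper that builds the "props" map for the SASL factories. */
final class SaslQopProps {
    static Map<String, String> forTransport(boolean tls) {
        Map<String, String> props = new HashMap<>();
        // Sasl.QOP takes a comma-separated list in order of preference:
        //   "auth"      authentication only
        //   "auth-int"  authentication plus integrity
        //   "auth-conf" authentication plus integrity and confidentiality
        // Over TLS the transport already protects the data, so ask for
        // authentication only; over plain TCP prefer confidentiality.
        props.put(Sasl.QOP, tls ? "auth" : "auth-conf,auth-int");
        return props;
    }
}
```

The map would then be passed as the props argument of
Sasl.createSaslClient(mechanisms, authorizationId, protocol, serverName,
props, callbackHandler) or of the matching Sasl.createSaslServer call.
Whether the requested protection is actually applied depends on the
mechanism: GSSAPI supports it, PLAIN does not.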


Hopefully this makes sense and perhaps this is how things are proceeding,
but it was not clear to me that this is what is actually being implemented.

Sorry for the long note.

-- Gari

On Fri, Apr 24, 2015 at 9:34 AM, Sriharsha Chintalapani <[email protected]>
wrote:

> Rajini,
> I am exploring this part right now: supporting PLAINTEXT and SSL as
> protocols, with Kerberos as the authentication mechanism on top of
> plaintext or SSL (for users who want encryption along with an auth
> mechanism). This is mainly motivated by the SASL/GSS-API performance
> issue I see when I enable encryption. I'll update the KIP once I
> finalize this on my side.
> Thanks,
> Harsha
>
>
> On April 24, 2015 at 1:39:14 AM, Rajini Sivaram (
> [email protected]) wrote:
>
> Have there been any discussions around separating out authentication and
> encryption protocols for Kafka endpoints to enable different combinations?
> In our deployment environment, we would like to use TLS for encryption, but
> we don't necessarily want to use certificate-based authentication of
> clients. With the current design, if we want to use an authentication
> mechanism like SASL/plain, it looks like we need to define a new security
> protocol in Kafka which combines SASL/Plain authentication with TLS
> encryption. In KIP-12, it looks like the protocols defined are PLAINTEXT
> (no auth, no encryption), KERBEROS (Kerberos auth, no encryption/Kerberos)
> and SSL(SSL auth/no client auth, SSL encryption). While not all
> combinations of authentication and encryption protocols are likely to be
> useful, the ability to combine different mechanisms without modifying Kafka
> to create combined protocols would make it easier to grow the support for
> new protocols. I wanted to check if this has already been discussed in the
> past.
>
>
>
> Thank you,
>
> Rajini
>
>
>
> On Fri, Apr 24, 2015 at 9:26 AM, Rajini Sivaram <
> [email protected]> wrote:
>
> > Harsha,
> >
> > Thank you for the quick response. (Sorry, I had missed sending this reply
> > to the dev-list earlier.)
> >
> >
> > 1. I am not sure what the new server-side code is going to look like
> > after refactoring under KAFKA-1928. But I was assuming that there would
> > be only one Channel implementation that would be shared by both clients
> > and server. So the ability to run delegated tasks on a different thread
> > would be useful in any case. Even with the server, I imagine the
> > Processor thread is shared by multiple connections with thread affinity
> > for connections, so it might be better not to run potentially
> > long-running delegated tasks on that thread.
> > 2. You may be right that Kafka doesn't need to support renegotiation.
> > The use case I was thinking of was slightly different from the one you
> > described. Periodic renegotiation is sometimes used to refresh
> > encryption keys, especially with weak ciphers. Kafka may not have a
> > requirement to support this at the moment.
> > 3. Graceful close needs close handshake messages to be sent/received
> > to shut down the SSL engine, and this requires managing selection
> > interest based on the SSL engine's close state. It would be good if the
> > base channel/selector class didn't need to be aware of this.
> > 4. Yes, I agree that the choice is between bringing some
> > selection-related code into the channel or some channel-related code
> > into the selector. We found the code neater with the former when the
> > three cases above were implemented. But it is possible that you can
> > handle it differently with the latter, so I am happy to wait until your
> > patch is ready.
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Wed, Apr 22, 2015 at 4:00 PM, Sriharsha Chintalapani <[email protected]>
> > wrote:
> >
> >> 1. *Support for running potentially long-running delegated tasks
> >> outside the network thread*: It is recommended that delegated tasks
> >> indicated by a handshake status of NEED_TASK are run on a separate
> >> thread since they may block (
> >> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).
> >> It is easier to encapsulate this in SSLChannel without any changes to
> >> common code if selection keys are managed within the Channel.
> >>
> >>
> >> This makes sense; I can change the code so it does not run on the
> >> network thread.
> >>
> >> Right now we are doing the handshake as part of the processor (it
> >> shouldn't be in the acceptor), and we have multiple processor threads.
> >> Do we still see this as an issue if it happens on the same thread as
> >> the processor?
> >>
> >>
> >>
> >>
> >> --
> >> Harsha
> >> Sent with Airmail
> >>
> >> On April 22, 2015 at 7:18:17 AM, Sriharsha Chintalapani (
> >> [email protected]) wrote:
> >>
> >> Hi Rajini,
> >> Thanks for the details. I did go through your code. There was a
> >> discussion before about not putting selector-related code into the
> >> channel or extending the selector itself.
> >>
> >> 1. *Support for running potentially long-running delegated tasks
> >> outside the network thread*: It is recommended that delegated tasks
> >> indicated by a handshake status of NEED_TASK are run on a separate
> >> thread since they may block (
> >> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).
> >> It is easier to encapsulate this in SSLChannel without any changes to
> >> common code if selection keys are managed within the Channel.
> >>
> >>
> >> This makes sense; I can change the code so it does not run on the
> >> network thread.
> >>
> >>
> >> 2. *Renegotiation handshake*: During a read operation, handshake status
> >> may indicate that renegotiation is required. It will be good to
> >> encapsulate this state change (and any knowledge of these SSL-specific
> >> state transitions) within SSLChannel. Our experience was that managing
> >> keys and state within the SSLChannel rather than in Selector made this
> >> code neater.
> >>
> >> Do we even want to support renegotiation? This is a case where a
> >> user/client handshakes with the server anonymously but later wants to
> >> present their identity and establish a new SSL session. Our producers
> >> and consumers either present their identity (two-way auth) or not.
> >> Since these are long-running processes, I don't see a case where they
> >> initially establish the session and later present their identity.
> >>
> >>
> >> 3. *Graceful shutdown of SSL connections*: Our experience was that
> >> we could encapsulate all of the logic for shutting down SSLEngine
> >> gracefully within SSLChannel when the selection key and state are owned
> >> and managed by SSLChannel.
> >>
> >>
> >> Can't this be done when channel.close() is called? Is there any reason
> >> to own the selection key?
> >>
> >> 4. *And finally a minor point:* We found that by managing selection key
> >> and selection interests within SSLChannel, protocol-independent Selector
> >> didn't need the concept of handshake at all and all channel state
> >> management and handshake related code could be held in protocol-specific
> >> classes. This may be worth taking into consideration since it makes it
> >> easier for common network layer code to be maintained without any
> >> understanding of the details of individual security protocols.
> >>
> >> The only thing the network code (SocketServer) is aware of is the
> >> channel's isHandshakeComplete: if it is not complete, do the handshake;
> >> otherwise go on to read/write from the channel. Yes, SocketServer needs
> >> to be aware of whether the channel is ready to read or not. But on the
> >> other hand, not many details of the handshake leak into SocketServer.
> >> Either we let the server know that a channel needs a handshake, or we
> >> keep the selectionKey state in the channel, which means we are adding
> >> selector-related code into the channel.
> >>
> >>
> >> Thanks,
> >> Harsha
> >>
> >>
> >> On April 22, 2015 at 3:56:04 AM, Rajini Sivaram (
> >> [email protected]) wrote:
> >>
> >> When we were working on the client-side SSL implementation for Kafka, we
> >> found that returning selection interest from handshake() method wasn't
> >> sufficient to handle some of the SSL sequences. We resorted to managing
> >> the selection key and interest state within SSLChannel to avoid
> >> SSL-specific knowledge escaping out of SSL classes into
> >> protocol-independent network code. The current server-side SSL patch
> >> doesn't address these scenarios yet, but we may want to take these into
> >> account while designing the common Channel class/interface.
> >>
> >> 1. *Support for running potentially long-running delegated tasks
> >> outside the network thread*: It is recommended that delegated tasks
> >> indicated by a handshake status of NEED_TASK are run on a separate
> >> thread since they may block (
> >> http://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html).
> >> It is easier to encapsulate this in SSLChannel without any changes to
> >> common code if selection keys are managed within the Channel.
> >> 2. *Renegotiation handshake*: During a read operation, handshake status
> >> may indicate that renegotiation is required. It will be good to
> >> encapsulate this state change (and any knowledge of these SSL-specific
> >> state transitions) within SSLChannel. Our experience was that managing
> >> keys and state within the SSLChannel rather than in Selector made this
> >> code neater.
> >> 3. *Graceful shutdown of SSL connections*: Our experience was that
> >> we could encapsulate all of the logic for shutting down SSLEngine
> >> gracefully within SSLChannel when the selection key and state are owned
> >> and managed by SSLChannel.
> >> 4. *And finally a minor point:* We found that by managing selection key
> >> and selection interests within SSLChannel, protocol-independent Selector
> >> didn't need the concept of handshake at all and all channel state
> >> management and handshake related code could be held in protocol-specific
> >> classes. This may be worth taking into consideration since it makes it
> >> easier for common network layer code to be maintained without any
> >> understanding of the details of individual security protocols.
> >>
> >> The channel classes we used are included in the patch in
> >> https://issues.apache.org/jira/browse/KAFKA-1690. The patch contains
> >> unit tests to validate these scenarios as well as other buffer overflow
> >> conditions which may be useful for server-side code when the scenarios
> >> described above are implemented.
> >>
> >> Regards,
> >>
> >> Rajini
> >>
> >>
> >>
> >> On Tue, Apr 21, 2015 at 11:13 PM, Sriharsha Chintalapani <
> >> [email protected]> wrote:
> >>
> >> > Hi Jay,
> >> > Thanks for the review.
> >> >
> >> > 1. Isn't the blocking handshake going to be a performance concern? Can
> >> > we do the handshake non-blocking instead? If anything that causes
> >> > connections to drop can incur blocking network roundtrips won't that
> >> > eat up all the network threads immediately? I guess I would have to
> >> > look at that code to know...
> >> >
> >> > I have a non-blocking handshake on the server side as well as for the
> >> > new producer client. The blocking handshake is only done for
> >> > BlockingChannel.scala, and it just loops over the non-blocking
> >> > handshake until the context is established. On the server side
> >> > (SocketServer.scala), the handshake goes through its steps and returns
> >> > a "READ or WRITE" signal for the next step. For BlockingChannel the
> >> > worst case I see is the connection timeout, but most times this
> >> > handshake will finish much quicker. I am cleaning up the code and will
> >> > send a patch in the next few days.
> >> >
> >> > 2. Do we need to support blocking channel at all? That is just for the
> >> > old clients, and I think we should probably just leave those be to
> >> > reduce scope here.
> >> >
> >> > The blocking channel is used not only by the simple consumer but also
> >> > by ControllerChannelManager and controlled shutdown. Are we planning
> >> > on deprecating it? I think at least for ControllerChannelManager it
> >> > makes sense to have a blocking channel. If users want to lock down the
> >> > cluster, i.e. no PLAINTEXT channels are allowed, then all
> >> > communication has to go through either SSL or KERBEROS, so in this
> >> > case we need to add this capability to BlockingChannel.
> >> >
> >> >
> >> >
> >> > 3. Can we change the APIs to drop the getters when that is not
> >> > required by the API being implemented. In general we don't use setters
> >> > and getters as a naming convention.
> >> >
> >> > My bad on adding getters and setters :). I'll work on removing them and
> >> > change the KIP accordingly. I still need some accessor methods, though.
> >> >
> >> > Thanks,
> >> >
> >> > Harsha
> >> >
> >> >
> >> >
> >> > On April 21, 2015 at 2:51:15 PM, Jay Kreps ([email protected]) wrote:
> >> >
> >> > Hey Sriharsha,
> >> >
> >> > Thanks for the excellent write-up.
> >> >
> >> > Couple of minor questions:
> >> >
> >> > 1. Isn't the blocking handshake going to be a performance concern? Can
> >> > we do the handshake non-blocking instead? If anything that causes
> >> > connections to drop can incur blocking network roundtrips won't that
> >> > eat up all the network threads immediately? I guess I would have to
> >> > look at that code to know...
> >> >
> >> > 2. Do we need to support blocking channel at all? That is just for the
> >> > old clients, and I think we should probably just leave those be to
> >> > reduce scope here.
> >> >
> >> > 3. Can we change the APIs to drop the getters when that is not
> >> > required by the API being implemented. In general we don't use setters
> >> > and getters as a naming convention.
> >> >
> >> > The long explanation on that is that setters/getters kind of imply a
> >> > style of java programming where you have simple structs with getters
> >> > and setters for each field. In general we try to have access methods
> >> > only when necessary, and rather than setters model the full change or
> >> > action being carried out, and if possible disallow change entirely.
> >> > This is more in line with modern java style I think. We aren't perfect
> >> > in following this, but once you start with getters and setters people
> >> > start just adding them everywhere and then using them.
> >> >
> >> > -Jay
> >> >
> >> >
> >> > On Mon, Apr 20, 2015 at 10:42 AM, Sriharsha Chintalapani
> >> > <[email protected]> wrote:
> >> >
> >> > > Hi,
> >> > > I updated KIP-12 with more details. Please take a look:
> >> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=51809888
> >> > >
> >> > > Thanks,
> >> > > Harsha
> >> > >
> >> > >
> >> > > On February 11, 2015 at 10:02:43 AM, Harsha ([email protected]) wrote:
> >> > >
> >> > > Thanks Joe. It will be part of KafkaServer and will run on its own
> >> > > thread. Since each kafka server will run with a keytab we should make
> >> > > sure they are all getting renewed.
> >> > >
> >> > > On Wed, Feb 11, 2015, at 10:00 AM, Joe Stein wrote:
> >> > > > Thanks Harsha, looks good so far. How were you thinking of running
> >> > > > the KerberosTicketManager: as a standalone process, or like the
> >> > > > controller, or is it a layer of code that does the plumbing pieces
> >> > > > everywhere?
> >> > > >
> >> > > > ~ Joestein
> >> > > >
> >> > > > On Wed, Feb 11, 2015 at 12:18 PM, Harsha <[email protected]> wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > > Here is the initial proposal for sasl/kerberos implementation for
> >> > > > > kafka https://cwiki.apache.org/confluence/x/YI4WAw
> >> > > > > and JIRA https://issues.apache.org/jira/browse/KAFKA-1686. I am
> >> > > > > currently working on a prototype which will add more details to
> >> > > > > the KIP. Just opening the thread to say the work is in progress.
> >> > > > > I'll update the thread with an initial prototype patch.
> >> > > > > Thanks,
> >> > > > > Harsha
> >> > > > >
> >> > >
> >> >
> >>
> >>
> >
> >
> >
> >
> > --
> > Thank you...
> >
> > Regards,
> >
> > Rajini
> >
>
>
>
> --
> Thank you...
>
> Regards,
>
> Rajini
>
