Here is the client side in ZK: https://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java
Note how they have a special Zookeeper request API that is used to send the SASL bytes (e.g. see ZooKeeperSaslClient.sendSaslPacket). This API follows the same protocol and rpc mechanism all their other request/response types follow but it just has a simple byte[] entry for the SASL token in both the request and response.

-Jay

On Wed, Oct 1, 2014 at 9:46 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> Hey Michael,
>
> WRT question 2, I think for SASL you do need the mechanism information
> but what I was talking about was the challenge/response byte[] that is
> sent back and forth from the client to the server. My understanding is
> that SASL gives you an api for the client and server to use to produce
> these byte[]'s but doesn't actually specify any way of exchanging them
> (that is protocol specific). I could be wrong here since my knowledge
> of this stuff is pretty weak. But according to my understanding you
> must be imagining some protocol for exchanging challenge/response
> information. This protocol would have to be clearly documented for
> client implementors. What is that protocol?
>
> -Jay
>
> On Wed, Oct 1, 2014 at 2:36 PM, Michael Herstine
> <mherst...@linkedin.com.invalid> wrote:
>> Regarding question #1, I’m not sure I follow you, Joe: you’re proposing (I
>> think) that the API take a byte[], but what will be in that array? A
>> serialized certificate if the client authenticated via SSL and the
>> principal name (perhaps normalized) if the client authenticated via
>> Kerberos?
>>
>> Regarding question #2, I think I was unclear in the meeting yesterday: I
>> was proposing a separate port for each authentication method (including
>> none). That is, if a client wants no authentication, then they would
>> connect to port N on the broker. If they wanted to talk over SSL, then
>> they connect to port N+1 (say). Kerberos: N+2.
>> This would remove the need
>> for a new request, since the authentication type would be implicit in the
>> port on which the client connected (and it was my understanding that it
>> was desirable to not introduce any new messages).
>>
>> Perhaps the confusion comes from the fact, correctly pointed out by Jay,
>> that when you want to use SASL on a single port, there does of course need
>> to be a way for the incoming client to signal which mechanism it wants to
>> use, and that’s out of scope of the SASL spec. I didn’t see there being a
>> desire to add new SASL mechanisms going forward, but perhaps I was
>> incorrect?
>>
>> In any event, I’d like to suggest we keep the “open” or “no auth” port
>> separate, both to make it easy for admins to force the use of security (by
>> shutting down that port) and to avoid downgrade attacks (where an attacker
>> intercepts the opening packet from a client requesting security & alters
>> it to request none).
>>
>> I’ll update the Wiki with my notes from yesterday’s meeting this afternoon.
>>
>> Thanks,
>>
>> On 10/1/14, 9:35 AM, "Jonathan Creasy" <jonathan.cre...@turn.com> wrote:
>>
>>>This is not nearly as deep as the discussion so far, but I did want to
>>>throw this idea out there to make sure we've thought about it.
>>>
>>>The Kafka project should make sure that when deployed alongside a Hadoop
>>>cluster from any major distribution that it can tie seamlessly into the
>>>authentication and authorization used within that cluster. For example,
>>>Apache Sentry.
>>>
>>>This may present additional difficulties that mean a decision is made to
>>>not do that or alternatively the Kerberos authentication and the
>>>authorization schemes we are already working on may be sufficient.
>>>
>>>I'm not sure that anything I've read so far in this discussion actually
>>>poses a problem, but I'm an Ops guy and being able to more easily
>>>integrate more things, makes my life better.
>>>:)
>>>
>>>-Jonathan
>>>
>>>On 9/30/14, 11:26 PM, "Joe Stein" <joe.st...@stealth.ly> wrote:
>>>
>>>>inline
>>>>
>>>>On Tue, Sep 30, 2014 at 11:58 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>
>>>>> Hey Joe,
>>>>>
>>>>> For (1) what are you thinking for the PermissionManager api?
>>>>>
>>>>> The way I see it, the first question we have to answer is whether it
>>>>> is possible to make authentication and authorization independent. What
>>>>> I mean by that is whether I can write an authorization library that
>>>>> will work the same whether you authenticate with ssl or kerberos.
>>>>
>>>>
>>>>To me that is a requirement. We can't tie them together. We have to
>>>>provide the ability for authorization to work regardless of the
>>>>authentication. One *VERY* important use case is level of trust in
>>>>authentication from the authorization perspective. e.g. I authorize
>>>>"identity" based on how you authenticated.... Alice is able to view
>>>>topic X if Alice authenticated over kerberos. Bob isn't allowed to view
>>>>topic X no matter what. Alice can authenticate over not kerberos (use
>>>>cases for that) and in that case Alice wouldn't see topic X. A concrete
>>>>use case for this with Kafka would be a third party bank consuming data
>>>>to a broker. The service provider would have some kerberos local auth for
>>>>that bank to do back up that would also have access to other topics
>>>>related to that bank's data.... the bank itself over SSL wants a stream of
>>>>events (some specific topic) and that bank's identity only sees that topic.
>>>>It is important to not confuse identity, authentication and authorization.
>>>>
>>>>
>>>>> If
>>>>> so then we need to pick some subset of identity information that we
>>>>> can extract from both and have this constitute the identity we pass
>>>>> into the authorization interface. The original proposal had just the
>>>>> username/subject. But maybe we should add the ip address as well as
>>>>> that is useful.
>>>>> What I would prefer not to do is add everything in the
>>>>> certificate. I think the assumption is that you are generating these
>>>>> certificates for Kafka so you can put whatever identity info you want
>>>>> in the Subject Alternative Name. If that is true then just using that
>>>>> should be okay, right?
>>>>>
>>>>
>>>>I think we should just push the byte[] and let the plugin deal with it.
>>>>So, if we have a certificate object then pass that along with whatever
>>>>other meta data (e.g. IP address of client) we can. I don't think we
>>>>should do any parsing whatsoever and let the plugin deal with that. Any
>>>>parsing we do on the identity information for the "security object"
>>>>forces us into specific implementations and I don't see any reason to do
>>>>that... If plug-ins want an "easier" time to deal with certs and parsing
>>>>and blah blah blah then we can implement some way they can do this without
>>>>much fuss.... we also need to make sure that crypto library is pluggable
>>>>too (so we can expose an API for them to call) so that HSM can be easily
>>>>dropped in without Kafka caring... so in the plugin we could provide a
>>>>identity.getAlternativeAttribute() and then that use case is solved (and
>>>>we can use bouncy castle or whatever to parse it for them to make it
>>>>easier).... and always give them raw bytes so they could do it
>>>>themselves.
>>>>
>>>>
>>>>>
>>>>> -Jay
>>>>>
>>>>> On Tue, Sep 30, 2014 at 4:09 PM, Joe Stein <joe.st...@stealth.ly> wrote:
>>>>> > 1) We need to support the most flexibility we can and make this transparent
>>>>> > to kafka (to use Gwen's term). Any specific implementation is going to
>>>>> > make it not work with some solution stopping people from using Kafka. That
>>>>> > is a reality because everyone just does it slightly differently enough.
>>>>> > If we have an "identity" byte structure (let's not use string because some
>>>>> > security objects are bytes) this should just fall through to the
>>>>> > implementor. For certs this is the entire x509 object (not just the
>>>>> > certificate part as it could contain an ASN.1 timestamp) and inside you
>>>>> > parse and do what you want with it.
>>>>> >
>>>>> > 2) While I think there are many benefits to just the handshake approach I
>>>>> > don't think it outweighs the cons Jay expressed. a) We can't lead the
>>>>> > client libraries down a new path of interacting with Kafka. By
>>>>> > incrementally adding to the wire protocol we are directing a very clear and
>>>>> > expected approach. We already have issues with implementation even with
>>>>> > the wire protocol in place and are trying to improve that aspect of the
>>>>> > community as a whole. Let's not take a step backwards with this there...
>>>>> > also we need to not add more/different hoops to
>>>>> > debugging/administering/monitoring kafka so taking advantage (as Jay says)
>>>>> > of built in logging (etc) is important... also for the client library
>>>>> > developers too :)
>>>>> >
>>>>> > On Tue, Sep 30, 2014 at 6:44 PM, Gwen Shapira <gshap...@cloudera.com> wrote:
>>>>> >
>>>>> >> Re #1:
>>>>> >>
>>>>> >> Since the auth_to_local is a kerberos config, it's up to the admin to
>>>>> >> decide how he likes the user names and set it up properly (or leave
>>>>> >> empty) and make sure the ACLs match. Simplified names may be needed if
>>>>> >> the authorization system integrates with LDAP to get groups or
>>>>> >> something fancy like that.
>>>>> >>
>>>>> >> Note that it's completely transparent to Kafka - if the admin sets up
>>>>> >> auth_to_local rules, we simply see a different principal name. No need
>>>>> >> to do anything different.
>>>>> >>
>>>>> >> Gwen
>>>>> >>
>>>>> >> On Tue, Sep 30, 2014 at 3:31 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>>>>> >> > Current proposal is here:
>>>>> >> >
>>>>> >> > https://cwiki.apache.org/confluence/display/KAFKA/Security
>>>>> >> >
>>>>> >> > Here are the two open questions I am aware of:
>>>>> >> >
>>>>> >> > 1. We want to separate authentication and authorization. This means
>>>>> >> > permissions will be assigned to some user-like subject/entity/person
>>>>> >> > string that is independent of the authentication mechanism. It sounds
>>>>> >> > like we agreed this could be done and we had in mind some krb-specific
>>>>> >> > mangling that Gwen knew about and I think the plan was to use whatever
>>>>> >> > the user chose to put in the Subject Alternative Name of the cert for
>>>>> >> > ssl. So in both cases these would translate to a string denoting the
>>>>> >> > entity whom we are granting permissions to in the authorization layer.
>>>>> >> > We should document these in the wiki to get feedback on them.
>>>>> >> >
>>>>> >> > The Hadoop approach to extraction was something like this:
>>>>> >> >
>>>>> >> > http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.1/bk_installing_manually_book/content/rpm-chap14-2-3-1.html
>>>>> >> >
>>>>> >> > But actually I'm not sure if just using the full kerberos principal is
>>>>> >> > so bad? I.e. having the user be jenni...@athena.mit.edu versus just
>>>>> >> > jennifer. Where this would make a difference would be in a case where
>>>>> >> > you wanted the same user/entity to be able to authenticate via
>>>>> >> > different mechanisms (Hadoop auth, kerberos, ssl) and have a single
>>>>> >> > set of permissions.
>>>>> >> >
>>>>> >> > 2. For SASL/Kerberos we need to figure out how the communication
>>>>> >> > between client and server will be handled to pass the
>>>>> >> > challenge/response byte[]. I.e.
>>>>> >> >
>>>>> >> > http://docs.oracle.com/javase/7/docs/api/javax/security/sasl/SaslClient.html#evaluateChallenge(byte[])
>>>>> >> >
>>>>> >> > http://docs.oracle.com/javase/7/docs/api/javax/security/sasl/SaslServer.html#evaluateResponse(byte[])
>>>>> >> >
>>>>> >> > I am not super expert in this area but I will try to give my
>>>>> >> > understanding and I'm sure someone can correct me if I am confused.
>>>>> >> >
>>>>> >> > Unlike SSL the transmission of this is actually outside the scope of
>>>>> >> > SASL so we have to specify this. Two proposals
>>>>> >> >
>>>>> >> > Original Proposal: Add a new "authenticate" request/response
>>>>> >> >
>>>>> >> > The proposal in the original wiki was to add a new "authenticate"
>>>>> >> > request/response to pass this information. This matches what was done
>>>>> >> > in the kerberos implementation for zookeeper. The intention is that
>>>>> >> > the client would send this request immediately after establishing a
>>>>> >> > connection, in which case it acts much like a "handshake", however
>>>>> >> > there is no requirement that they do so.
>>>>> >> >
>>>>> >> > Whether the authentication happens via SSL or via Kerberos, the effect
>>>>> >> > will just be to set the username in their session. This will default
>>>>> >> > to the "anybody" user. So in the default non-secure case we will just
>>>>> >> > be defaulting "anybody" to have full permission. So to answer the
>>>>> >> > question about whether changing user is required or not, I don't think
>>>>> >> > it is but I think we kind of get it for free in this approach.
>>>>> >> >
>>>>> >> > In this approach there is no particular need or advantage to having a
>>>>> >> > separate port for kerberos I don't think.
>>>>> >> >
>>>>> >> > Alternate Proposal: Create a Handshake
>>>>> >> >
>>>>> >> > The alternative I think Michael was proposing was to create a
>>>>> >> > handshake that would happen at connection time on connections coming
>>>>> >> > in on the SASL port. This would require a separate port for SASL since
>>>>> >> > otherwise you wouldn't be able to tell if the bytes you were getting
>>>>> >> > were for SASL or were the first request of an unauthenticated
>>>>> >> > connection.
>>>>> >> >
>>>>> >> > Michael it would be good to work out the details of how this works.
>>>>> >> > Are we just sending size-delimited byte arrays back and forth until
>>>>> >> > the challenge response terminates?
>>>>> >> >
>>>>> >> > My Take
>>>>> >> >
>>>>> >> > The pro I see for Michael's proposal is that it keeps the
>>>>> >> > authentication logic more localized in the socket server.
>>>>> >> >
>>>>> >> > I see two cons:
>>>>> >> > 1. Since the handshake won't go through the normal api layer it won't
>>>>> >> > go through the normal logging (e.g. request log), jmx monitoring,
>>>>> >> > client trace token, correlation id, etc that we get for other
>>>>> >> > requests. This could make operations a little confusing and make
>>>>> >> > debugging a little harder since the client will be blocking on network
>>>>> >> > requests without the normal logging.
>>>>> >> > 2. This part of the protocol will be inconsistent with the rest of the
>>>>> >> > Kafka protocol so it will be a little odd for client implementors as
>>>>> >> > this will effectively be a request/response that they will have to
>>>>> >> > implement that will be different from all the other request/responses
>>>>> >> > they implement.
>>>>> >> >
>>>>> >> > In practice these two alternatives are not very different except that
>>>>> >> > in the original proposal the bytes you send are prefixed by the normal
>>>>> >> > request header fields such as the client id, correlation id, etc.
>>>>> >> > Overall I would prefer the original proposal as I think it is a bit
>>>>> >> > more consistent from the client's point of view.
>>>>> >> >
>>>>> >> > Cheers,
>>>>> >> >
>>>>> >> > -Jay
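
For readers following the framing question in this thread, here is a minimal Java sketch of the two wire shapes being compared: a bare size-delimited SASL token (the handshake proposal) and the same token prefixed by request header fields (the "authenticate" request proposal). This is not Kafka or ZooKeeper code. The class name SaslTokenFraming, the field order, and the 4-byte and 2-byte length encodings are all assumptions made for illustration; the thread does not define any of them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical framing helpers; names and field layout are illustrative assumptions. */
final class SaslTokenFraming {
    private SaslTokenFraming() {}

    /** Handshake style: 4-byte big-endian length followed by the raw SASL token. */
    static byte[] frameBare(byte[] token) {
        ByteBuffer buf = ByteBuffer.allocate(4 + token.length);
        buf.putInt(token.length);
        buf.put(token);
        return buf.array();
    }

    /** Reads one size-delimited token starting at the buffer's current position. */
    static byte[] readBare(ByteBuffer in) {
        int size = in.getInt();
        if (size < 0 || size > in.remaining()) {
            throw new IllegalArgumentException("invalid token size: " + size);
        }
        byte[] token = new byte[size];
        in.get(token);
        return token;
    }

    /**
     * Request style (assumed layout): correlation id, client id as a 2-byte
     * length plus UTF-8 bytes, then the same size-delimited token as above.
     */
    static byte[] frameWithHeader(int correlationId, String clientId, byte[] token) {
        byte[] id = clientId.getBytes(StandardCharsets.UTF_8);
        if (id.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException("client id too long");
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + 2 + id.length + 4 + token.length);
        buf.putInt(correlationId);
        buf.putShort((short) id.length);
        buf.put(id);
        buf.putInt(token.length);
        buf.put(token);
        return buf.array();
    }

    /** Skips the assumed header fields and returns the SASL token that follows. */
    static byte[] readTokenAfterHeader(ByteBuffer in) {
        in.getInt();                          // correlation id (ignored here)
        int idLength = in.getShort();         // client id length
        in.position(in.position() + idLength); // skip the client id bytes
        return readBare(in);
    }

    public static void main(String[] args) {
        byte[] token = {0x01, 0x02, 0x03};
        byte[] bare = frameBare(token);
        byte[] withHeader = frameWithHeader(7, "client-a", token);
        System.out.println("bare=" + bare.length + " bytes, withHeader="
                + withHeader.length + " bytes");
    }
}
```

In both shapes the SASL bytes themselves are identical; the only difference is whether the usual header fields travel in front of them, which is the point Jay makes about the two alternatives being close in practice.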