Hi all,

We currently have a new direct stream connector, thanks to work by Cody and others on SPARK-12177.
However, that can't be used in secure clusters that require Kerberos authentication. That's because Kafka currently doesn't support delegation tokens (KAFKA-1696 <https://issues.apache.org/jira/browse/KAFKA-1696>). Unfortunately, very little work has been done on that JIRA, so, in my opinion, folks who want to use secure Kafka (using Kerberos, which is the norm) can't do so because Spark Streaming can't consume from it today.

The right way is, of course, to get delegation tokens into Kafka, but honestly I don't know if that's happening in the near future. I am wondering if we should consider something to remedy this. For example, we could come up with a receiver-based connector, built on the new Kafka consumer API, that would support Kerberos authentication. It won't require delegation tokens, since there's only a very small number of executors talking to Kafka. Of course, anyone who cares about high throughput and the other benefits of the direct connector would still have to use the direct connector.

Another thing we could do is ship the keytab to the executors in the direct connector, so that delegation tokens are not required. But that would be a pretty compromising solution, and I'd prefer not to do that.

What do folks think? Would love to hear your thoughts, especially about the receiver.

Thanks!
Mark