Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23195#discussion_r238399535
  
    --- Diff: docs/structured-streaming-kafka-integration.md ---
    @@ -624,3 +624,56 @@ For experimenting on `spark-shell`, you can also use 
`--packages` to add `spark-
     
     See [Application Submission Guide](submitting-applications.html) for more 
details about submitting
     applications with external dependencies.
    +
    +## Security
    +
    +Kafka 0.9.0.0 introduced several features that increases security in a 
cluster. For detailed
    +description about these possibilities, see [Kafka security 
docs](http://kafka.apache.org/documentation.html#security).
    +
    +It's worth noting that security is optional and turned off by default.
    +
    +Spark supports the following ways to authenticate against Kafka cluster:
    +- **Delegation token (introduced in Kafka broker 1.1.0)**: This way the 
application can be configured
    +  via Spark parameters and may not need JAAS login configuration (Spark 
can use Kafka's dynamic JAAS
    +  configuration feature). For further information about delegation tokens, 
see
    +  [Kafka delegation token 
docs](http://kafka.apache.org/documentation/#security_delegation_token).
    +
    +  The process is initiated by Spark's Kafka delegation token provider. 
This is enabled by default
    +  but can be turned off with `spark.security.credentials.kafka.enabled`. 
When
    +  `spark.kafka.bootstrap.servers` set Spark looks for authentication 
information in the following
    +  order and choose the first available to log in:
    +  - **JAAS login configuration**
    +  - **Keytab file**, such as,
    +
    +        ./bin/spark-submit \
    +            --keytab <KEYTAB_FILE> \
    +            --principal <PRINCIPAL> \
    +            --conf spark.kafka.bootstrap.servers=<KAFKA_SERVERS> \
    +            ...
    +
    +  - **Kerberos credential cache**, such as,
    +
    +        ./bin/spark-submit \
    +            --conf spark.kafka.bootstrap.servers=<KAFKA_SERVERS> \
    +            ...
    +
    +  Spark supports the following authentication protocols to obtain token:
    --- End diff --
    
    This must match the Kafka broker's config, right? So it's not really "Spark 
supports", but that Spark's configuration must match the Kafka config, right? 
If that's the case, then explaining each option here is not really that 
helpful, since the Kafka admin is the one who should care.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to