Github user ifilonenko commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22915#discussion_r230134430
  
    --- Diff: docs/security.md ---
    @@ -798,6 +782,50 @@ achieved by setting 
`spark.kubernetes.hadoop.configMapName` to a pre-existing Co
         local:///opt/spark/examples/jars/spark-examples_<VERSION>.jar \
         <HDFS_FILE_LOCATION>
     ```
    +
    +## Long-Running Applications
    +
    +Long-running applications may run into issues if their run time exceeds 
the maximum delegation
    +token lifetime configured in services it needs to access.
    +
    +Spark supports automatically creating new tokens for these applications 
when running in YARN, Mesos, and Kubernetes modes.
    +If one wishes to launch the renewal thread in the Driver, Kerberos 
credentials need to be provided to the Spark application
    +via the `spark-submit` command, using the `--principal` and `--keytab` 
parameters.
    +
    +The provided keytab will be copied over to the machine running the 
Application Master via the Hadoop
    +Distributed Cache. For this reason, it's strongly recommended that both 
YARN and HDFS be secured
    +with encryption, at least.
    +
    +The Kerberos login will be periodically renewed using the provided 
credentials, and new delegation
    +tokens for supported will be created.
    +
    +#### Long-Running Kerberos in Kubernetes
    +
    +This section addresses the additional feature added uniquely to 
Kubernetes. If you are running an external token service
    --- End diff --
    
    The problem is that such a service can run in a variety of ways, so I 
thought it was a matter of defining what the resulting secret would look like. 
We wrote an example external service in our deprecated fork to show how such a 
service would function: 
https://github.com/apache-spark-on-k8s/spark/pull/453. In essence, using a 
service keytab, it should acquire delegation tokens bound to the job user's 
principal and place the contents in the secret as a new data-item. For us 
internally, and for other companies running their own external renewal 
services, implementations may vary, but I just want to have a 
well-defined spec of the resulting secret, so I am just experimenting with a 
WIP spec below. 
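
To make the "new data-item" idea concrete, here is a rough sketch of how a renewal service might append a token to the secret's data map and how a consumer might pick the newest one. This is not the spec itself: the `hadoop-tokens-` key prefix, the millisecond timestamp suffix, and both function names are assumptions made up for illustration. The only fixed part is that Kubernetes secret data values are base64-encoded.

```python
import base64
import time

# Hypothetical key prefix for token data-items; the real name would come from the spec.
TOKEN_ITEM_PREFIX = "hadoop-tokens-"


def add_token_item(secret_data, token_bytes, issued_at_ms=None):
    """Return a copy of the secret's data map with one new token data-item added."""
    if issued_at_ms is None:
        issued_at_ms = int(time.time() * 1000)
    updated = dict(secret_data)  # leave the caller's map untouched
    key = f"{TOKEN_ITEM_PREFIX}{issued_at_ms}"
    # Kubernetes stores secret data values as base64-encoded strings.
    updated[key] = base64.b64encode(token_bytes).decode("ascii")
    return updated


def latest_token_item(secret_data):
    """Pick the newest token data-item by the timestamp embedded in its key."""
    items = [
        (int(key[len(TOKEN_ITEM_PREFIX):]), key)
        for key in secret_data
        if key.startswith(TOKEN_ITEM_PREFIX)
    ]
    if not items:
        return None
    _, newest_key = max(items)  # numeric comparison, not lexicographic
    return newest_key, base64.b64decode(secret_data[newest_key])
```

An external service would call something like `add_token_item` after acquiring fresh tokens with its keytab, and the driver side would call something like `latest_token_item` when the secret changes. Keeping older items in place rather than overwriting them is one possible design choice, not a requirement.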
    
    However, it clearly seems necessary to define how such a service should 
function as well. Would that be sufficient? Sadly, that would still be a bit 
hand-wavy. 


---
