Mark Grover created SPARK-18535:
-----------------------------------

             Summary: Redact sensitive information from Spark logs and UI
                 Key: SPARK-18535
                 URL: https://issues.apache.org/jira/browse/SPARK-18535
             Project: Spark
          Issue Type: Bug
          Components: Web UI, YARN
    Affects Versions: 2.1.0
            Reporter: Mark Grover


A Spark user may have to provide sensitive information for a Spark 
configuration property, or source an environment variable in the executor 
or driver environment that contains sensitive information. A good example of 
this would be when reading/writing data from/to S3 using Spark. The S3 secret 
and S3 access key can be placed in a [hadoop credential 
provider|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html].
 However, one still needs to provide the password for the credential provider 
to Spark, which is typically supplied as an environment variable to the driver 
and executor environments. This environment variable shows up in logs, and may 
also show up in the UI.

1. For logs, it shows up in a few places:
  1A. Event logs under {{SparkListenerEnvironmentUpdate}} event.
  1B. YARN logs, when printing the executor launch context.
2. For the UI, it shows up in the _Environment_ tab, but it is redacted only if 
the property name contains the word "password" or "secret". Moreover, these magic 
words are 
[hardcoded|https://github.com/apache/spark/blob/a2d464770cd183daa7d727bf377bde9c21e29e6a/core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala#L30]
 and hence not customizable.

This JIRA is to track the work to make sure sensitive information is redacted 
from all logs and UIs in Spark, while still being passed on to all relevant 
places it needs to get passed on to.
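One possible shape for such redaction is a small helper that masks the values of 
any key/value pairs whose key matches a user-configurable regex, instead of the 
hardcoded "password"/"secret" check. This is only an illustrative sketch: the 
{{Redactor}} object, the {{spark.redaction.regex}} setting name, and the 
replacement string are hypothetical, not an existing Spark API.

```scala
import scala.util.matching.Regex

// Hypothetical sketch of configurable redaction. Nothing here is an
// existing Spark API; names are illustrative only.
object Redactor {
  val RedactionReplacement = "*********(redacted)"

  // `pattern` would come from a user-settable property (e.g. a hypothetical
  // "spark.redaction.regex") rather than hardcoded words like "password".
  def redact(pattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] =
    kvs.map { case (key, value) =>
      // Mask the value whenever the key matches the redaction pattern.
      if (pattern.findFirstIn(key).isDefined) (key, RedactionReplacement)
      else (key, value)
    }
}
```

The same helper could then be applied at every point where configs and 
environment variables are emitted: the {{SparkListenerEnvironmentUpdate}} event 
logs, the YARN executor launch context, and the _Environment_ tab.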



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
