Steve Loughran created SPARK-40567:
--------------------------------------

             Summary: SharedState to redact secrets when propagating them to 
HadoopConf
                 Key: SPARK-40567
                 URL: https://issues.apache.org/jira/browse/SPARK-40567
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Steve Loughran



When SharedState propagates (key, value) pairs from initialConfigs to 
HadoopConf, it logs the values at debug level.

If the config contains secrets (cloud credentials, etc.), the log will contain 
them.

The org.apache.hadoop.conf.ConfigRedactor class will redact values of all keys 
matching a pattern in "hadoop.security.sensitive-config-keys"; this is 
configured by default to be:


{code}
  "secret$",
  "password$",
  "ssl.keystore.pass$",
  "fs.s3.*[Ss]ecret.?[Kk]ey",
  "fs.s3a.*.server-side-encryption.key",
  "fs.s3a.encryption.algorithm",
  "fs.s3a.encryption.key",
  "fs.azure\\.account.key.*",
  "credential$",
  "oauth.*secret",
  "oauth.*password",
  "oauth.*token",
  "hadoop.security.sensitive-config-keys"
{code}

...and the list may be extended in site configs or in future Hadoop releases.

Spark should use the redactor when logging these values, for log hygiene and security.
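
To illustrate the idea, here is a minimal, self-contained sketch of regex-based redaction applied before a value is logged. It is not Spark's or Hadoop's code: the class RedactionSketch, its redact method, the log text, the "<redacted>" placeholder and the sample config entries are all hypothetical, and only a subset of the default patterns listed above is used. A real fix would presumably delegate to org.apache.hadoop.conf.ConfigRedactor so that site-specific patterns are honoured.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical stand-in for a config redactor; not the Hadoop implementation.
final class RedactionSketch {

    // Assumption: a subset of the default patterns quoted in this issue.
    private static final List<Pattern> SENSITIVE_KEY_PATTERNS = List.of(
            "secret$",
            "password$",
            "fs.s3.*[Ss]ecret.?[Kk]ey",
            "credential$",
            "oauth.*token")
        .stream()
        .map(Pattern::compile)
        .collect(Collectors.toList());

    // Assumption: placeholder text chosen for this sketch.
    static final String REDACTED = "<redacted>";

    // Returns the placeholder if any pattern is found in the key,
    // otherwise returns the value unchanged.
    static String redact(String key, String value) {
        for (Pattern pattern : SENSITIVE_KEY_PATTERNS) {
            if (pattern.matcher(key).find()) {
                return REDACTED;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Made-up entries standing in for initialConfigs.
        Map<String, String> initialConfigs = new LinkedHashMap<>();
        initialConfigs.put("fs.s3a.secret.key", "not-a-real-secret");
        initialConfigs.put("fs.s3a.endpoint", "s3.example.org");

        // Log the redacted value rather than the raw one.
        initialConfigs.forEach((key, value) ->
            System.out.println("Propagating to HadoopConf: " + key + " -> " + redact(key, value)));
    }
}
```

The point of the sketch is that only the logged string changes: the unredacted value would still be the one set on the Hadoop configuration.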





