gaborgsomogyi opened a new pull request #27637: [SPARK-30874][SQL] Support 
Postgres Kerberos login in JDBC connector
URL: https://github.com/apache/spark/pull/27637
 
 
   ### What changes were proposed in this pull request?
   When loading DataFrames from a JDBC data source that requires Kerberos 
authentication, remote executors (yarn-client, yarn-cluster, etc.) fail to 
establish a connection because they have no Kerberos ticket and no way to obtain one.
   
   This is a real issue when trying to ingest data from kerberized data sources 
(SQL Server, Oracle) in enterprise environments where exposing simple 
authentication access is not an option due to IT policy.
   
   In this PR I've added Postgres support (other supported databases will come 
in later PRs).
   
   What this PR contains:
   * Added `keytab` and `principal` JDBC options
   * Added `ConnectionProvider` trait and its implementations:
     * `BasicConnectionProvider` => unsecured connection
     * `PostgresConnectionProvider` => Postgres secure connection
   * Added `ConnectionProvider` tests
   * Added `PostgresKrbIntegrationSuite` docker integration test
   * Created `SecurityUtils` to centralize reusable security-related 
functionality
   * Documentation
   
   ### Why are the changes needed?
   The JDBC connector currently lacks Kerberos support.
   
   ### Does this PR introduce any user-facing change?
   Yes, two additional JDBC options are added:
   * `keytab`
   * `principal`
   
   If both are provided, Spark performs Kerberos authentication.
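
As a rough illustration of how the two options might be supplied from PySpark, here is a minimal sketch. The helper `jdbc_options`, the URL, the table name, the keytab path, and the principal are all hypothetical placeholders, not part of this PR. Rejecting a lone `keytab` or `principal` is this helper's own choice; the description above only states what happens when both are provided.

```python
from typing import Dict, Optional


def jdbc_options(url: str, dbtable: str,
                 keytab: Optional[str] = None,
                 principal: Optional[str] = None) -> Dict[str, str]:
    """Build a JDBC option map, adding the Kerberos options only as a pair."""
    opts = {"url": url, "dbtable": dbtable}
    # Assumption of this sketch (not documented Spark behaviour):
    # supplying only one of the two options is treated as a mistake.
    if (keytab is None) != (principal is None):
        raise ValueError("keytab and principal must be given together")
    if keytab is not None and principal is not None:
        opts["keytab"] = keytab
        opts["principal"] = principal
    return opts


# Hypothetical connection details, for illustration only.
opts = jdbc_options(
    url="jdbc:postgresql://db.example.com:5432/sales",
    dbtable="public.orders",
    keytab="/etc/security/keytabs/spark.keytab",
    principal="spark/host.example.com@EXAMPLE.COM",
)

# With an active SparkSession named `spark`, the map could be passed as:
# df = spark.read.format("jdbc").options(**opts).load()
```

Omitting both `keytab` and `principal` leaves the option map without any Kerberos settings.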
   
   ### How was this patch tested?
   To demonstrate the functionality with a standalone application I've created 
this repository: https://github.com/gaborgsomogyi/docker-kerberos
   
   * Additional + existing unit tests
   * Additional docker integration test
   * Manual test on a cluster
   * `SKIP_API=1 jekyll build`
   
