[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r759389734 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProviderSuite.scala ## @@ -68,12 +69,20 @@ class ConnectionProviderSuite override def canHandle(driver: Driver, options: Map[String, String]): Boolean = true override def getConnection(driver: Driver, options: Map[String, String]): Connection = throw new RuntimeException() + override def needsModifySecurityConfiguration( Review comment: I'm definitely open to this, but defer to the community on whether that is appropriate or not. It's worth noting that a default of `false` seems like the right answer to me, but Spark versions 3.1+ implicitly have a default of `true`. I think a default of `false` is sensible, but it would change what is currently the default behavior. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758587271 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends JdbcConnectionProvider with logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and properties: $properties") driver.connect(jdbcOptions.url, properties) } + + override def needsModifySecurityConfiguration( +driver: Driver, +options: Map[String, String] + ): Boolean = { +val jdbcOptions = new JDBCOptions(options) +jdbcOptions.keytab != null && jdbcOptions.principal != null Review comment: This could also just return `false` (based on the logic in `canHandle`), but I'm kind of confused about what's going on here. [This doc](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md#:~:text=Built%2Din%20CPs%20added%20which%20support%20kerberos%20authentication%20using%20keytab%20and%20principal) suggests that built-in CPs support kerberos auth, but if I understand correctly that should require a `keytab` and `principal`. I'm inclined to believe that this `JdbcConnectionProvider` doesn't actually support kerberos auth, but maybe I'm misunderstanding. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758587271 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends JdbcConnectionProvider with logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and properties: $properties") driver.connect(jdbcOptions.url, properties) } + + override def needsModifySecurityConfiguration( +driver: Driver, +options: Map[String, String] + ): Boolean = { +val jdbcOptions = new JDBCOptions(options) +jdbcOptions.keytab != null && jdbcOptions.principal != null Review comment: This could also just return `false`, but I'm kind of confused about what's going on here. [This doc](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md#:~:text=Built%2Din%20CPs%20added%20which%20support%20kerberos%20authentication%20using%20keytab%20and%20principal) suggests that built-in CPs support kerberos auth, but if I understand correctly that should require a `keytab` and `principal`. I'm inclined to believe that this `JdbcConnectionProvider` doesn't actually support kerberos auth, but maybe I'm misunderstanding. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758510360 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends JdbcConnectionProvider with logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and properties: $properties") driver.connect(jdbcOptions.url, properties) } + + override def needsModifySecurityConfiguration( +driver: Driver, +options: Map[String, String] + ): Boolean = { +val jdbcOptions = new JDBCOptions(options) +jdbcOptions.keytab == null || jdbcOptions.principal == null Review comment: This could also just return `false`, but I'm kind of confused about what's going on here. [This doc](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md#:~:text=Built%2Din%20CPs%20added%20which%20support%20kerberos%20authentication%20using%20keytab%20and%20principal) suggests that built-in CPs support kerberos auth, but if I understand correctly that should require a `keytab` and ` principal`. I'm inclined to believe that this `JdbcConnectionProvider` doesn't actually support kerberos auth, but maybe I'm misunderstanding. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758551435 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends JdbcConnectionProvider with logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and properties: $properties") driver.connect(jdbcOptions.url, properties) } + + override def needsModifySecurityConfiguration( +driver: Driver, +options: Map[String, String] + ): Boolean = { +val jdbcOptions = new JDBCOptions(options) +jdbcOptions.keytab == null || jdbcOptions.principal == null Review comment: Oops, just realized while driving that I need to invert this check. Well fix when I'm back at the keyboard. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758551435 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends JdbcConnectionProvider with logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and properties: $properties") driver.connect(jdbcOptions.url, properties) } + + override def needsModifySecurityConfiguration( +driver: Driver, +options: Map[String, String] + ): Boolean = { +val jdbcOptions = new JDBCOptions(options) +jdbcOptions.keytab == null || jdbcOptions.principal == null Review comment: Oops, just realized while driving that I need to invert this check. Well fix when I'm back at the keyboard. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758512327 ## File path: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcConnectionProvider.scala ## @@ -53,12 +53,25 @@ abstract class JdbcConnectionProvider { def canHandle(driver: Driver, options: Map[String, String]): Boolean /** - * Opens connection toward the database. Since global JVM security configuration change may needed - * this API is called synchronized by `SecurityConfigurationLock` to avoid race. + * Opens connection to the database. Since global JVM security configuration change may be + * needed this API is called synchronized by `SecurityConfigurationLock` to avoid race when + * `needsModifySecurityConfiguration` returns true for the given driver with the given options. * * @param driver Java driver which initiates the connection * @param options Driver options which initiates the connection * @return a `Connection` object that represents a connection to the URL */ def getConnection(driver: Driver, options: Map[String, String]): Connection + + /** + * Checks if this connection provider instance needs to modify global security configuration to + * handle authentication and thus should synchronize access to the security configuration while + * the given driver is initiating a connection with the given options. + * + * @param driver Java driver which initiates the connection + * @param options Driver options which initiates the connection + * @return True if the connection provider will need to modify the security configuration when + * initiating a connection with the given driver with the given options. + */ + def needsModifySecurityConfiguration(driver: Driver, options: Map[String, String]): Boolean Review comment: Also open to a default implementation, but seemed better to force implementers to explicitly implement. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758511777 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProviderSuite.scala ## @@ -107,12 +124,50 @@ class ConnectionProviderSuite assert(providerBase.create(mock[Driver], Map.empty, Some("test2")).isInstanceOf[Connection]) } + test("Synchronize on SecurityConfigurationLock when the specified connection provider needs") { +val provider1 = new JdbcConnectionProvider() { + override val name: String = "test1" + override def canHandle(driver: Driver, options: Map[String, String]): Boolean = true + override def getConnection(driver: Driver, options: Map[String, String]): Connection = { +assert(Thread.holdsLock(SecurityConfigurationLock)) Review comment: Open to better ideas on testing that the lock was acquired or not. Could change it to an actual lock, but preferred to avoid the downstream repercussions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758510360 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala ## @@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends JdbcConnectionProvider with logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and properties: $properties") driver.connect(jdbcOptions.url, properties) } + + override def needsModifySecurityConfiguration( +driver: Driver, +options: Map[String, String] + ): Boolean = { +val jdbcOptions = new JDBCOptions(options) +jdbcOptions.keytab == null || jdbcOptions.principal == null Review comment: This could also just return `false`, but I'm kind of confused about what's going on here. [This doc](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md#:~:text=Built%2Din%20CPs%20added%20which%20support%20kerberos%20authentication%20using%20keytab%20and%20principal) suggests that built-in CPs support kerberos auth, but if I understand correctly that should require a `keytab` and ` principal`. I'm inclined to believe that this `JdbcConnectionProvider` doesn't actually support kerberos auth, but maybe I'm misunderstanding. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock
tdg5 commented on a change in pull request #34745: URL: https://github.com/apache/spark/pull/34745#discussion_r758508360 ## File path: sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcConnectionProvider.scala ## @@ -53,12 +53,25 @@ abstract class JdbcConnectionProvider { def canHandle(driver: Driver, options: Map[String, String]): Boolean /** - * Opens connection toward the database. Since global JVM security configuration change may needed - * this API is called synchronized by `SecurityConfigurationLock` to avoid race. + * Opens connection to the database. Since global JVM security configuration change may be + * needed this API is called synchronized by `SecurityConfigurationLock` to avoid race when + * `needsModifySecurityConfiguration` returns true for the given driver with the given options. * * @param driver Java driver which initiates the connection * @param options Driver options which initiates the connection * @return a `Connection` object that represents a connection to the URL */ def getConnection(driver: Driver, options: Map[String, String]): Connection + + /** + * Checks if this connection provider instance needs to modify global security configuration to + * handle authentication and thus should synchronize access to the security configuration while + * the given driver is initiating a connection with the given options. + * + * @param driver Java driver which initiates the connection + * @param options Driver options which initiates the connection + * @return True if the connection provider will need to modify the security configuration when + * initiating a connection with the given driver with the given options. + */ + def needsModifySecurityConfiguration(driver: Driver, options: Map[String, String]): Boolean Review comment: I'm open to other naming ideas. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org