[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-30 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r759389734



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProviderSuite.scala
##
@@ -68,12 +69,20 @@ class ConnectionProviderSuite
   override def canHandle(driver: Driver, options: Map[String, String]): 
Boolean = true
   override def getConnection(driver: Driver, options: Map[String, 
String]): Connection =
 throw new RuntimeException()
+  override def needsModifySecurityConfiguration(

Review comment:
   I'm definitely open to this, but defer to the community on whether that 
is appropriate or not.
   
   It's worth noting that a default of `false` seems like the right answer to 
me, but Spark versions 3.1+ implicitly have a default of `true`. 
   
   I think a default of `false` is sensible, but it would change what is 
currently the default behavior.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758587271



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala
##
@@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends 
JdbcConnectionProvider with
 logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and 
properties: $properties")
 driver.connect(jdbcOptions.url, properties)
   }
+
+  override def needsModifySecurityConfiguration(
+driver: Driver,
+options: Map[String, String]
+  ): Boolean = {
+val jdbcOptions = new JDBCOptions(options)
+jdbcOptions.keytab != null && jdbcOptions.principal != null

Review comment:
   This could also just return `false` (based on the logic in `canHandle`), 
but I'm kind of confused about what's going on here. [This 
doc](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md#:~:text=Built%2Din%20CPs%20added%20which%20support%20kerberos%20authentication%20using%20keytab%20and%20principal)
 suggests that built-in CPs support kerberos auth, but if I understand 
correctly that should require a `keytab` and `principal`. I'm inclined to 
believe that this `JdbcConnectionProvider` doesn't actually support kerberos 
auth, but maybe I'm misunderstanding.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758587271



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala
##
@@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends 
JdbcConnectionProvider with
 logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and 
properties: $properties")
 driver.connect(jdbcOptions.url, properties)
   }
+
+  override def needsModifySecurityConfiguration(
+driver: Driver,
+options: Map[String, String]
+  ): Boolean = {
+val jdbcOptions = new JDBCOptions(options)
+jdbcOptions.keytab != null && jdbcOptions.principal != null

Review comment:
   This could also just return `false`, but I'm kind of confused about 
what's going on here. [This 
doc](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md#:~:text=Built%2Din%20CPs%20added%20which%20support%20kerberos%20authentication%20using%20keytab%20and%20principal)
 suggests that built-in CPs support kerberos auth, but if I understand 
correctly that should require a `keytab` and `principal`. I'm inclined to 
believe that this `JdbcConnectionProvider` doesn't actually support kerberos 
auth, but maybe I'm misunderstanding.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758510360



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala
##
@@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends 
JdbcConnectionProvider with
 logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and 
properties: $properties")
 driver.connect(jdbcOptions.url, properties)
   }
+
+  override def needsModifySecurityConfiguration(
+driver: Driver,
+options: Map[String, String]
+  ): Boolean = {
+val jdbcOptions = new JDBCOptions(options)
+jdbcOptions.keytab == null || jdbcOptions.principal == null

Review comment:
   This could also just return `false`, but I'm kind of confused about 
what's going on here. [This 
doc](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md#:~:text=Built%2Din%20CPs%20added%20which%20support%20kerberos%20authentication%20using%20keytab%20and%20principal)
 suggests that built-in CPs support kerberos auth, but if I understand 
correctly that should require a `keytab` and ` principal`. I'm inclined to 
believe that this `JdbcConnectionProvider` doesn't actually support kerberos 
auth, but maybe I'm misunderstanding.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758551435



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala
##
@@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends 
JdbcConnectionProvider with
 logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and 
properties: $properties")
 driver.connect(jdbcOptions.url, properties)
   }
+
+  override def needsModifySecurityConfiguration(
+driver: Driver,
+options: Map[String, String]
+  ): Boolean = {
+val jdbcOptions = new JDBCOptions(options)
+jdbcOptions.keytab == null || jdbcOptions.principal == null

Review comment:
   Oops, just realized while driving that I need to invert this check. Well 
fix when I'm back at the keyboard.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758551435



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala
##
@@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends 
JdbcConnectionProvider with
 logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and 
properties: $properties")
 driver.connect(jdbcOptions.url, properties)
   }
+
+  override def needsModifySecurityConfiguration(
+driver: Driver,
+options: Map[String, String]
+  ): Boolean = {
+val jdbcOptions = new JDBCOptions(options)
+jdbcOptions.keytab == null || jdbcOptions.principal == null

Review comment:
   Oops, just realized while driving that I need to invert this check. Well 
fix when I'm back at the keyboard.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758512327



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcConnectionProvider.scala
##
@@ -53,12 +53,25 @@ abstract class JdbcConnectionProvider {
   def canHandle(driver: Driver, options: Map[String, String]): Boolean
 
   /**
-   * Opens connection toward the database. Since global JVM security 
configuration change may needed
-   * this API is called synchronized by `SecurityConfigurationLock` to avoid 
race.
+   * Opens connection to the database. Since global JVM security configuration 
change may be
+   * needed this API is called synchronized by `SecurityConfigurationLock` to 
avoid race when
+   * `needsModifySecurityConfiguration` returns true for the given driver with 
the given options.
*
* @param driver  Java driver which initiates the connection
* @param options Driver options which initiates the connection
* @return a `Connection` object that represents a connection to the URL
*/
   def getConnection(driver: Driver, options: Map[String, String]): Connection
+
+  /**
+   * Checks if this connection provider instance needs to modify global 
security configuration to
+   * handle authentication and thus should synchronize access to the security 
configuration while
+   * the given driver is initiating a connection with the given options.
+   *
+   * @param driver  Java driver which initiates the connection
+   * @param options Driver options which initiates the connection
+   * @return True if the connection provider will need to modify the security 
configuration when
+   * initiating a connection with the given driver with the given options.
+   */
+  def needsModifySecurityConfiguration(driver: Driver, options: Map[String, 
String]): Boolean

Review comment:
   Also open to a default implementation, but seemed better to force 
implementers to explicitly implement.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758511777



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProviderSuite.scala
##
@@ -107,12 +124,50 @@ class ConnectionProviderSuite
 assert(providerBase.create(mock[Driver], Map.empty, 
Some("test2")).isInstanceOf[Connection])
   }
 
+  test("Synchronize on SecurityConfigurationLock when the specified connection 
provider needs") {
+val provider1 = new JdbcConnectionProvider() {
+  override val name: String = "test1"
+  override def canHandle(driver: Driver, options: Map[String, String]): 
Boolean = true
+  override def getConnection(driver: Driver, options: Map[String, 
String]): Connection = {
+assert(Thread.holdsLock(SecurityConfigurationLock))

Review comment:
   Open to better ideas on testing that the lock was acquired or not. Could 
change it to an actual lock, but preferred to avoid the downstream 
repercussions.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758510360



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/BasicConnectionProvider.scala
##
@@ -48,4 +48,12 @@ private[jdbc] class BasicConnectionProvider extends 
JdbcConnectionProvider with
 logDebug(s"JDBC connection initiated with URL: ${jdbcOptions.url} and 
properties: $properties")
 driver.connect(jdbcOptions.url, properties)
   }
+
+  override def needsModifySecurityConfiguration(
+driver: Driver,
+options: Map[String, String]
+  ): Boolean = {
+val jdbcOptions = new JDBCOptions(options)
+jdbcOptions.keytab == null || jdbcOptions.principal == null

Review comment:
   This could also just return `false`, but I'm kind of confused about 
what's going on here. [This 
doc](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md#:~:text=Built%2Din%20CPs%20added%20which%20support%20kerberos%20authentication%20using%20keytab%20and%20principal)
 suggests that built-in CPs support kerberos auth, but if I understand 
correctly that should require a `keytab` and ` principal`. I'm inclined to 
believe that this `JdbcConnectionProvider` doesn't actually support kerberos 
auth, but maybe I'm misunderstanding.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] tdg5 commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


tdg5 commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758508360



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcConnectionProvider.scala
##
@@ -53,12 +53,25 @@ abstract class JdbcConnectionProvider {
   def canHandle(driver: Driver, options: Map[String, String]): Boolean
 
   /**
-   * Opens connection toward the database. Since global JVM security 
configuration change may needed
-   * this API is called synchronized by `SecurityConfigurationLock` to avoid 
race.
+   * Opens connection to the database. Since global JVM security configuration 
change may be
+   * needed this API is called synchronized by `SecurityConfigurationLock` to 
avoid race when
+   * `needsModifySecurityConfiguration` returns true for the given driver with 
the given options.
*
* @param driver  Java driver which initiates the connection
* @param options Driver options which initiates the connection
* @return a `Connection` object that represents a connection to the URL
*/
   def getConnection(driver: Driver, options: Map[String, String]): Connection
+
+  /**
+   * Checks if this connection provider instance needs to modify global 
security configuration to
+   * handle authentication and thus should synchronize access to the security 
configuration while
+   * the given driver is initiating a connection with the given options.
+   *
+   * @param driver  Java driver which initiates the connection
+   * @param options Driver options which initiates the connection
+   * @return True if the connection provider will need to modify the security 
configuration when
+   * initiating a connection with the given driver with the given options.
+   */
+  def needsModifySecurityConfiguration(driver: Driver, options: Map[String, 
String]): Boolean

Review comment:
   I'm open to other naming ideas.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org