[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21875


---




[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread TomaszGaweda
Github user TomaszGaweda commented on a diff in the pull request:

https://github.com/apache/spark/pull/21875#discussion_r205269695
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala ---
@@ -172,7 +172,11 @@ private[sql] case class JDBCRelation(
 
   // Check if JDBCRDD.compileFilter can accept input filters
   override def unhandledFilters(filters: Array[Filter]): Array[Filter] = {
-    filters.filter(JDBCRDD.compileFilter(_, JdbcDialects.get(jdbcOptions.url)).isEmpty)
+    if (jdbcOptions.pushDownPredicate) {
--- End diff --

Indeed, it's confusing. The `buildScan` argument could be renamed to `pushedFilters` (and the local variables likewise); then the code would be self-describing.
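
A minimal sketch of what the suggested rename might look like inside `JDBCRelation` (hypothetical, not part of this patch); the `scanTable` call mirrors the surrounding file, and the parameter name alone documents what the filters are:

  // Hypothetical rename: `pushedFilters` makes explicit that buildScan only
  // receives the filters the planner decided to push down, i.e. those NOT
  // returned by unhandledFilters.
  override def buildScan(requiredColumns: Array[String], pushedFilters: Array[Filter]): RDD[Row] = {
    // Rely on a type erasure hack to pass RDD[InternalRow] back as RDD[Row]
    JDBCRDD.scanTable(
      sparkSession.sparkContext,
      schema,
      requiredColumns,
      pushedFilters,
      parts,
      jdbcOptions).asInstanceOf[RDD[Row]]
  }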


---







[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread maryannxue
Github user maryannxue commented on a diff in the pull request:

https://github.com/apache/spark/pull/21875#discussion_r205268701
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala ---
@@ -172,7 +172,11 @@ private[sql] case class JDBCRelation(
 
   // Check if JDBCRDD.compileFilter can accept input filters
   override def unhandledFilters(filters: Array[Filter]): Array[Filter] = {
-    filters.filter(JDBCRDD.compileFilter(_, JdbcDialects.get(jdbcOptions.url)).isEmpty)
+    if (jdbcOptions.pushDownPredicate) {
--- End diff --

No. I actually share your opinion. It is confusing here... maybe we should change the parameter names at some point.


---




[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread TomaszGaweda
Github user TomaszGaweda commented on a diff in the pull request:

https://github.com/apache/spark/pull/21875#discussion_r205268370
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -183,6 +183,9 @@ class JDBCOptions(
   }
   // An option to execute custom SQL before fetching data from the remote DB
   val sessionInitStatement = parameters.get(JDBC_SESSION_INIT_STATEMENT)
+
+  // An option to allow/disallow pushing down predicate into JDBC data source
+  val pushDownPredicate = parameters.getOrElse(JDBC_PUSHDOWN_PREDICATE, "true").toBoolean
--- End diff --

Yeah, consistency is a very good argument :) Indeed, it will be plural or singular depending on which side we are looking at it from.


---




[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread TomaszGaweda
Github user TomaszGaweda commented on a diff in the pull request:

https://github.com/apache/spark/pull/21875#discussion_r205267937
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala ---
@@ -172,7 +172,11 @@ private[sql] case class JDBCRelation(
 
   // Check if JDBCRDD.compileFilter can accept input filters
   override def unhandledFilters(filters: Array[Filter]): Array[Filter] = {
-    filters.filter(JDBCRDD.compileFilter(_, JdbcDialects.get(jdbcOptions.url)).isEmpty)
+    if (jdbcOptions.pushDownPredicate) {
--- End diff --

I see now, my mistake. Thanks for the clarification :)


---




[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread maryannxue
Github user maryannxue commented on a diff in the pull request:

https://github.com/apache/spark/pull/21875#discussion_r205267327
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -183,6 +183,9 @@ class JDBCOptions(
   }
   // An option to execute custom SQL before fetching data from the remote DB
   val sessionInitStatement = parameters.get(JDBC_SESSION_INIT_STATEMENT)
+
+  // An option to allow/disallow pushing down predicate into JDBC data source
+  val pushDownPredicate = parameters.getOrElse(JDBC_PUSHDOWN_PREDICATE, "true").toBoolean
--- End diff --

Or one could argue that "predicate" refers to all of the filters as a whole. It's a nice reminder, though. I hadn't thought about it, but I just checked: we use `PushDownPredicate` and the singular form in similar rules, so maybe we should keep it singular here too?
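
As a usage illustration, a hedged sketch: the connection details are hypothetical, `spark` is an existing SparkSession, and the option key `pushDownPredicate` (default true) is assumed from the `JDBC_PUSHDOWN_PREDICATE` constant in the diff above:

  // Hypothetical URL and table; only the option key matters here.
  val df = spark.read
    .format("jdbc")
    .option("url", "jdbc:postgresql://localhost/test")
    .option("dbtable", "employees")
    .option("pushDownPredicate", "false")  // opt out of filter pushdown
    .load()

  // With pushdown disabled, this predicate is evaluated by Spark after the
  // scan instead of being compiled into the JDBC WHERE clause.
  df.filter("salary > 10000").show()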


---




[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread maryannxue
Github user maryannxue commented on a diff in the pull request:

https://github.com/apache/spark/pull/21875#discussion_r205266067
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala ---
@@ -172,7 +172,11 @@ private[sql] case class JDBCRelation(
 
   // Check if JDBCRDD.compileFilter can accept input filters
   override def unhandledFilters(filters: Array[Filter]): Array[Filter] = {
-    filters.filter(JDBCRDD.compileFilter(_, JdbcDialects.get(jdbcOptions.url)).isEmpty)
+    if (jdbcOptions.pushDownPredicate) {
--- End diff --

Yes, this is the only source of truth for defining handled/unhandled filters. The caller calls this method and pushes the "handled" filters down to the table scan, in this case `JDBCRDD.scanTable`.
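
For reference, a sketch of how the whole method plausibly reads after this change; the `else` branch is reconstructed from the hunk above and the behavior described here, so treat it as an assumption rather than the exact patch:

  // Check if JDBCRDD.compileFilter can accept input filters
  override def unhandledFilters(filters: Array[Filter]): Array[Filter] = {
    if (jdbcOptions.pushDownPredicate) {
      // Filters the dialect cannot compile into SQL are unhandled; Spark
      // re-evaluates them on top of the scan.
      filters.filter(JDBCRDD.compileFilter(_, JdbcDialects.get(jdbcOptions.url)).isEmpty)
    } else {
      // Pushdown disabled: report every filter as unhandled so Spark
      // evaluates all predicates itself and nothing reaches the WHERE clause.
      filters
    }
  }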


---
