[GitHub] [spark] imback82 commented on a change in pull request #29328: [WIP][SPARK-32516][SQL] 'path' option should be treated consistently when loading dataframes for different APIs

2020-08-06 Thread GitBox


imback82 commented on a change in pull request #29328:
URL: https://github.com/apache/spark/pull/29328#discussion_r466634683



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##
@@ -229,7 +229,7 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
*/
   def load(path: String): DataFrame = {
 // force invocation of `load(...varargs...)`
-option("path", path).load(Seq.empty: _*)

Review comment:
   I pushed a new commit that doesn't change the existing code except for 
the check. Please let me know what you think. Thanks!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] imback82 commented on a change in pull request #29328: [WIP][SPARK-32516][SQL] 'path' option should be treated consistently when loading dataframes for different APIs

2020-08-06 Thread GitBox


imback82 commented on a change in pull request #29328:
URL: https://github.com/apache/spark/pull/29328#discussion_r466634683



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##
@@ -229,7 +229,7 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
*/
   def load(path: String): DataFrame = {
 // force invocation of `load(...varargs...)`
-option("path", path).load(Seq.empty: _*)

Review comment:
   I pushed a new commit that doesn't change existing code except for the 
check. Please let me know what you think. Thanks!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] imback82 commented on a change in pull request #29328: [WIP][SPARK-32516][SQL] 'path' option should be treated consistently when loading dataframes for different APIs

2020-08-06 Thread GitBox


imback82 commented on a change in pull request #29328:
URL: https://github.com/apache/spark/pull/29328#discussion_r466623120



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
##
@@ -224,17 +224,6 @@ class DataFrameReaderWriterSuite extends QueryTest with 
SharedSparkSession with
 assert(LastOptions.parameters("opt3") == "3")
   }
 
-  test("SPARK-32364: path argument of load function should override all 
existing options") {

Review comment:
   OK. I put this back and renamed the test. Thanks.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] imback82 commented on a change in pull request #29328: [WIP][SPARK-32516][SQL] 'path' option should be treated consistently when loading dataframes for different APIs

2020-08-06 Thread GitBox


imback82 commented on a change in pull request #29328:
URL: https://github.com/apache/spark/pull/29328#discussion_r466622862



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##
@@ -229,7 +229,7 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
*/
   def load(path: String): DataFrame = {
 // force invocation of `load(...varargs...)`
-option("path", path).load(Seq.empty: _*)

Review comment:
   If we update L257, it will only handle the case when 
`DataSource.lookupDataSourceV2` returns `Some`?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] imback82 commented on a change in pull request #29328: [WIP][SPARK-32516][SQL] 'path' option should be treated consistently when loading dataframes for different APIs

2020-08-05 Thread GitBox


imback82 commented on a change in pull request #29328:
URL: https://github.com/apache/spark/pull/29328#discussion_r466125984



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala
##
@@ -826,27 +826,6 @@ class FileBasedDataSourceSuite extends QueryTest
 }
   }
 
-  test("File table location should include both values of option `path` and 
`paths`") {

Review comment:
   @cloud-fan This is a test that's failing. Looks like this was a 
supported scenario and is it OK to deprecate this behavior?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] imback82 commented on a change in pull request #29328: [WIP][SPARK-32516][SQL] 'path' option should be treated consistently when loading dataframes for different APIs

2020-08-05 Thread GitBox


imback82 commented on a change in pull request #29328:
URL: https://github.com/apache/spark/pull/29328#discussion_r466122116



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
##
@@ -224,17 +224,6 @@ class DataFrameReaderWriterSuite extends QueryTest with 
SharedSparkSession with
 assert(LastOptions.parameters("opt3") == "3")
   }
 
-  test("SPARK-32364: path argument of load function should override all 
existing options") {

Review comment:
   This is no longer relevant, so I am removing it. But I am keeping the 
case insensitivity check in the new test.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] imback82 commented on a change in pull request #29328: [WIP][SPARK-32516][SQL] 'path' option should be treated consistently when loading dataframes for different APIs

2020-08-02 Thread GitBox


imback82 commented on a change in pull request #29328:
URL: https://github.com/apache/spark/pull/29328#discussion_r464203736



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##
@@ -245,15 +245,22 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
 "read files of Hive data source directly.")
 }
 
+val updatedPaths = if (paths.length == 1) {

Review comment:
   +1 for your suggestion





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org