[GitHub] [spark] maropu commented on a change in pull request #28863: [SPARK-31336][SQL] Support Oracle Kerberos login in JDBC connector

2020-06-19 Thread GitBox


maropu commented on a change in pull request #28863:
URL: https://github.com/apache/spark/pull/28863#discussion_r443104744



##
File path: external/docker-integration-tests/pom.xml
##
@@ -130,15 +130,9 @@
   postgresql
   test
 
-
-
-  com.oracle
-  ojdbc6
-  11.2.0.1.0
+
+  com.oracle.database.jdbc
+  ojdbc8

Review comment:
   +1; this fix looks separate from this PR.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646945193


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124308/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646945191


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646945191










[GitHub] [spark] SparkQA commented on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


SparkQA commented on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646945067


   **[Test build #124308 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124308/testReport)** for PR 28616 at commit [`4fd056d`](https://github.com/apache/spark/commit/4fd056dffc46e3db9547133c97589e0e2aba7f77).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


SparkQA removed a comment on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646927958


   **[Test build #124308 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124308/testReport)** for PR 28616 at commit [`4fd056d`](https://github.com/apache/spark/commit/4fd056dffc46e3db9547133c97589e0e2aba7f77).






[GitHub] [spark] gatorsmile commented on pull request #28850: [SPARK-32015][Core]Remote inheritable thread local variables after spark context is stopped

2020-06-19 Thread GitBox


gatorsmile commented on pull request #28850:
URL: https://github.com/apache/spark/pull/28850#issuecomment-646944392


   cc @Ngone51 






[GitHub] [spark] gatorsmile commented on pull request #28863: [SPARK-31336][SQL] Support Oracle Kerberos login in JDBC connector

2020-06-19 Thread GitBox


gatorsmile commented on pull request #28863:
URL: https://github.com/apache/spark/pull/28863#issuecomment-646940577


   cc @maropu @MaxGekk 






[GitHub] [spark] gatorsmile commented on pull request #28860: [SPARK-32002][SQL]Support ExtractValue from nested ArrayStruct

2020-06-19 Thread GitBox


gatorsmile commented on pull request #28860:
URL: https://github.com/apache/spark/pull/28860#issuecomment-646940477


   cc @MaxGekk @HyukjinKwon 






[GitHub] [spark] gatorsmile commented on pull request #28859: [SPARK-32024][WEBUI] Update ApplicationStoreInfo.size during HistoryServerDiskManager initializing

2020-06-19 Thread GitBox


gatorsmile commented on pull request #28859:
URL: https://github.com/apache/spark/pull/28859#issuecomment-646937829


   cc @gengliangwang 






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28840:
URL: https://github.com/apache/spark/pull/28840#issuecomment-646937453










[GitHub] [spark] AmplabJenkins commented on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28840:
URL: https://github.com/apache/spark/pull/28840#issuecomment-646937453










[GitHub] [spark] SparkQA commented on pull request #28868: [SPARK-32029][SQL] Make active session null when application end

2020-06-19 Thread GitBox


SparkQA commented on pull request #28868:
URL: https://github.com/apache/spark/pull/28868#issuecomment-646937285


   **[Test build #124312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124312/testReport)** for PR 28868 at commit [`a59119f`](https://github.com/apache/spark/commit/a59119f7369a7e8560c179698c85b9a7437899d5).






[GitHub] [spark] SparkQA commented on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-19 Thread GitBox


SparkQA commented on pull request #28840:
URL: https://github.com/apache/spark/pull/28840#issuecomment-646937314


   **[Test build #124313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124313/testReport)** for PR 28840 at commit [`6cb2edd`](https://github.com/apache/spark/commit/6cb2edd60e1a74ee7f0464d816d97b72d4a20ef3).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28868: [SPARK-32029][SQL] Make active session null when application end

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28868:
URL: https://github.com/apache/spark/pull/28868#issuecomment-646936353










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28853: [SPARK-32019][SQL] Add spark.sql.files.minPartitionNum config

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28853:
URL: https://github.com/apache/spark/pull/28853#issuecomment-646936393










[GitHub] [spark] AmplabJenkins commented on pull request #28868: [SPARK-32029][SQL] Make active session null when application end

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28868:
URL: https://github.com/apache/spark/pull/28868#issuecomment-646936353










[GitHub] [spark] AmplabJenkins commented on pull request #28853: [SPARK-32019][SQL] Add spark.sql.files.minPartitionNum config

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28853:
URL: https://github.com/apache/spark/pull/28853#issuecomment-646936393










[GitHub] [spark] SparkQA commented on pull request #28853: [SPARK-32019][SQL] Add spark.sql.files.minPartitionNum config

2020-06-19 Thread GitBox


SparkQA commented on pull request #28853:
URL: https://github.com/apache/spark/pull/28853#issuecomment-646936144


   **[Test build #124311 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124311/testReport)** for PR 28853 at commit [`1fb9dc6`](https://github.com/apache/spark/commit/1fb9dc651d5e1041fef8612bdbd3299dcea494a5).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28804: [SPARK-31973][SQL] Add ability to disable Sort,Spill in Partial aggregation

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28804:
URL: https://github.com/apache/spark/pull/28804#issuecomment-646935915










[GitHub] [spark] AmplabJenkins commented on pull request #28804: [SPARK-31973][SQL] Add ability to disable Sort,Spill in Partial aggregation

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28804:
URL: https://github.com/apache/spark/pull/28804#issuecomment-646935915










[GitHub] [spark] SparkQA removed a comment on pull request #28804: [SPARK-31973][SQL] Add ability to disable Sort,Spill in Partial aggregation

2020-06-19 Thread GitBox


SparkQA removed a comment on pull request #28804:
URL: https://github.com/apache/spark/pull/28804#issuecomment-646893545


   **[Test build #124304 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124304/testReport)** for PR 28804 at commit [`56c95e2`](https://github.com/apache/spark/commit/56c95e242126d7aacdb4862adc5e094b4e29561b).






[GitHub] [spark] SparkQA commented on pull request #28804: [SPARK-31973][SQL] Add ability to disable Sort,Spill in Partial aggregation

2020-06-19 Thread GitBox


SparkQA commented on pull request #28804:
URL: https://github.com/apache/spark/pull/28804#issuecomment-646935525


   **[Test build #124304 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124304/testReport)** for PR 28804 at commit [`56c95e2`](https://github.com/apache/spark/commit/56c95e242126d7aacdb4862adc5e094b4e29561b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] siknezevic edited a comment on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-06-19 Thread GitBox


siknezevic edited a comment on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-646931795


   > > Could you please let me know would it be OK to hard-code the read buffer size to 1024?
   > 
   > You think the performance is independent of running platforms, e.g., CPU arch and disk I/O? I'm not 100% sure that the `1024` value is the best on our supported platforms...
   > 
   > > With 10TB TPCDS data set I tested spilling with query q14a and buffer size of 1024. Execution with hard-coded read buffer size is faster by 37% (27 min vs 37 min) comparing to the execution when buffer size is parameterized and the same size 1024 is used. Query q14a, for 10TB data set, generates around 180 million joins per partition and when buffer size is parameterized, that translates into 10 min longer execution time.
   > 
   > Why does the parameterized one have so much overhead?
   
   Not sure. It looks like the call into package.scala to read the parameter takes some time, and that time is large enough to cause a performance hit because the call is executed for each join row. With the 10TB data set there are around 180 million rows per partition. The only code change I made was to pass 1024 directly when the array is created and to comment out the call that reads the parameter from package.scala. That is all.
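
   The effect described above (a setting being read once per row instead of once per operator) can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: `Settings`, `BufferKey`, `perRowLookup` and `hoistedLookup` are hypothetical names and are not Spark's configuration API or the code in this PR. It only shows that hoisting the read out of the loop keeps the value parameterized while reducing the number of lookups from one per row to one in total.

```scala
object BufferSizeSketch {
  // Hypothetical stand-in for a configuration store. This is NOT Spark's API.
  final class Settings(values: Map[String, String]) {
    var lookups: Long = 0L // counts reads so the per-row cost is visible

    def getInt(key: String, default: Int): Int = {
      lookups += 1
      values.get(key).map(_.toInt).getOrElse(default)
    }
  }

  // Made-up key, used only for this illustration.
  val BufferKey = "example.spill.readBufferSize"

  // Variant A: the setting is read again for every row.
  def perRowLookup(conf: Settings, rows: Int): Long = {
    var allocated = 0L
    var i = 0
    while (i < rows) {
      val buf = new Array[Byte](conf.getInt(BufferKey, 1024))
      allocated += buf.length
      i += 1
    }
    allocated
  }

  // Variant B: the setting is read once and the value is reused for every row.
  def hoistedLookup(conf: Settings, rows: Int): Long = {
    val size = conf.getInt(BufferKey, 1024)
    var allocated = 0L
    var i = 0
    while (i < rows) {
      val buf = new Array[Byte](size)
      allocated += buf.length
      i += 1
    }
    allocated
  }

  def main(args: Array[String]): Unit = {
    val rows = 100000
    val a = new Settings(Map(BufferKey -> "1024"))
    val bytesA = perRowLookup(a, rows)
    val b = new Settings(Map(BufferKey -> "1024"))
    val bytesB = hoistedLookup(b, rows)
    println(s"per-row: ${a.lookups} lookups, $bytesA bytes")
    println(s"hoisted: ${b.lookups} lookups, $bytesB bytes")
  }
}
```

   Both variants allocate the same total number of bytes; only the lookup count differs. Whether this accounts for the 10 min difference reported above would need to be confirmed by profiling the actual operator.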






[GitHub] [spark] TJX2014 commented on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-19 Thread GitBox


TJX2014 commented on pull request #28819:
URL: https://github.com/apache/spark/pull/28819#issuecomment-646934894


   Thanks, I will make a PR for branch-2.4.






[GitHub] [spark] TJX2014 removed a comment on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-19 Thread GitBox


TJX2014 removed a comment on pull request #28819:
URL: https://github.com/apache/spark/pull/28819#issuecomment-646933713


   @dongjoon-hyun Thanks, I am willing to. I still have a question; could you please help me check? Should I use `CalendarInterval.fromString` instead of `stringToInterval` on `master` as well, or use `CalendarInterval.fromString` only on `branch-2.4` and keep `stringToInterval` on `master`?






[GitHub] [spark] TJX2014 commented on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-19 Thread GitBox


TJX2014 commented on pull request #28819:
URL: https://github.com/apache/spark/pull/28819#issuecomment-646933713


   @dongjoon-hyun Thanks, I am willing to. I still have a question; could you please help me check? Should I use `CalendarInterval.fromString` instead of `stringToInterval` on `master` as well, or use `CalendarInterval.fromString` only on `branch-2.4` and keep `stringToInterval` on `master`?






[GitHub] [spark] siknezevic commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-06-19 Thread GitBox


siknezevic commented on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-646931795


   > > Could you please let me know would it be OK to hard-code the read buffer size to 1024?
   > 
   > You think the performance is independent of running platforms, e.g., CPU arch and disk I/O? I'm not 100% sure that the `1024` value is the best on our supported platforms...
   > 
   > > With 10TB TPCDS data set I tested spilling with query q14a and buffer size of 1024. Execution with hard-coded read buffer size is faster by 37% (27 min vs 37 min) comparing to the execution when buffer size is parameterized and the same size 1024 is used. Query q14a, for 10TB data set, generates around 180 million joins per partition and when buffer size is parameterized, that translates into 10 min longer execution time.
   > 
   > Why does the parameterized one have so much overhead?
   
   Not sure. It looks like the call into package.scala to read the parameter takes some time, and that time is large enough to cause a performance hit because the call is executed for each join row. With the 10TB data set there are around 180 million rows per partition.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-646931118










[GitHub] [spark] AmplabJenkins commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-646931118










[GitHub] [spark] dilipbiswal commented on a change in pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-06-19 Thread GitBox


dilipbiswal commented on a change in pull request #28683:
URL: https://github.com/apache/spark/pull/28683#discussion_r443096878



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -2065,6 +2065,15 @@ object SQLConf {
   .booleanConf
   .createWithDefault(true)
 
+  val OPTIMIZER_HINTS_ENABLED =
+buildConf("spark.sql.optimizer.hints.enabled")

Review comment:
   @dongjoon-hyun Thank you. I have made the change.








[GitHub] [spark] SparkQA commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-06-19 Thread GitBox


SparkQA commented on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-646930991


   **[Test build #124310 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124310/testReport)** for PR 28683 at commit [`dd06548`](https://github.com/apache/spark/commit/dd06548de900ec8fc103624d4cd1d5792448c820).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646929406


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124309/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


SparkQA removed a comment on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646929024


   **[Test build #124309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124309/testReport)** for PR 28528 at commit [`e2ebe65`](https://github.com/apache/spark/commit/e2ebe658c7352d393ded48a63f37a85b23e199d3).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646929403


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646929403










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646929262










[GitHub] [spark] SparkQA commented on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


SparkQA commented on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646929396


   **[Test build #124309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124309/testReport)** for PR 28528 at commit [`e2ebe65`](https://github.com/apache/spark/commit/e2ebe658c7352d393ded48a63f37a85b23e199d3).
* This patch **fails build dependency tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646929262










[GitHub] [spark] SparkQA commented on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


SparkQA commented on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646929024


   **[Test build #124309 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124309/testReport)**
 for PR 28528 at commit 
[`e2ebe65`](https://github.com/apache/spark/commit/e2ebe658c7352d393ded48a63f37a85b23e199d3).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646928295


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124307/
   Test FAILed.






[GitHub] [spark] dongjoon-hyun commented on pull request #28528: [SPARK-31711][CORE] Register the executor source with the metrics system when running in local mode.

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28528:
URL: https://github.com/apache/spark/pull/28528#issuecomment-646928529


   Retest this please.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646928291


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


SparkQA removed a comment on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646927922


   **[Test build #124307 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124307/testReport)**
 for PR 28618 at commit 
[`e7c9988`](https://github.com/apache/spark/commit/e7c998814df365c726d6615da4b7a14b2ba2167c).






[GitHub] [spark] AmplabJenkins commented on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646928291










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646928173










[GitHub] [spark] SparkQA commented on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


SparkQA commented on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646928282


   **[Test build #124307 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124307/testReport)**
 for PR 28618 at commit 
[`e7c9988`](https://github.com/apache/spark/commit/e7c998814df365c726d6615da4b7a14b2ba2167c).
* This patch **fails build dependency tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646928142










[GitHub] [spark] AmplabJenkins commented on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646928173










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646928142










[GitHub] [spark] SparkQA commented on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


SparkQA commented on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646927958


   **[Test build #124308 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124308/testReport)**
 for PR 28616 at commit 
[`4fd056d`](https://github.com/apache/spark/commit/4fd056dffc46e3db9547133c97589e0e2aba7f77).






[GitHub] [spark] SparkQA commented on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


SparkQA commented on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646927922


   **[Test build #124307 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124307/testReport)**
 for PR 28618 at commit 
[`e7c9988`](https://github.com/apache/spark/commit/e7c998814df365c726d6615da4b7a14b2ba2167c).






[GitHub] [spark] yaooqinn commented on pull request #28784: [SPARK-31957][SQL] Cleanup hive scratch dir for the developer api startWithContext

2020-06-19 Thread GitBox


yaooqinn commented on pull request #28784:
URL: https://github.com/apache/spark/pull/28784#issuecomment-646927538


   Thank you all for reviewing and merging 






[GitHub] [spark] dongjoon-hyun commented on pull request #28616: [SPARK-31798][SHUFFLE][API] Shuffle Writer API changes to return custom map output metadata

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28616:
URL: https://github.com/apache/spark/pull/28616#issuecomment-646927265


   Retest this please.






[GitHub] [spark] dongjoon-hyun commented on pull request #28618: [SPARK-31801][WIP][API][SHUFFLE] Register map output metadata

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28618:
URL: https://github.com/apache/spark/pull/28618#issuecomment-646927049


   Retest this please.






[GitHub] [spark] dongjoon-hyun closed pull request #28784: [SPARK-31957][SQL] Cleanup hive scratch dir for the developer api startWithContext

2020-06-19 Thread GitBox


dongjoon-hyun closed pull request #28784:
URL: https://github.com/apache/spark/pull/28784


   






[GitHub] [spark] dongjoon-hyun commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-646925511


   Thank you, @dilipbiswal . The feature looks useful to me.






[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-06-19 Thread GitBox


dongjoon-hyun commented on a change in pull request #28683:
URL: https://github.com/apache/spark/pull/28683#discussion_r443094576



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -2065,6 +2065,15 @@ object SQLConf {
   .booleanConf
   .createWithDefault(true)
 
+  val OPTIMIZER_HINTS_ENABLED =
+buildConf("spark.sql.optimizer.hints.enabled")

Review comment:
   Can we use a more direct name like `OPTIMIZER_IGNORE_HINTS`? Maybe 
`spark.sql.optimizer.ignoreHints.enabled`, similar to 
`spark.files.ignoreMissingFiles` or `spark.files.ignoreCorruptFiles`?
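
A minimal sketch of how the two candidate names might read at a call site. It uses plain Scala only: `BoolConf` is a hypothetical stand-in, not Spark's config builder, and the default values are assumptions made for illustration.

```scala
// Hypothetical stand-in for a boolean config entry; this is NOT Spark's ConfigBuilder API.
final case class BoolConf(key: String, default: Boolean) {
  // Look the key up in a settings map, falling back to the default.
  def get(settings: Map[String, String]): Boolean =
    settings.get(key).map(_.toBoolean).getOrElse(default)
}

object HintConfNaming {
  // Name from the diff: hints apply unless the flag is switched off (default assumed).
  val OptimizerHintsEnabled: BoolConf =
    BoolConf("spark.sql.optimizer.hints.enabled", default = true)

  // Name suggested in the review: hints apply unless they are explicitly ignored (default assumed).
  val OptimizerIgnoreHints: BoolConf =
    BoolConf("spark.sql.optimizer.ignoreHints.enabled", default = false)

  def main(args: Array[String]): Unit = {
    val settings = Map("spark.sql.optimizer.ignoreHints.enabled" -> "true")
    // With the "ignore" naming, the condition at the call site is negated.
    val applyHints = !OptimizerIgnoreHints.get(settings)
    println(s"apply hints: $applyHints")
  }
}
```

With the `ignoreHints` form the key follows the existing `ignore*` configs, at the cost of a negation wherever the flag is read.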








[GitHub] [spark] dongjoon-hyun commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-646924782


   Could you resolve a conflict, @dilipbiswal ?






[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28873: [SPARK-32021][SQL] Increase precision of seconds and fractions of `make_interval`

2020-06-19 Thread GitBox


dongjoon-hyun edited a comment on pull request #28873:
URL: https://github.com/apache/spark/pull/28873#issuecomment-646923926


   Could you make a backporting PR to branch-3.0? We need to regenerate some 
files.






[GitHub] [spark] dongjoon-hyun commented on pull request #28873: [SPARK-32021][SQL] Increase precision of seconds and fractions of `make_interval`

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28873:
URL: https://github.com/apache/spark/pull/28873#issuecomment-646923926


   Could you make a backporting PR to branch-3.0?






[GitHub] [spark] dongjoon-hyun closed pull request #28873: [SPARK-32021][SQL] Increase precision of seconds and fractions of `make_interval`

2020-06-19 Thread GitBox


dongjoon-hyun closed pull request #28873:
URL: https://github.com/apache/spark/pull/28873


   






[GitHub] [spark] dongjoon-hyun commented on pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28819:
URL: https://github.com/apache/spark/pull/28819#issuecomment-646923541


   Hi, @TJX2014 .
   In branch-2.4, `stringToInterval` doesn't exist. Could you make a PR for 
Apache Spark 2.4.7 please?






[GitHub] [spark] holdenk commented on pull request #28874: [SPARK-32036] Replace references to blacklist/whitelist language with more appropriate terminology, excluding the blacklisting feature.

2020-06-19 Thread GitBox


holdenk commented on pull request #28874:
URL: https://github.com/apache/spark/pull/28874#issuecomment-646923353


   Sure, I'm taking this weekend away from coding so I'll get to this early 
next week.






[GitHub] [spark] dongjoon-hyun closed pull request #28819: [SPARK-31980][SQL]Function sequence() fails if start and end of range are equal dates

2020-06-19 Thread GitBox


dongjoon-hyun closed pull request #28819:
URL: https://github.com/apache/spark/pull/28819


   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28873: [SPARK-32021][SQL] Increase precision of seconds and fractions of `make_interval`

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28873:
URL: https://github.com/apache/spark/pull/28873#issuecomment-646920959










[GitHub] [spark] AmplabJenkins commented on pull request #28873: [SPARK-32021][SQL] Increase precision of seconds and fractions of `make_interval`

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28873:
URL: https://github.com/apache/spark/pull/28873#issuecomment-646920959










[GitHub] [spark] SparkQA removed a comment on pull request #28873: [SPARK-32021][SQL] Increase precision of seconds and fractions of `make_interval`

2020-06-19 Thread GitBox


SparkQA removed a comment on pull request #28873:
URL: https://github.com/apache/spark/pull/28873#issuecomment-646867588


   **[Test build #124301 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124301/testReport)**
 for PR 28873 at commit 
[`3764477`](https://github.com/apache/spark/commit/376447782460cabd206dee59e0668b1af96c7683).






[GitHub] [spark] SparkQA commented on pull request #28873: [SPARK-32021][SQL] Increase precision of seconds and fractions of `make_interval`

2020-06-19 Thread GitBox


SparkQA commented on pull request #28873:
URL: https://github.com/apache/spark/pull/28873#issuecomment-646920751


   **[Test build #124301 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124301/testReport)**
 for PR 28873 at commit 
[`3764477`](https://github.com/apache/spark/commit/376447782460cabd206dee59e0668b1af96c7683).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28868: [SPARK-32029][SQL] Make active session null when application end

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28868:
URL: https://github.com/apache/spark/pull/28868#issuecomment-646916561


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124302/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #28868: [SPARK-32029][SQL] Make active session null when application end

2020-06-19 Thread GitBox


SparkQA removed a comment on pull request #28868:
URL: https://github.com/apache/spark/pull/28868#issuecomment-646881818


   **[Test build #124302 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124302/testReport)**
 for PR 28868 at commit 
[`4d7ff58`](https://github.com/apache/spark/commit/4d7ff588b27ee054e9812e0d667d67002265676a).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28868: [SPARK-32029][SQL] Make active session null when application end

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28868:
URL: https://github.com/apache/spark/pull/28868#issuecomment-646916553


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28868: [SPARK-32029][SQL] Make active session null when application end

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28868:
URL: https://github.com/apache/spark/pull/28868#issuecomment-646916553










[GitHub] [spark] SparkQA commented on pull request #28868: [SPARK-32029][SQL] Make active session null when application end

2020-06-19 Thread GitBox


SparkQA commented on pull request #28868:
URL: https://github.com/apache/spark/pull/28868#issuecomment-646916428


   **[Test build #124302 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124302/testReport)**
 for PR 28868 at commit 
[`4d7ff58`](https://github.com/apache/spark/commit/4d7ff588b27ee054e9812e0d667d67002265676a).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] venkata91 commented on a change in pull request #28874: [SPARK-32036] Replace references to blacklist/whitelist language with more appropriate terminology, excluding the blacklisting

2020-06-19 Thread GitBox


venkata91 commented on a change in pull request #28874:
URL: https://github.com/apache/spark/pull/28874#discussion_r443089270



##
File path: core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala
##
@@ -48,24 +48,24 @@ import org.apache.spark.util.CallSite
 
 private[spark] class SparkUICssErrorHandler extends DefaultCssErrorHandler {
 
-  private val cssWhiteList = List("bootstrap.min.css", "vis-timeline-graph2d.min.css")
+  private val cssExcludeList = List("bootstrap.min.css", "vis-timeline-graph2d.min.css")

Review comment:
   Also, in some cases we seem to use `exclude` for both `whitelist` and 
`blacklist`. Here `exclude` is used for `whiteList`, and 
[here](https://github.com/apache/spark/blob/8f414bc6b4eeb59203ecb33c26c762a57bf5429e/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala#L542)
 it is used for `blackList`. In general, I prefer `allowedList` for `whitelist` 
and `denyList`, `rejectList`, or `stopList` for `blacklist`, which makes the 
code easier to comprehend quickly. I understand it's hard to use the same word 
everywhere because of the context.

##
File path: core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala
##
@@ -48,24 +48,24 @@ import org.apache.spark.util.CallSite
 
 private[spark] class SparkUICssErrorHandler extends DefaultCssErrorHandler {
 
-  private val cssWhiteList = List("bootstrap.min.css", "vis-timeline-graph2d.min.css")
+  private val cssExcludeList = List("bootstrap.min.css", "vis-timeline-graph2d.min.css")

Review comment:
   Same here: instead of `exclude`, how about `allowed`?

##
File path: R/pkg/tests/fulltests/test_sparkSQL.R
##
@@ -3921,14 +3921,14 @@ test_that("No extra files are created in SPARK_HOME by starting session and maki
   # before creating a SparkSession with enableHiveSupport = T at the top of this test file
   # (filesBefore). The test here is to compare that (filesBefore) against the list of files before
   # any test is run in run-all.R (sparkRFilesBefore).
-  # sparkRWhitelistSQLDirs is also defined in run-all.R, and should contain only 2 whitelisted dirs,
+  # sparkRIncludedSQLDirs is also defined in run-all.R, and should contain only 2 included dirs,

Review comment:
   Does it make sense to use `allowed` here instead of `included`?
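
The naming preference in the comments above can be sketched as follows. This is an illustration only: the two file names come from the diff, while `cssDenyList`, `legacy-theme.css`, and both helper methods are hypothetical and are not taken from `UISeleniumSuite`.

```scala
object ListNaming {
  // File names from the diff above; the identifier follows the "allowed" suggestion.
  val cssAllowedList: List[String] =
    List("bootstrap.min.css", "vis-timeline-graph2d.min.css")

  // Hypothetical deny list, added only to show the opposite polarity.
  val cssDenyList: List[String] = List("legacy-theme.css")

  // True when the URI points at a stylesheet on the allowed list.
  def isAllowed(uri: String): Boolean =
    cssAllowedList.exists(name => uri.endsWith(name))

  // True when the URI points at a stylesheet on the deny list.
  def isDenied(uri: String): Boolean =
    cssDenyList.exists(name => uri.endsWith(name))
}
```

With `allowed` and `deny` the polarity is visible in the identifier, whereas `exclude` could describe either list.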








[GitHub] [spark] xianyinxin commented on pull request #28875: [SPARK-32030][SQL] Support unlimited MATCHED and NOT MATCHED clauses in MERGE INTO

2020-06-19 Thread GitBox


xianyinxin commented on pull request #28875:
URL: https://github.com/apache/spark/pull/28875#issuecomment-646915406


   @cloud-fan @brkyvz , please take a look.






[GitHub] [spark] dongjoon-hyun commented on pull request #28826: [SPARK-31988][SQL] Schema pruning may discard attribute metadata

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28826:
URL: https://github.com/apache/spark/pull/28826#issuecomment-646914222


   BTW, @guykhazma . Is the example in the PR description enough? If I follow 
the directions, the metadata is not lost.
   ```
   scala> sc.version
   res1: String = 3.0.0
   ...
   {"key":"value"}
   {}
   {"key":"value"}
   {}
   ```






[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28826: [SPARK-31988][SQL] Schema pruning may discard attribute metadata

2020-06-19 Thread GitBox


dongjoon-hyun edited a comment on pull request #28826:
URL: https://github.com/apache/spark/pull/28826#issuecomment-646914222


   BTW, @guykhazma . Is the example in the PR description enough? If I follow 
the directions, the metadata is not lost. If that is insufficient, please add 
more steps for other people.
   ```
   scala> sc.version
   res1: String = 3.0.0
   ...
   {"key":"value"}
   {}
   {"key":"value"}
   {}
   ```






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28875: [SPARK-32030][SQL] Support unlimited MATCHED and NOT MATCHED clauses in MERGE INTO

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28875:
URL: https://github.com/apache/spark/pull/28875#issuecomment-646914223










[GitHub] [spark] AmplabJenkins commented on pull request #28875: [SPARK-32030][SQL] Support unlimited MATCHED and NOT MATCHED clauses in MERGE INTO

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28875:
URL: https://github.com/apache/spark/pull/28875#issuecomment-646914223










[GitHub] [spark] SparkQA commented on pull request #28875: [SPARK-32030][SQL] Support unlimited MATCHED and NOT MATCHED clauses in MERGE INTO

2020-06-19 Thread GitBox


SparkQA commented on pull request #28875:
URL: https://github.com/apache/spark/pull/28875#issuecomment-646914074


   **[Test build #124306 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124306/testReport)**
 for PR 28875 at commit 
[`e18a7a5`](https://github.com/apache/spark/commit/e18a7a52ccf3da3689b6bfc3a623c8d608814ab4).






[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28826: [SPARK-31988][SQL] Schema pruning may discard attribute metadata

2020-06-19 Thread GitBox


dongjoon-hyun edited a comment on pull request #28826:
URL: https://github.com/apache/spark/pull/28826#issuecomment-646913172


   +1, for @maropu 's suggestion. You can use your example in the PR 
description, @guykhazma . 
   
   Also, I have the same question as @viirya . This seems to affect both V1 
and V2. Please add a test case for both V1 and V2.






[GitHub] [spark] dongjoon-hyun commented on pull request #28826: [SPARK-31988][SQL] Schema pruning may discard attribute metadata

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28826:
URL: https://github.com/apache/spark/pull/28826#issuecomment-646913172


   +1, for @maropu 's suggestion. You can use your example in the PR 
description, @guykhazma . 






[GitHub] [spark] xianyinxin opened a new pull request #28875: [SPARK-32030][SQL] Support unlimited MATCHED and NOT MATCHED clauses in MERGE INTO

2020-06-19 Thread GitBox


xianyinxin opened a new pull request #28875:
URL: https://github.com/apache/spark/pull/28875


   ### What changes were proposed in this pull request?
   This PR adds support for unlimited MATCHED and NOT MATCHED clauses in the 
MERGE INTO statement.
   
   ### Why are the changes needed?
   Currently, the MERGE INTO syntax is:
   ```
   MERGE INTO [db_name.]target_table [AS target_alias]
USING [db_name.]source_table [] [AS source_alias]
ON 
[ WHEN MATCHED [ AND  ] THEN  ]
[ WHEN MATCHED [ AND  ] THEN  ]
[ WHEN NOT MATCHED [ AND  ] THEN  ]
   ```
   It would be nice to support unlimited MATCHED and NOT MATCHED clauses in the 
MERGE INTO statement, because users may want to handle different "AND 
"s, and the result then behaves like a series of "CASE WHEN"s. The 
expected syntax looks like
   ```
   MERGE INTO [db_name.]target_table [AS target_alias]
USING [db_name.]source_table [] [AS source_alias]
ON 
[when_clause [, ...]]
   ```
   where when_clause is
   ```
   WHEN MATCHED [ AND  ] THEN 
   ```
   or
   ```
   WHEN NOT MATCHED [ AND  ] THEN 
   ```
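
As a rough illustration of the "series of CASE WHENs" reading above, the following Scala sketch models how several WHEN MATCHED clauses might be resolved for a single matched row. It assumes first-match-wins ordering, which the description implies but does not state. `WhenClause`, `firstApplicable`, and the sample actions are hypothetical names for illustration only, not Spark APIs.

```scala
object MergeClausesSketch {
  // Hypothetical model of one "WHEN MATCHED [AND condition] THEN action" clause.
  // A clause without an AND condition is represented by condition = None.
  final case class WhenClause(condition: Option[Int => Boolean], action: String)

  // Assumption: clauses are tried in order and the first applicable one wins,
  // in the same way a CASE WHEN expression picks its first true branch.
  def firstApplicable(clauses: Seq[WhenClause], value: Int): Option[String] =
    clauses.find(c => c.condition.forall(p => p(value))).map(_.action)

  // Sample clause list: two conditional clauses followed by an unconditional one.
  val sample: Seq[WhenClause] = Seq(
    WhenClause(Some((v: Int) => v > 100), "DELETE"),
    WhenClause(Some((v: Int) => v > 10), "UPDATE large"),
    WhenClause(None, "UPDATE default")
  )

  def main(args: Array[String]): Unit = {
    Seq(500, 50, 1).foreach { v =>
      println(s"value=$v -> ${firstApplicable(sample, v)}")
    }
  }
}
```

With only a fixed number of clauses, the second and third branches here could not both be expressed, which is the limitation the proposal is meant to lift.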
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. The SQL command changes, but it is backward compatible.
   
   ### How was this patch tested?
   New tests added.
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28867: [SPARK-32028][WEBUI] fix app id link for multi attempts app in history summary page

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #28867:
URL: https://github.com/apache/spark/pull/28867#issuecomment-646911984










[GitHub] [spark] AmplabJenkins commented on pull request #28867: [SPARK-32028][WEBUI] fix app id link for multi attempts app in history summary page

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #28867:
URL: https://github.com/apache/spark/pull/28867#issuecomment-646911984










[GitHub] [spark] SparkQA removed a comment on pull request #28867: [SPARK-32028][WEBUI] fix app id link for multi attempts app in history summary page

2020-06-19 Thread GitBox


SparkQA removed a comment on pull request #28867:
URL: https://github.com/apache/spark/pull/28867#issuecomment-646881817


   **[Test build #124303 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124303/testReport)**
 for PR 28867 at commit 
[`ee5193e`](https://github.com/apache/spark/commit/ee5193e34d17e0bc5ce4a9fa3d323685fe71dfaf).






[GitHub] [spark] SparkQA commented on pull request #28867: [SPARK-32028][WEBUI] fix app id link for multi attempts app in history summary page

2020-06-19 Thread GitBox


SparkQA commented on pull request #28867:
URL: https://github.com/apache/spark/pull/28867#issuecomment-646911587


   **[Test build #124303 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124303/testReport)**
 for PR 28867 at commit 
[`ee5193e`](https://github.com/apache/spark/commit/ee5193e34d17e0bc5ce4a9fa3d323685fe71dfaf).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] dongjoon-hyun commented on pull request #28841: [SPARK-31962][SQL] Provide option to load files after a specified date when reading from a folder path

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-646911317


   BTW, welcome to the Apache Spark community and thank you for your first 
contribution, @cchighman .






[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28841: [SPARK-31962][SQL] Provide option to load files after a specified date when reading from a folder path

2020-06-19 Thread GitBox


dongjoon-hyun commented on a change in pull request #28841:
URL: https://github.com/apache/spark/pull/28841#discussion_r443087420



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
##
@@ -356,26 +380,35 @@ object InMemoryFileIndex extends Logging {
   bulkListLeafFiles(
 dirs.map(_.getPath),
 hadoopConf,
-filter,
+filters,
 session,
-areRootPaths = false
+areRootPaths = false,
+parameters = parameters
   ).flatMap(_._2)
 case _ =>
   dirs.flatMap { dir =>
 listLeafFiles(
   dir.getPath,
   hadoopConf,
-  filter,
+  filters,
   sessionOpt,
   ignoreMissingFiles = ignoreMissingFiles,
   ignoreLocality = ignoreLocality,
-  isRootPath = false)
+  isRootPath = false,
+  parameters = parameters)
   }
   }
+  val fileFilters = (filters ++ FileModifiedDateOption.accept(
+parameters,
+sessionOpt.get,

Review comment:
   You can choose some tests in the following report in order to verify 
your fix. Please push your commit after those unit tests pass locally.
   - 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124247/testReport/








[GitHub] [spark] cchighman commented on a change in pull request #28841: [SPARK-31962][SQL] Provide option to load files after a specified date when reading from a folder path

2020-06-19 Thread GitBox


cchighman commented on a change in pull request #28841:
URL: https://github.com/apache/spark/pull/28841#discussion_r443087412



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
##
@@ -356,26 +380,35 @@ object InMemoryFileIndex extends Logging {
   bulkListLeafFiles(
 dirs.map(_.getPath),
 hadoopConf,
-filter,
+filters,
 session,
-areRootPaths = false
+areRootPaths = false,
+parameters = parameters
   ).flatMap(_._2)
 case _ =>
   dirs.flatMap { dir =>
 listLeafFiles(
   dir.getPath,
   hadoopConf,
-  filter,
+  filters,
   sessionOpt,
   ignoreMissingFiles = ignoreMissingFiles,
   ignoreLocality = ignoreLocality,
-  isRootPath = false)
+  isRootPath = false,
+  parameters = parameters)
   }
   }
+  val fileFilters = (filters ++ FileModifiedDateOption.accept(
+parameters,
+sessionOpt.get,

Review comment:
   Yes, thank you.








[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28841: [SPARK-31962][SQL] Provide option to load files after a specified date when reading from a folder path

2020-06-19 Thread GitBox


dongjoon-hyun commented on a change in pull request #28841:
URL: https://github.com/apache/spark/pull/28841#discussion_r443087259



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala
##
@@ -356,26 +380,35 @@ object InMemoryFileIndex extends Logging {
   bulkListLeafFiles(
 dirs.map(_.getPath),
 hadoopConf,
-filter,
+filters,
 session,
-areRootPaths = false
+areRootPaths = false,
+parameters = parameters
   ).flatMap(_._2)
 case _ =>
   dirs.flatMap { dir =>
 listLeafFiles(
   dir.getPath,
   hadoopConf,
-  filter,
+  filters,
   sessionOpt,
   ignoreMissingFiles = ignoreMissingFiles,
   ignoreLocality = ignoreLocality,
-  isRootPath = false)
+  isRootPath = false,
+  parameters = parameters)
   }
   }
+  val fileFilters = (filters ++ FileModifiedDateOption.accept(
+parameters,
+sessionOpt.get,

Review comment:
   Hi, @cchighman . This is the root cause of the failure. Could you fix 
it and make the tests pass?
   ```
   java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:529)
at scala.None$.get(Option.scala:527)
at 
org.apache.spark.sql.execution.datasources.InMemoryFileIndex$.listLeafFiles(InMemoryFileIndex.scala:403)
   ```
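
For context on the stack trace above: calling `.get` on an empty `Option` throws `NoSuchElementException: None.get`. A minimal sketch of one way to avoid it is to pattern match on the option and fall back to no extra filters when it is empty. `FakeSession` and `extraFilters` are hypothetical stand-ins written for this example; they are not the PR's code or Spark's API, and the actual fix may look different.

```scala
object SessionOptSketch {
  // Hypothetical stand-in for a session object; not Spark's SparkSession.
  final case class FakeSession(timeZone: String)

  // Hypothetical helper: build additional filters only when a session exists,
  // instead of calling sessionOpt.get unconditionally.
  def extraFilters(sessionOpt: Option[FakeSession]): Seq[String] =
    sessionOpt match {
      case Some(session) => Seq(s"modifiedAfter(tz=${session.timeZone})")
      case None          => Seq.empty
    }

  def main(args: Array[String]): Unit = {
    // Reproduces the failure mode: .get on None throws NoSuchElementException.
    val unsafe = scala.util.Try(Option.empty[FakeSession].get)
    println(s"unsafe get failed: ${unsafe.isFailure}")

    // The guarded version handles both cases without throwing.
    println(extraFilters(None))
    println(extraFilters(Some(FakeSession("UTC"))))
  }
}
```

Whether an empty result is the right fallback depends on what the option is meant to do when no session is available, which is a design decision for the PR author.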








[GitHub] [spark] AmplabJenkins removed a comment on pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2020-06-19 Thread GitBox


AmplabJenkins removed a comment on pull request #23531:
URL: https://github.com/apache/spark/pull/23531#issuecomment-646910123










[GitHub] [spark] AmplabJenkins commented on pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2020-06-19 Thread GitBox


AmplabJenkins commented on pull request #23531:
URL: https://github.com/apache/spark/pull/23531#issuecomment-646910123










[GitHub] [spark] SparkQA removed a comment on pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2020-06-19 Thread GitBox


SparkQA removed a comment on pull request #23531:
URL: https://github.com/apache/spark/pull/23531#issuecomment-646843922


   **[Test build #124299 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124299/testReport)**
 for PR 23531 at commit 
[`69202d5`](https://github.com/apache/spark/commit/69202d5e43224e62cc69178d61bfb7eb1646a708).






[GitHub] [spark] SparkQA commented on pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2020-06-19 Thread GitBox


SparkQA commented on pull request #23531:
URL: https://github.com/apache/spark/pull/23531#issuecomment-646909740


   **[Test build #124299 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124299/testReport)**
 for PR 23531 at commit 
[`69202d5`](https://github.com/apache/spark/commit/69202d5e43224e62cc69178d61bfb7eb1646a708).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] dongjoon-hyun commented on pull request #28848: [SPARK-32003][CORE] Unregister outputs for executor on fetch failure …

2020-06-19 Thread GitBox


dongjoon-hyun commented on pull request #28848:
URL: https://github.com/apache/spark/pull/28848#issuecomment-646909066


   Thank you so much for this contribution, @wypoon and @attilapiros .






[GitHub] [spark] maropu commented on a change in pull request #28853: [SPARK-32019][SQL] Add spark.sql.files.minPartitionNum config

2020-06-19 Thread GitBox


maropu commented on a change in pull request #28853:
URL: https://github.com/apache/spark/pull/28853#discussion_r443085104



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
##
@@ -528,6 +528,41 @@ class FileSourceStrategySuite extends QueryTest with 
SharedSparkSession with Pre
 }
   }
 
+  test("SPARK-32019: Add spark.sql.files.minPartitionNum config") {
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "1") {
+  val table =
+createTable(files = Seq(
+  "file1" -> 1,
+  "file2" -> 1,
+  "file3" -> 1
+))
+  assert(table.rdd.partitions.length == 1)
+}
+
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "10") {
+  val table =
+createTable(files = Seq(
+  "file1" -> 1,
+  "file2" -> 1,
+  "file3" -> 1
+))
+  assert(table.rdd.partitions.length == 3)
+}
+
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "16") {
+  val partitions = (1 to 100).map(i => s"file$i" -> 128*1024*1024)
+  val table = createTable(files = partitions)
+  // partition is limit by filesMaxPartitionBytes(128MB)
+  assert(table.rdd.partitions.length == 100)
+}
+
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "32") {
+  val partitions = (1 to 800).map(i => s"file$i" -> 4*1024*1024)

Review comment:
   ditto








[GitHub] [spark] maropu commented on a change in pull request #28853: [SPARK-32019][SQL] Add spark.sql.files.minPartitionNum config

2020-06-19 Thread GitBox


maropu commented on a change in pull request #28853:
URL: https://github.com/apache/spark/pull/28853#discussion_r443085083



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
##
@@ -528,6 +528,41 @@ class FileSourceStrategySuite extends QueryTest with 
SharedSparkSession with Pre
 }
   }
 
+  test("SPARK-32019: Add spark.sql.files.minPartitionNum config") {
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "1") {
+  val table =
+createTable(files = Seq(
+  "file1" -> 1,
+  "file2" -> 1,
+  "file3" -> 1
+))
+  assert(table.rdd.partitions.length == 1)
+}
+
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "10") {
+  val table =
+createTable(files = Seq(
+  "file1" -> 1,
+  "file2" -> 1,
+  "file3" -> 1
+))
+  assert(table.rdd.partitions.length == 3)
+}
+
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "16") {
+  val partitions = (1 to 100).map(i => s"file$i" -> 128*1024*1024)
+  val table = createTable(files = partitions)
+  // partition is limit by filesMaxPartitionBytes(128MB)

Review comment:
   nit: limit -> limited
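
To make the code comment under review more concrete, here is a small sketch of how a split size could be capped by the maximum partition bytes even when a minimum partition count is requested. It assumes a rule of the form `min(maxPartitionBytes, max(openCostInBytes, totalBytes / minPartitionNum))`; the function name, parameter names, and the 4 MB open cost are assumptions made for this example and are not taken from the diff.

```scala
object MaxSplitBytesSketch {
  // Assumed rule: the per-partition target is total bytes (plus a per-file
  // open cost) divided by the minimum partition count, bounded below by the
  // open cost and above by the maximum partition size.
  def maxSplitBytes(
      fileSizes: Seq[Long],
      maxPartitionBytes: Long,
      openCostInBytes: Long,
      minPartitionNum: Int): Long = {
    val totalBytes = fileSizes.map(_ + openCostInBytes).sum
    val bytesPerPartition = totalBytes / minPartitionNum
    math.min(maxPartitionBytes, math.max(openCostInBytes, bytesPerPartition))
  }

  def main(args: Array[String]): Unit = {
    val mb = 1024L * 1024L
    // 100 files of 128 MB with a minimum of 16 partitions: the per-partition
    // target exceeds 128 MB, so the cap of 128 MB applies.
    val capped = maxSplitBytes(Seq.fill(100)(128 * mb), 128 * mb, 4 * mb, 16)
    println(capped == 128 * mb)
  }
}
```

Under these assumptions each 128 MB file fills one split, which matches the 100 partitions asserted in the test.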








[GitHub] [spark] maropu commented on a change in pull request #28853: [SPARK-32019][SQL] Add spark.sql.files.minPartitionNum config

2020-06-19 Thread GitBox


maropu commented on a change in pull request #28853:
URL: https://github.com/apache/spark/pull/28853#discussion_r443085037



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
##
@@ -528,6 +528,41 @@ class FileSourceStrategySuite extends QueryTest with 
SharedSparkSession with Pre
 }
   }
 
+  test("SPARK-32019: Add spark.sql.files.minPartitionNum config") {
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "1") {
+  val table =
+createTable(files = Seq(
+  "file1" -> 1,
+  "file2" -> 1,
+  "file3" -> 1
+))
+  assert(table.rdd.partitions.length == 1)
+}
+
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "10") {
+  val table =
+createTable(files = Seq(
+  "file1" -> 1,
+  "file2" -> 1,
+  "file3" -> 1
+))
+  assert(table.rdd.partitions.length == 3)
+}
+
+withSQLConf(SQLConf.FILES_MIN_PARTITION_NUM.key -> "16") {
+  val partitions = (1 to 100).map(i => s"file$i" -> 128*1024*1024)

Review comment:
   nit: `128*1024*1024` -> `128 * 1024 * 1024`







