[GitHub] [spark] huaxingao commented on a change in pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
huaxingao commented on a change in pull request #29396: URL: https://github.com/apache/spark/pull/29396#discussion_r468343334 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala ## @@ -64,11 +64,14 @@ class

[GitHub] [spark] c21 commented on pull request #29342: [SPARK-32399][SQL] Full outer shuffled hash join

2020-08-10 Thread GitBox
c21 commented on pull request #29342: URL: https://github.com/apache/spark/pull/29342#issuecomment-671742514 @cloud-fan, @agrawaldevesh, @maropu and @viirya - I took a closer look inside `BytesToBytesMap.java`, and found it would probably be hard / hacky to get the key index when

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-671741732 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA removed a comment on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-08-10 Thread GitBox
SparkQA removed a comment on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-671696614 **[Test build #127304 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127304/testReport)** for PR 29276 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-671741724 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins commented on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-671741724 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-08-10 Thread GitBox
SparkQA commented on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-671741096 **[Test build #127304 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127304/testReport)** for PR 29276 at commit

[GitHub] [spark] c21 edited a comment on pull request #29342: [SPARK-32399][SQL] Full outer shuffled hash join

2020-08-10 Thread GitBox
c21 edited a comment on pull request #29342: URL: https://github.com/apache/spark/pull/29342#issuecomment-671723199 ~~@cloud-fan - sorry if I missed anything, could you elaborate more on~~ ~~> We don't need to get the value index. We can calculate it by ourselves~~ ~~How do we

[GitHub] [spark] SparkQA removed a comment on pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
SparkQA removed a comment on pull request #29402: URL: https://github.com/apache/spark/pull/29402#issuecomment-671727996 **[Test build #127307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127307/testReport)** for PR 29402 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29402: URL: https://github.com/apache/spark/pull/29402#issuecomment-671730938 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29402: URL: https://github.com/apache/spark/pull/29402#issuecomment-671730938 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
SparkQA commented on pull request #29402: URL: https://github.com/apache/spark/pull/29402#issuecomment-671730864 **[Test build #127307 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127307/testReport)** for PR 29402 at commit

[GitHub] [spark] imback82 commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
imback82 commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468330472 ## File path: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala ## @@ -395,7 +395,7 @@ class

[GitHub] [spark] imback82 commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
imback82 commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468330253 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ## @@ -155,7 +155,7 @@ object

[GitHub] [spark] HyukjinKwon commented on pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
HyukjinKwon commented on pull request #29402: URL: https://github.com/apache/spark/pull/29402#issuecomment-671729038 cc @viirya can you take a quick look when you're available?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29402: URL: https://github.com/apache/spark/pull/29402#issuecomment-671728329 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29402: URL: https://github.com/apache/spark/pull/29402#issuecomment-671728329 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
SparkQA commented on pull request #29402: URL: https://github.com/apache/spark/pull/29402#issuecomment-671727996 **[Test build #127307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127307/testReport)** for PR 29402 at commit

[GitHub] [spark] HyukjinKwon opened a new pull request #29402: [SPARK-32584][PYTHON][DOCS] Exclude _images and _sources that are generated by Sphinx in Jekyll build

2020-08-10 Thread GitBox
HyukjinKwon opened a new pull request #29402: URL: https://github.com/apache/spark/pull/29402 ### What changes were proposed in this pull request? This PR proposes to `include` `_images` and `_sources` directories in Jekyll build. For the `_images` directory, after

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671725265 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671725265 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
SparkQA removed a comment on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671657035 **[Test build #127297 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127297/testReport)** for PR 29396 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
cloud-fan commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468326123 ## File path: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala ## @@ -395,7 +395,7 @@ class

[GitHub] [spark] SparkQA commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
SparkQA commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671724764 **[Test build #127297 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127297/testReport)** for PR 29396 at commit

[GitHub] [spark] c21 commented on pull request #29342: [SPARK-32399][SQL] Full outer shuffled hash join

2020-08-10 Thread GitBox
c21 commented on pull request #29342: URL: https://github.com/apache/spark/pull/29342#issuecomment-671723199 @cloud-fan - sorry if I missed anything, could you elaborate more on > We don't need to get the value index. We can calculate it by ourselves How do we calculate value

[GitHub] [spark] cloud-fan commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
cloud-fan commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468324590 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ## @@ -155,7 +155,7 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
cloud-fan commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468324072 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -245,12 +245,19 @@ class DataFrameReader

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29367: [SPARK-31198][CORE] Use graceful decommissioning as part of dynamic scaling

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29367: URL: https://github.com/apache/spark/pull/29367#issuecomment-671721882 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] imback82 commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
imback82 commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468286947 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ## @@ -155,7 +155,7 @@ object

[GitHub] [spark] imback82 commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
imback82 commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468295791 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ## @@ -155,7 +155,7 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
cloud-fan commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468323610 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -245,12 +245,19 @@ class DataFrameReader

[GitHub] [spark] AmplabJenkins commented on pull request #29367: [SPARK-31198][CORE] Use graceful decommissioning as part of dynamic scaling

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29367: URL: https://github.com/apache/spark/pull/29367#issuecomment-671721882 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29367: [SPARK-31198][CORE] Use graceful decommissioning as part of dynamic scaling

2020-08-10 Thread GitBox
SparkQA commented on pull request #29367: URL: https://github.com/apache/spark/pull/29367#issuecomment-671721570 **[Test build #127306 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127306/testReport)** for PR 29367 at commit

[GitHub] [spark] imback82 commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
imback82 commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468323154 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -245,6 +246,10 @@ class DataFrameReader

[GitHub] [spark] cloud-fan commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
cloud-fan commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468323149 ## File path: docs/sql-migration-guide.md ## @@ -36,6 +36,8 @@ license: | - In Spark 3.1, NULL elements of structures, arrays and maps are

[GitHub] [spark] cloud-fan commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
cloud-fan commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468322696 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -245,6 +246,10 @@ class DataFrameReader

[GitHub] [spark] cloud-fan commented on pull request #29342: [SPARK-32399][SQL] Full outer shuffled hash join

2020-08-10 Thread GitBox
cloud-fan commented on pull request #29342: URL: https://github.com/apache/spark/pull/29342#issuecomment-671719927 A few more thoughts: 1. For the `keyIsUnique` code path, we know it's one key, one value, so I think we can still use a bitset. 2. We don't need to get the value index. We can

[GitHub] [spark] imback82 commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
imback82 commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468320706 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -245,6 +246,10 @@ class DataFrameReader

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29389: [SPARK-32573][SQL] Anti Join Improvement with EmptyHashedRelation and EmptyHashedRelationWithAllNullKeys

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29389: URL: https://github.com/apache/spark/pull/29389#issuecomment-671718149 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29389: [SPARK-32573][SQL] Anti Join Improvement with EmptyHashedRelation and EmptyHashedRelationWithAllNullKeys

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29389: URL: https://github.com/apache/spark/pull/29389#issuecomment-671718149 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] leanken commented on pull request #29389: [SPARK-32573][SQL] Anti Join Improvement with EmptyHashedRelation and EmptyHashedRelationWithAllNullKeys

2020-08-10 Thread GitBox
leanken commented on pull request #29389: URL: https://github.com/apache/spark/pull/29389#issuecomment-671718124 @cloud-fan Test passed, ready to merge.

[GitHub] [spark] SparkQA removed a comment on pull request #29389: [SPARK-32573][SQL] Anti Join Improvement with EmptyHashedRelation and EmptyHashedRelationWithAllNullKeys

2020-08-10 Thread GitBox
SparkQA removed a comment on pull request #29389: URL: https://github.com/apache/spark/pull/29389#issuecomment-671646391 **[Test build #127296 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127296/testReport)** for PR 29389 at commit

[GitHub] [spark] SparkQA commented on pull request #29389: [SPARK-32573][SQL] Anti Join Improvement with EmptyHashedRelation and EmptyHashedRelationWithAllNullKeys

2020-08-10 Thread GitBox
SparkQA commented on pull request #29389: URL: https://github.com/apache/spark/pull/29389#issuecomment-671717620 **[Test build #127296 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127296/testReport)** for PR 29389 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671716088 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671716088 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
SparkQA commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671715831 **[Test build #127305 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127305/testReport)** for PR 29396 at commit

[GitHub] [spark] gatorsmile commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-10 Thread GitBox
gatorsmile commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r468306894 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala ## @@ -160,8 +152,42 @@ class HDFSMetadataLog[T

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671704714 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671704714 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
SparkQA removed a comment on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671694607 **[Test build #127303 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127303/testReport)** for PR 28991 at commit

[GitHub] [spark] SparkQA commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
SparkQA commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671704534 **[Test build #127303 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127303/testReport)** for PR 28991 at commit

[GitHub] [spark] gatorsmile commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-10 Thread GitBox
gatorsmile commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r468306410 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSinkLog.scala ## @@ -97,18 +97,13 @@ class

[GitHub] [spark] gatorsmile commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-10 Thread GitBox
gatorsmile commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r468305462 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala ## @@ -132,12 +130,18 @@ abstract class

[GitHub] [spark] gatorsmile commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-10 Thread GitBox
gatorsmile commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r468305424 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala ## @@ -132,12 +130,18 @@ abstract class

[GitHub] [spark] cloud-fan commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
cloud-fan commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468305334 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -245,6 +246,10 @@ class DataFrameReader

[GitHub] [spark] gatorsmile commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-10 Thread GitBox
gatorsmile commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r468305223 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala ## @@ -173,37 +177,64 @@ abstract class

[GitHub] [spark] gatorsmile commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-10 Thread GitBox
gatorsmile commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r468303420 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala ## @@ -173,37 +177,64 @@ abstract class

[GitHub] [spark] cloud-fan closed pull request #29031: [SPARK-32216][SQL] Remove redundant ProjectExec

2020-08-10 Thread GitBox
cloud-fan closed pull request #29031: URL: https://github.com/apache/spark/pull/29031

[GitHub] [spark] cloud-fan commented on pull request #29031: [SPARK-32216][SQL] Remove redundant ProjectExec

2020-08-10 Thread GitBox
cloud-fan commented on pull request #29031: URL: https://github.com/apache/spark/pull/29031#issuecomment-671699494 thanks, merging to master!

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-671696968 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-671696968 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-08-10 Thread GitBox
SparkQA commented on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-671696614 **[Test build #127304 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127304/testReport)** for PR 29276 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671694990 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671695012 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671695006 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29369: [SPARK-32540][SQL] Eliminate the filter clause in aggregate

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29369: URL: https://github.com/apache/spark/pull/29369#issuecomment-671694921 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA removed a comment on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
SparkQA removed a comment on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671694555 **[Test build #127301 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127301/testReport)** for PR 29396 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671694990 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #29369: [SPARK-32540][SQL] Eliminate the filter clause in aggregate

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29369: URL: https://github.com/apache/spark/pull/29369#issuecomment-671694921 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
SparkQA commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671694997 **[Test build #127301 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127301/testReport)** for PR 29396 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671695006 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29369: [SPARK-32540][SQL] Eliminate the filter clause in aggregate

2020-08-10 Thread GitBox
SparkQA commented on pull request #29369: URL: https://github.com/apache/spark/pull/29369#issuecomment-671694575 **[Test build #127302 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127302/testReport)** for PR 29369 at commit

[GitHub] [spark] SparkQA commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
SparkQA commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671694555 **[Test build #127301 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127301/testReport)** for PR 29396 at commit

[GitHub] [spark] SparkQA commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
SparkQA commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671694607 **[Test build #127303 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127303/testReport)** for PR 28991 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671692831 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] imback82 commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
imback82 commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468295791 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ## @@ -155,7 +155,7 @@ object

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671692976 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] huaxingao commented on a change in pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
huaxingao commented on a change in pull request #29396: URL: https://github.com/apache/spark/pull/29396#discussion_r468295410 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTableCatalogSuite.scala ## @@ -107,6 +111,41 @@ class

[GitHub] [spark] AmplabJenkins commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671692976 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
SparkQA removed a comment on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671690512 **[Test build #127300 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127300/testReport)** for PR 28991 at commit

[GitHub] [spark] huaxingao commented on a change in pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
huaxingao commented on a change in pull request #29396: URL: https://github.com/apache/spark/pull/29396#discussion_r468295245 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTable.scala ## @@ -34,4 +38,16 @@ case class

[GitHub] [spark] huaxingao commented on a change in pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
huaxingao commented on a change in pull request #29396: URL: https://github.com/apache/spark/pull/29396#discussion_r468295282 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCWriteBuilder.scala ## @@ -0,0 +1,47 @@ +/* + * Licensed

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671692824 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] huaxingao commented on a change in pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
huaxingao commented on a change in pull request #29396: URL: https://github.com/apache/spark/pull/29396#discussion_r468295197 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCScanBuilder.scala ## @@ -0,0 +1,61 @@ +/* + * Licensed to

[GitHub] [spark] SparkQA commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
SparkQA commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671692811 **[Test build #127300 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127300/testReport)** for PR 28991 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671692824 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] huaxingao commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-10 Thread GitBox
huaxingao commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-671692513 > Could you add tests for write since you supported WriteBuilder in this PR. I will take JDBCV2Suite from @cloud-fan 's https://github.com/apache/spark/pull/27345. I will

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671691009 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29401: [SPARK-32400][SQL] Improve test coverage of HiveScriptTransformationExec

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29401: URL: https://github.com/apache/spark/pull/29401#issuecomment-671690941 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671691009 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #29401: [SPARK-32400][SQL] Improve test coverage of HiveScriptTransformationExec

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29401: URL: https://github.com/apache/spark/pull/29401#issuecomment-671690941 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-08-10 Thread GitBox
SparkQA commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-671690512 **[Test build #127300 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127300/testReport)** for PR 28991 at commit

[GitHub] [spark] SparkQA commented on pull request #29401: [SPARK-32400][SQL] Improve test coverage of HiveScriptTransformationExec

2020-08-10 Thread GitBox
SparkQA commented on pull request #29401: URL: https://github.com/apache/spark/pull/29401#issuecomment-671690473 **[Test build #127299 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127299/testReport)** for PR 29401 at commit

[GitHub] [spark] AngersZhuuuu opened a new pull request #29401: [SPARK-32400][SQL] Improve test coverage of HiveScriptTransformationExec

2020-08-10 Thread GitBox
AngersZhuuuu opened a new pull request #29401: URL: https://github.com/apache/spark/pull/29401 ### What changes were proposed in this pull request? 1. Extract common test case (no serde) to BasicScriptTransformationExecSuite 2. Add more test case for no serde mode about supported

[GitHub] [spark] SparkQA commented on pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
SparkQA commented on pull request #29328: URL: https://github.com/apache/spark/pull/29328#issuecomment-671685821 **[Test build #127298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127298/testReport)** for PR 29328 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29328: URL: https://github.com/apache/spark/pull/29328#issuecomment-671684138 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
AmplabJenkins commented on pull request #29328: URL: https://github.com/apache/spark/pull/29328#issuecomment-671684138 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] imback82 commented on a change in pull request #29328: [SPARK-32516][SQL] 'path' option cannot co-exist with load()'s path parameters

2020-08-10 Thread GitBox
imback82 commented on a change in pull request #29328: URL: https://github.com/apache/spark/pull/29328#discussion_r468286947 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala ## @@ -155,7 +155,7 @@ object

[GitHub] [spark] AngersZhuuuu commented on pull request #28490: [SPARK-31670][SQL]Resolve Struct Field in Grouping Aggregate with same ExprId

2020-08-10 Thread GitBox
AngersZhuuuu commented on pull request #28490: URL: https://github.com/apache/spark/pull/28490#issuecomment-671682512 Any update for this? cc @cloud-fan @viirya @HyukjinKwon This is an automated message from the Apache Git
