[GitHub] [spark] AmplabJenkins removed a comment on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #27333: URL: https://github.com/apache/spark/pull/27333#issuecomment-675250950 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #27649: URL: https://github.com/apache/spark/pull/27649#issuecomment-675250851

[GitHub] [spark] AmplabJenkins removed a comment on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #25965: URL: https://github.com/apache/spark/pull/25965#issuecomment-675250955

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-675250854

[GitHub] [spark] AmplabJenkins commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #27333: URL: https://github.com/apache/spark/pull/27333#issuecomment-675250942

[GitHub] [spark] AmplabJenkins removed a comment on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #24173: URL: https://github.com/apache/spark/pull/24173#issuecomment-675250987

[GitHub] [spark] AmplabJenkins commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #24173: URL: https://github.com/apache/spark/pull/24173#issuecomment-675250987

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #28363: URL: https://github.com/apache/spark/pull/28363#issuecomment-675250821

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #28422: URL: https://github.com/apache/spark/pull/28422#issuecomment-675250864

[GitHub] [spark] AmplabJenkins commented on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #25965: URL: https://github.com/apache/spark/pull/25965#issuecomment-675250955

[GitHub] [spark] AmplabJenkins commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #28422: URL: https://github.com/apache/spark/pull/28422#issuecomment-675250864

[GitHub] [spark] AmplabJenkins commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-675250854

[GitHub] [spark] AmplabJenkins commented on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #28363: URL: https://github.com/apache/spark/pull/28363#issuecomment-675250821

[GitHub] [spark] AmplabJenkins commented on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #27649: URL: https://github.com/apache/spark/pull/27649#issuecomment-675250851

[GitHub] [spark] HeartSaVioR commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-08-17 Thread GitBox
HeartSaVioR commented on pull request #24173: URL: https://github.com/apache/spark/pull/24173#issuecomment-675250585 retest this, please

[GitHub] [spark] SparkQA commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-17 Thread GitBox
SparkQA commented on pull request #28422: URL: https://github.com/apache/spark/pull/28422#issuecomment-675250414 **[Test build #127531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127531/testReport)** for PR 28422 at commit [`06ee53d`](https://github.com

[GitHub] [spark] HeartSaVioR commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-08-17 Thread GitBox
HeartSaVioR commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-675250260 retest this, please

[GitHub] [spark] SparkQA commented on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-17 Thread GitBox
SparkQA commented on pull request #28363: URL: https://github.com/apache/spark/pull/28363#issuecomment-675250435 **[Test build #127532 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127532/testReport)** for PR 28363 at commit [`b648156`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-17 Thread GitBox
SparkQA commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-675250393 **[Test build #127530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127530/testReport)** for PR 28904 at commit [`e16ebe4`](https://github.com

[GitHub] [spark] HeartSaVioR commented on pull request #27333: [SPARK-29438][SS][FOLLOWUP] Add regression tests for Streaming Aggregation and flatMapGroupsWithState

2020-08-17 Thread GitBox
HeartSaVioR commented on pull request #27333: URL: https://github.com/apache/spark/pull/27333#issuecomment-675250318 retest this, please

[GitHub] [spark] HeartSaVioR commented on pull request #28422: [SPARK-17604][SS] FileStreamSource: provide a new option to have retention on input files

2020-08-17 Thread GitBox
HeartSaVioR commented on pull request #28422: URL: https://github.com/apache/spark/pull/28422#issuecomment-675250207 retest this, please

[GitHub] [spark] HeartSaVioR commented on pull request #27649: [SPARK-30900][SS] FileStreamSource: Avoid reading compact metadata log twice if the query restarts from compact batch

2020-08-17 Thread GitBox
HeartSaVioR commented on pull request #27649: URL: https://github.com/apache/spark/pull/27649#issuecomment-675250291 retest this, please

[GitHub] [spark] HeartSaVioR commented on pull request #25965: [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

2020-08-17 Thread GitBox
HeartSaVioR commented on pull request #25965: URL: https://github.com/apache/spark/pull/25965#issuecomment-675250405 retest this, please

[GitHub] [spark] HeartSaVioR commented on pull request #28363: [SPARK-27188][SS] FileStreamSink: provide a new option to have retention on output files

2020-08-17 Thread GitBox
HeartSaVioR commented on pull request #28363: URL: https://github.com/apache/spark/pull/28363#issuecomment-675250230 retest this, please

[GitHub] [spark] mridulm commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-17 Thread GitBox
mridulm commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-675249974 Queue capacity is an approximation to bound memory usage in event queues; and tends to be conservative - event loss is fine (except DRA) - but driver OOM causes entire app to fa

[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-675248798

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-675248798

[GitHub] [spark] cloud-fan commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-08-17 Thread GitBox
cloud-fan commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r471912000 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand(

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29082: URL: https://github.com/apache/spark/pull/29082#issuecomment-675246305 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29082: URL: https://github.com/apache/spark/pull/29082#issuecomment-675246303 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29082: URL: https://github.com/apache/spark/pull/29082#issuecomment-675246303

[GitHub] [spark] SparkQA removed a comment on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-17 Thread GitBox
SparkQA removed a comment on pull request #29082: URL: https://github.com/apache/spark/pull/29082#issuecomment-675215033 **[Test build #127520 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127520/testReport)** for PR 29082 at commit [`961eae1`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-17 Thread GitBox
SparkQA commented on pull request #29082: URL: https://github.com/apache/spark/pull/29082#issuecomment-675246002 **[Test build #127520 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127520/testReport)** for PR 29082 at commit [`961eae1`](https://github.co

[GitHub] [spark] HeartSaVioR commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-17 Thread GitBox
HeartSaVioR commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r471910255 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSinkLog.scala ## @@ -97,18 +97,15 @@ class FileStreamSinkLog

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29395: URL: https://github.com/apache/spark/pull/29395#issuecomment-675242605

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29456: URL: https://github.com/apache/spark/pull/29456#issuecomment-675242364 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127

[GitHub] [spark] AmplabJenkins commented on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29395: URL: https://github.com/apache/spark/pull/29395#issuecomment-675242605

[GitHub] [spark] SparkQA removed a comment on pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
SparkQA removed a comment on pull request #29456: URL: https://github.com/apache/spark/pull/29456#issuecomment-675232304 **[Test build #127525 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127525/testReport)** for PR 29456 at commit [`8fb651a`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29456: URL: https://github.com/apache/spark/pull/29456#issuecomment-675242360 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA commented on pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
SparkQA commented on pull request #29456: URL: https://github.com/apache/spark/pull/29456#issuecomment-675242331 **[Test build #127525 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127525/testReport)** for PR 29456 at commit [`8fb651a`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29456: URL: https://github.com/apache/spark/pull/29456#issuecomment-675242360

[GitHub] [spark] SparkQA removed a comment on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-17 Thread GitBox
SparkQA removed a comment on pull request #29395: URL: https://github.com/apache/spark/pull/29395#issuecomment-675204497 **[Test build #127519 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127519/testReport)** for PR 29395 at commit [`daa205d`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-17 Thread GitBox
SparkQA commented on pull request #29395: URL: https://github.com/apache/spark/pull/29395#issuecomment-675242028 **[Test build #127519 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127519/testReport)** for PR 29395 at commit [`daa205d`](https://github.co

[GitHub] [spark] cloud-fan commented on a change in pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
cloud-fan commented on a change in pull request #29458: URL: https://github.com/apache/spark/pull/29458#discussion_r471906037 ## File path: docs/sql-migration-guide.md ## @@ -36,6 +36,10 @@ license: | - In Spark 3.1, NULL elements of structures, arrays and maps are convert

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29458: URL: https://github.com/apache/spark/pull/29458#issuecomment-675239740

[GitHub] [spark] AmplabJenkins commented on pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29458: URL: https://github.com/apache/spark/pull/29458#issuecomment-675239740

[GitHub] [spark] SparkQA removed a comment on pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
SparkQA removed a comment on pull request #29458: URL: https://github.com/apache/spark/pull/29458#issuecomment-675237024 **[Test build #127528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127528/testReport)** for PR 29458 at commit [`ac10d8e`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
SparkQA commented on pull request #29458: URL: https://github.com/apache/spark/pull/29458#issuecomment-675239649 **[Test build #127528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127528/testReport)** for PR 29458 at commit [`ac10d8e`](https://github.co

[GitHub] [spark] viirya commented on a change in pull request #29437: [SPARK-32621][SQL] 'path' option can cause issues while inferring schema in CSV/JSON datasources

2020-08-17 Thread GitBox
viirya commented on a change in pull request #29437: URL: https://github.com/apache/spark/pull/29437#discussion_r471903580 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala ## @@ -191,9 +191,11 @@ case class DataSource( val

[GitHub] [spark] dongjoon-hyun commented on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-17 Thread GitBox
dongjoon-hyun commented on pull request #29459: URL: https://github.com/apache/spark/pull/29459#issuecomment-675239069 +1. No problem.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29459: URL: https://github.com/apache/spark/pull/29459#issuecomment-675238805

[GitHub] [spark] AmplabJenkins commented on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29459: URL: https://github.com/apache/spark/pull/29459#issuecomment-675238805

[GitHub] [spark] SparkQA commented on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-17 Thread GitBox
SparkQA commented on pull request #29459: URL: https://github.com/apache/spark/pull/29459#issuecomment-675238599 **[Test build #127529 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127529/testReport)** for PR 29459 at commit [`3bd540f`](https://github.com

[GitHub] [spark] HyukjinKwon commented on pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-17 Thread GitBox
HyukjinKwon commented on pull request #29459: URL: https://github.com/apache/spark/pull/29459#issuecomment-675238629 cc @dongjoon-hyun, @gengliangwang and @viirya, this is very minor stuff. I am preparing to backport GitHub Actions for SPARK-32249 and thought it's less readable. WDYT?

[GitHub] [spark] HyukjinKwon opened a new pull request #29459: [MINOR][INFRA] Rename master.yml to build_and_test.yml

2020-08-17 Thread GitBox
HyukjinKwon opened a new pull request #29459: URL: https://github.com/apache/spark/pull/29459 ### What changes were proposed in this pull request? This PR renames `master.yml` to `build_and_test.yml` to indicate this is the workflow that builds and runs the tests. ### Why are

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29458: URL: https://github.com/apache/spark/pull/29458#issuecomment-675237329

[GitHub] [spark] AmplabJenkins commented on pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29458: URL: https://github.com/apache/spark/pull/29458#issuecomment-675237329

[GitHub] [spark] SparkQA commented on pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
SparkQA commented on pull request #29458: URL: https://github.com/apache/spark/pull/29458#issuecomment-675237024 **[Test build #127528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127528/testReport)** for PR 29458 at commit [`ac10d8e`](https://github.com

[GitHub] [spark] LuciferYang commented on a change in pull request #29434: [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13

2020-08-17 Thread GitBox
LuciferYang commented on a change in pull request #29434: URL: https://github.com/apache/spark/pull/29434#discussion_r471898154 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapData.scala ## @@ -136,7 +136,8 @@ object ArrayBasedMapData

[GitHub] [spark] AmplabJenkins commented on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29457: URL: https://github.com/apache/spark/pull/29457#issuecomment-675235617

[GitHub] [spark] gengliangwang opened a new pull request #29458: [SPARK-32018][FOLLOWUP][Doc] Add migration guide for decimal value overflow in sum aggregation

2020-08-17 Thread GitBox
gengliangwang opened a new pull request #29458: URL: https://github.com/apache/spark/pull/29458 ### What changes were proposed in this pull request? Add migration guide for decimal value overflow behavior in sum aggregation, introduced in https://github.com/apache/spark/pull/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29457: URL: https://github.com/apache/spark/pull/29457#issuecomment-675235617

[GitHub] [spark] SparkQA commented on pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-17 Thread GitBox
SparkQA commented on pull request #29457: URL: https://github.com/apache/spark/pull/29457#issuecomment-675235377 **[Test build #127527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127527/testReport)** for PR 29457 at commit [`2f010de`](https://github.com

[GitHub] [spark] SaurabhChawla100 commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-17 Thread GitBox
SaurabhChawla100 commented on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-675235066 >what I understand from the discussions here is that, let us assume 30,000 is a good number in many cases for a certain workload. But in some scenarios, like slight cha

[GitHub] [spark] viirya opened a new pull request #29457: [SPARK-32646][SQL] ORC predicate pushdown should work with case-insensitive analysis

2020-08-17 Thread GitBox
viirya opened a new pull request #29457: URL: https://github.com/apache/spark/pull/29457 ### What changes were proposed in this pull request? This PR proposes to fix ORC predicate pushdown under case-insensitive analysis case. The field names in pushed down predicates don

[GitHub] [spark] LuciferYang commented on a change in pull request #29434: [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13

2020-08-17 Thread GitBox
LuciferYang commented on a change in pull request #29434: URL: https://github.com/apache/spark/pull/29434#discussion_r471899486 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala ## @@ -206,14 +206,14 @@ object JoinReor

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29437: [SPARK-32621][SQL] 'path' option can cause issues while inferring schema in CSV/JSON datasources

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29437: URL: https://github.com/apache/spark/pull/29437#issuecomment-675234231

[GitHub] [spark] AmplabJenkins commented on pull request #29437: [SPARK-32621][SQL] 'path' option can cause issues while inferring schema in CSV/JSON datasources

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29437: URL: https://github.com/apache/spark/pull/29437#issuecomment-675234231 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29360: URL: https://github.com/apache/spark/pull/29360#issuecomment-675234148 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] LuciferYang commented on a change in pull request #29434: [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13

2020-08-17 Thread GitBox
LuciferYang commented on a change in pull request #29434: URL: https://github.com/apache/spark/pull/29434#discussion_r471898154 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapData.scala ## @@ -136,7 +136,8 @@ object ArrayBasedMapData

[GitHub] [spark] AmplabJenkins commented on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29360: URL: https://github.com/apache/spark/pull/29360#issuecomment-675234148 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #29437: [SPARK-32621][SQL] 'path' option can cause issues while inferring schema in CSV/JSON datasources

2020-08-17 Thread GitBox
SparkQA removed a comment on pull request #29437: URL: https://github.com/apache/spark/pull/29437#issuecomment-675168006 **[Test build #127514 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127514/testReport)** for PR 29437 at commit [`b7f4ff6`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-08-17 Thread GitBox
SparkQA commented on pull request #29360: URL: https://github.com/apache/spark/pull/29360#issuecomment-675233817 **[Test build #127526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127526/testReport)** for PR 29360 at commit [`87b9a82`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #29437: [SPARK-32621][SQL] 'path' option can cause issues while inferring schema in CSV/JSON datasources

2020-08-17 Thread GitBox
SparkQA commented on pull request #29437: URL: https://github.com/apache/spark/pull/29437#issuecomment-675233731 **[Test build #127514 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127514/testReport)** for PR 29437 at commit [`b7f4ff6`](https://github.co

[GitHub] [spark] LuciferYang commented on a change in pull request #29434: [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13

2020-08-17 Thread GitBox
LuciferYang commented on a change in pull request #29434: URL: https://github.com/apache/spark/pull/29434#discussion_r471898154 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapData.scala ## @@ -136,7 +136,8 @@ object ArrayBasedMapData

[GitHub] [spark] AmplabJenkins commented on pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29456: URL: https://github.com/apache/spark/pull/29456#issuecomment-675232554 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29456: URL: https://github.com/apache/spark/pull/29456#issuecomment-675232554 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
SparkQA commented on pull request #29456: URL: https://github.com/apache/spark/pull/29456#issuecomment-675232304 **[Test build #127525 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127525/testReport)** for PR 29456 at commit [`8fb651a`](https://github.com

[GitHub] [spark] HyukjinKwon opened a new pull request #29456: [WIP][SPARK-32647][INFRA] Report SparkR test results with JUnit reporter

2020-08-17 Thread GitBox
HyukjinKwon opened a new pull request #29456: URL: https://github.com/apache/spark/pull/29456 ### What changes were proposed in this pull request? This PR proposes to generate JUnit XML test report in SparkR tests that can be leveraged in both Jenkins and GitHub Actions. **Git

[GitHub] [spark] agrawaldevesh edited a comment on pull request #29452: [SPARK-32643] Consolidate state decommissioning in the TaskSchedulerImpl realm

2020-08-17 Thread GitBox
agrawaldevesh edited a comment on pull request #29452: URL: https://github.com/apache/spark/pull/29452#issuecomment-675222880 cc: @holdenk and @prakharjain09 ... This PR simply does some state cleanup/consolidation without making any semantic changes. I would be grateful for your review. I

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29455: [SPARK-32644][SQL] NAAJ support for ShuffleHashJoin when AQE is on

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #29455: URL: https://github.com/apache/spark/pull/29455#issuecomment-675230669 Can one of the admins verify this patch? This is an automated message from the Apache Git Service.

[GitHub] [spark] AmplabJenkins commented on pull request #29455: [SPARK-32644][SQL] NAAJ support for ShuffleHashJoin when AQE is on

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29455: URL: https://github.com/apache/spark/pull/29455#issuecomment-675230945 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To resp

[GitHub] [spark] AmplabJenkins commented on pull request #29455: [SPARK-32644][SQL] NAAJ support for ShuffleHashJoin when AQE is on

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #29455: URL: https://github.com/apache/spark/pull/29455#issuecomment-675230669 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To resp

[GitHub] [spark] leanken commented on pull request #29455: [SPARK-32644][SQL] NAAJ support for ShuffleHashJoin when AQE is on

2020-08-17 Thread GitBox
leanken commented on pull request #29455: URL: https://github.com/apache/spark/pull/29455#issuecomment-675230068 @cloud-fan could you please have a look at this PR, many thanks. This is an automated message from the Apache Gi

[GitHub] [spark] leanken opened a new pull request #29455: [SPARK-32644][SQL] NAAJ support for ShuffleHashJoin when AQE is on

2020-08-17 Thread GitBox
leanken opened a new pull request #29455: URL: https://github.com/apache/spark/pull/29455 ### What changes were proposed in this pull request? In [SPARK-32290](https://issues.apache.org/jira/browse/SPARK-32290), we managed to optimize NAAJ scenario from BNLJ to BHJ, but skipped the check

[GitHub] [spark] HarborZeng commented on pull request #16486: [SPARK-13610][ML] Create a Transformer to disassemble vectors in Data…

2020-08-17 Thread GitBox
HarborZeng commented on pull request #16486: URL: https://github.com/apache/spark/pull/16486#issuecomment-675229356 such a great transformer, don't understand why they chose to ignore this patch. This is an automated message

[GitHub] [spark] huaxingao commented on pull request #29396: [SPARK-32579][SQL] Implement JDBCScan/ScanBuilder/WriteBuilder

2020-08-17 Thread GitBox
huaxingao commented on pull request #29396: URL: https://github.com/apache/spark/pull/29396#issuecomment-675229296 @cloud-fan Could you please take a look when you have time? Most of this was taken from your PR. Thanks a lot! ---

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-17 Thread GitBox
AmplabJenkins removed a comment on pull request #28841: URL: https://github.com/apache/spark/pull/28841#issuecomment-675227697 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-17 Thread GitBox
AmplabJenkins commented on pull request #28841: URL: https://github.com/apache/spark/pull/28841#issuecomment-675227697 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] Ngone51 commented on pull request #29452: [SPARK-32643] Consolidate state decommissioning in the TaskSchedulerImpl realm

2020-08-17 Thread GitBox
Ngone51 commented on pull request #29452: URL: https://github.com/apache/spark/pull/29452#issuecomment-675227670 @agrawaldevesh Could you please add the `[CORE]` tag in the PR title like other PRs? This is an automated messa

[GitHub] [spark] SparkQA commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-17 Thread GitBox
SparkQA commented on pull request #28841: URL: https://github.com/apache/spark/pull/28841#issuecomment-675227409 **[Test build #127524 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127524/testReport)** for PR 28841 at commit [`263dd2a`](https://github.com

[GitHub] [spark] Ngone51 commented on a change in pull request #29454: [SPARK-32645][INFRA] Upload unit-tests.log as an artifact

2020-08-17 Thread GitBox
Ngone51 commented on a change in pull request #29454: URL: https://github.com/apache/spark/pull/29454#discussion_r471891073 ## File path: .github/workflows/master.yml ## @@ -183,6 +183,12 @@ jobs: with: name: test-results-${{ matrix.modules }}-${{ matrix.comment
