[GitHub] [spark] xuanyuanking edited a comment on pull request #28707: [SPARK-31894][SS] Introduce UnsafeRow format validation for streaming state store

2020-06-14 Thread GitBox
xuanyuanking edited a comment on pull request #28707: URL: https://github.com/apache/spark/pull/28707#issuecomment-643916110 cc @maropu @gatorsmile @HeartSaVioR @dongjoon-hyun A new regression bug SPARK-31990 was found when investigating the test failure

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28619: URL: https://github.com/apache/spark/pull/28619#issuecomment-643916951 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] xuanyuanking commented on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
xuanyuanking commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643916855 ``` How do we plan to consolidate both? How will we write the JIRA title/description and PR title/description? What is the type of the consolidated issue? Is the consolidated

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28829: [WIP][SQL] Benchmark the EXCEPTION rebase mode

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28829: URL: https://github.com/apache/spark/pull/28829#issuecomment-643916877

[GitHub] [spark] SparkQA commented on pull request #28829: [WIP][SQL] Benchmark the EXCEPTION rebase mode

2020-06-14 Thread GitBox
SparkQA commented on pull request #28829: URL: https://github.com/apache/spark/pull/28829#issuecomment-643916564 **[Test build #124033 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124033/testReport)** for PR 28829 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28829: [WIP][SQL] Benchmark the EXCEPTION rebase mode

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28829: URL: https://github.com/apache/spark/pull/28829#issuecomment-643916882 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] HyukjinKwon commented on pull request #17953: [SPARK-20680][SQL] Spark-sql do not support for void column datatype …

2020-06-14 Thread GitBox
HyukjinKwon commented on pull request #17953: URL: https://github.com/apache/spark/pull/17953#issuecomment-643916503 Yeah .. I personally support this change FWIW.

[GitHub] [spark] AmplabJenkins commented on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28619: URL: https://github.com/apache/spark/pull/28619#issuecomment-643916951

[GitHub] [spark] SparkQA commented on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-06-14 Thread GitBox
SparkQA commented on pull request #28619: URL: https://github.com/apache/spark/pull/28619#issuecomment-643916615 **[Test build #124034 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124034/testReport)** for PR 28619 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #27805: [SPARK-31056][SQL] Add CalendarIntervals division

2020-06-14 Thread GitBox
HyukjinKwon commented on pull request #27805: URL: https://github.com/apache/spark/pull/27805#issuecomment-643915859 Do we have an answer to https://github.com/apache/spark/pull/27805#issuecomment-635381702? It's easier to justify with actual references and/or standards.

[GitHub] [spark] xuanyuanking commented on pull request #28707: [SPARK-31894][SS] Introduce UnsafeRow format validation for streaming state store

2020-06-14 Thread GitBox
xuanyuanking commented on pull request #28707: URL: https://github.com/apache/spark/pull/28707#issuecomment-643916110 A new regression bug SPARK-31990 was found when investigating the test failure https://github.com/apache/spark/pull/28707#issuecomment-639861273. The root cause is that

[GitHub] [spark] MaxGekk commented on pull request #28829: [WIP][SQL] Benchmark the EXCEPTION rebase mode

2020-06-14 Thread GitBox
MaxGekk commented on pull request #28829: URL: https://github.com/apache/spark/pull/28829#issuecomment-643915417 jenkins, retest this, please

[GitHub] [spark] Ngone51 commented on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-06-14 Thread GitBox
Ngone51 commented on pull request #28619: URL: https://github.com/apache/spark/pull/28619#issuecomment-643915676 retest this please.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28642: [SPARK-31809][SQL] Infer IsNotNull for non null intolerant child of null intolerant in join condition

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28642: URL: https://github.com/apache/spark/pull/28642#issuecomment-643914834

[GitHub] [spark] HyukjinKwon commented on a change in pull request #28642: [SPARK-31809][SQL] Infer IsNotNull for non null intolerant child of null intolerant in join condition

2020-06-14 Thread GitBox
HyukjinKwon commented on a change in pull request #28642: URL: https://github.com/apache/spark/pull/28642#discussion_r439940687 ## File path: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala ## @@ -1039,7 +1039,7 @@ class JoinSuite extends QueryTest with

[GitHub] [spark] AmplabJenkins commented on pull request #28642: [SPARK-31809][SQL] Infer IsNotNull for non null intolerant child of null intolerant in join condition

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28642: URL: https://github.com/apache/spark/pull/28642#issuecomment-643914834

[GitHub] [spark] SparkQA commented on pull request #28642: [SPARK-31809][SQL] Infer IsNotNull for non null intolerant child of null intolerant in join condition

2020-06-14 Thread GitBox
SparkQA commented on pull request #28642: URL: https://github.com/apache/spark/pull/28642#issuecomment-643914470 **[Test build #124032 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124032/testReport)** for PR 28642 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #28642: [SPARK-31809][SQL] Infer IsNotNull for non null intolerant child of null intolerant in join condition

2020-06-14 Thread GitBox
HyukjinKwon commented on pull request #28642: URL: https://github.com/apache/spark/pull/28642#issuecomment-643913716 retest this please

[GitHub] [spark] Ngone51 commented on pull request #28801: [SPARK-31970][CORE] Make MDC configuration step be consistent between setLocalProperty and log4j.properties

2020-06-14 Thread GitBox
Ngone51 commented on pull request #28801: URL: https://github.com/apache/spark/pull/28801#issuecomment-643912320 thanks all!!

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643909975 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643909967 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643909975 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124026/

[GitHub] [spark] SparkQA removed a comment on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643877230 **[Test build #124026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124026/testReport)** for PR 27604 at commit

[GitHub] [spark] SparkQA commented on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
SparkQA commented on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643909627 **[Test build #124026 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124026/testReport)** for PR 27604 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #28828: [SPARK-24634][SS][FOLLOWUP] Rename the variable from "numLateInputs" to "numDropppedRowsByWatermark"

2020-06-14 Thread GitBox
HyukjinKwon commented on pull request #28828: URL: https://github.com/apache/spark/pull/28828#issuecomment-643906549 @xuanyuanking too FYI

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643904439 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643904434 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA removed a comment on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643865812 **[Test build #124024 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124024/testReport)** for PR 28821 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643904434

[GitHub] [spark] SparkQA commented on pull request #28821: [SPARK-31981][SQL] Keep TimestampType when taking an average of a Timestamp

2020-06-14 Thread GitBox
SparkQA commented on pull request #28821: URL: https://github.com/apache/spark/pull/28821#issuecomment-643904220 **[Test build #124024 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124024/testReport)** for PR 28821 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28807: URL: https://github.com/apache/spark/pull/28807#issuecomment-643899506

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28807: URL: https://github.com/apache/spark/pull/28807#issuecomment-643899506

[GitHub] [spark] maropu commented on a change in pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
maropu commented on a change in pull request #28807: URL: https://github.com/apache/spark/pull/28807#discussion_r439927771 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala ## @@ -388,12 +396,24 @@ class

[GitHub] [spark] SparkQA commented on pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
SparkQA commented on pull request #28807: URL: https://github.com/apache/spark/pull/28807#issuecomment-643899210 **[Test build #124031 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124031/testReport)** for PR 28807 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643897872

[GitHub] [spark] AmplabJenkins commented on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643897872

[GitHub] [spark] SparkQA commented on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
SparkQA commented on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643897635 **[Test build #124030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124030/testReport)** for PR 28710 at commit

[GitHub] [spark] huaxingao commented on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
huaxingao commented on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643896578 retest this please

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28593: URL: https://github.com/apache/spark/pull/28593#issuecomment-643892810

[GitHub] [spark] AmplabJenkins commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28593: URL: https://github.com/apache/spark/pull/28593#issuecomment-643892810

[GitHub] [spark] SparkQA commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
SparkQA commented on pull request #28593: URL: https://github.com/apache/spark/pull/28593#issuecomment-643892530 **[Test build #124029 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124029/testReport)** for PR 28593 at commit

[GitHub] [spark] dongjoon-hyun commented on pull request #24922: [SPARK-28120][SS] Rocksdb state storage implementation

2020-06-14 Thread GitBox
dongjoon-hyun commented on pull request #24922: URL: https://github.com/apache/spark/pull/24922#issuecomment-643892244 Thank you for the update, @itsvikramagr .

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643891541 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643891538 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA removed a comment on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643855623 **[Test build #124021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124021/testReport)** for PR 28710 at commit

[GitHub] [spark] SparkQA commented on pull request #28710: [SPARK-31893][ML] Add a generic ClassificationSummary trait

2020-06-14 Thread GitBox
SparkQA commented on pull request #28710: URL: https://github.com/apache/spark/pull/28710#issuecomment-643891334 **[Test build #124021 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124021/testReport)** for PR 28710 at commit

[GitHub] [spark] HeartSaVioR edited a comment on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
HeartSaVioR edited a comment on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-64318 How do we plan to consolidate both? How will we write the JIRA title/description and PR title/description? What is the type of the consolidated issue? Is the consolidated

[GitHub] [spark] GuoPhilipse commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
GuoPhilipse commented on pull request #28593: URL: https://github.com/apache/spark/pull/28593#issuecomment-643890774 It was generated by the set command; we have now removed it.

[GitHub] [spark] HeartSaVioR edited a comment on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-06-14 Thread GitBox
HeartSaVioR edited a comment on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-643878976 I’m sorry, but version 4 doesn’t leverage UnsafeRow. (Version 2 did.) Please read the description thoughtfully. As I commented earlier, there are still lots of

[GitHub] [spark] HeartSaVioR commented on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
HeartSaVioR commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-64318 How do we plan to consolidate both? How will we write the JIRA title/description and PR title/description? What is the type of the consolidated issue? Is the consolidated issue a

[GitHub] [spark] AmplabJenkins commented on pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #27690: URL: https://github.com/apache/spark/pull/27690#issuecomment-643887374

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #27690: URL: https://github.com/apache/spark/pull/27690#issuecomment-643887374

[GitHub] [spark] moomindani commented on a change in pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage

2020-06-14 Thread GitBox
moomindani commented on a change in pull request #27690: URL: https://github.com/apache/spark/pull/27690#discussion_r439917190 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala ## @@ -124,11 +153,24 @@ private[hive] trait

[GitHub] [spark] SparkQA commented on pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage

2020-06-14 Thread GitBox
SparkQA commented on pull request #27690: URL: https://github.com/apache/spark/pull/27690#issuecomment-643887119 **[Test build #124028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124028/testReport)** for PR 27690 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28786: [SPARK-31925][ML] Summary.totalIterations greater than maxIters

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28786: URL: https://github.com/apache/spark/pull/28786#issuecomment-643885908

[GitHub] [spark] SparkQA removed a comment on pull request #28786: [SPARK-31925][ML] Summary.totalIterations greater than maxIters

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28786: URL: https://github.com/apache/spark/pull/28786#issuecomment-643867351 **[Test build #124025 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124025/testReport)** for PR 28786 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28786: [SPARK-31925][ML] Summary.totalIterations greater than maxIters

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28786: URL: https://github.com/apache/spark/pull/28786#issuecomment-643885908

[GitHub] [spark] SparkQA commented on pull request #28786: [SPARK-31925][ML] Summary.totalIterations greater than maxIters

2020-06-14 Thread GitBox
SparkQA commented on pull request #28786: URL: https://github.com/apache/spark/pull/28786#issuecomment-643885633 **[Test build #124025 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124025/testReport)** for PR 28786 at commit

[GitHub] [spark] maropu commented on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
maropu commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643885408 > Thanks for the quick fix @maropu! I think maybe we can simplify the bugfix by combining it together with #28707. WDYT? I'll also reference this PR with #28707.

[GitHub] [spark] moomindani commented on a change in pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage

2020-06-14 Thread GitBox
moomindani commented on a change in pull request #27690: URL: https://github.com/apache/spark/pull/27690#discussion_r439913882 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -839,6 +839,17 @@ object SQLConf {

[GitHub] [spark] moomindani commented on a change in pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage

2020-06-14 Thread GitBox
moomindani commented on a change in pull request #27690: URL: https://github.com/apache/spark/pull/27690#discussion_r439913383 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -839,6 +839,17 @@ object SQLConf {

[GitHub] [spark] AmplabJenkins commented on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643882434

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643882434

[GitHub] [spark] SparkQA commented on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-14 Thread GitBox
SparkQA commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643882321 **[Test build #124027 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124027/testReport)** for PR 28830 at commit

[GitHub] [spark] HeartSaVioR edited a comment on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
HeartSaVioR edited a comment on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643880332 +1 to a partial revert, which should also be OK with the author. (I guess it was applied simply by pattern, and it wasn’t for some intended improvement, so no problem for

[GitHub] [spark] HeartSaVioR commented on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
HeartSaVioR commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643880332 +1 to a partial revert, which should also be OK with the author. (I guess it was applied simply by pattern, and it wasn’t for some outstanding improvement, so no problem for

[GitHub] [spark] xuanyuanking commented on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
xuanyuanking commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643880347 Yep, I think just reverting that part is good enough. I will give more context and details on #28707.

[GitHub] [spark] dongjoon-hyun commented on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
dongjoon-hyun commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643880429 Ya. +1 for partial revert in this PR.

[GitHub] [spark] xuanyuanking commented on a change in pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
xuanyuanking commented on a change in pull request #28830: URL: https://github.com/apache/spark/pull/28830#discussion_r439910372 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ## @@ -2548,6 +2548,21 @@ class DataFrameSuite extends QueryTest

[GitHub] [spark] gatorsmile commented on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
gatorsmile commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643879059 Yes. I prefer reverting the original fix in 3.0.1 and then discussing how to solve/avoid the problems in a proper way.

[GitHub] [spark] maropu commented on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
maropu commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643879180 okay, I'll revert that part in this PR first.

[GitHub] [spark] HeartSaVioR commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-06-14 Thread GitBox
HeartSaVioR commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-643878976 I’m sorry, but version 4 doesn’t leverage UnsafeRow. (Version 2 did.) Please read the description thoughtfully. As I commented earlier there’re still lots of possible

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643878606 Hi, All. This issue is marked as a hotfix for the blocker issue, but the validation of this issue looks non-trivial. Since `toSet.toSeq` is used since Apache

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643878606

[GitHub] [spark] dongjoon-hyun commented on pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
dongjoon-hyun commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643878606 Hi, All. This issue is marked as a hotfix for the blocker issue, but the validation of this issue looks non-trivial. Since `toSet.toSeq` is used since Apache Spark

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643877494

[GitHub] [spark] AmplabJenkins commented on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643877494

[GitHub] [spark] uncleGen commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-06-14 Thread GitBox
uncleGen commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-643877261 @HeartSaVioR Thanks for your efforts. The result (version 4) is very impressive. Overall, it makes sense to me. But we should resolve the concern about using `UnsafeRow`. I am

[GitHub] [spark] SparkQA commented on pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-06-14 Thread GitBox
SparkQA commented on pull request #27604: URL: https://github.com/apache/spark/pull/27604#issuecomment-643877230 **[Test build #124026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124026/testReport)** for PR 27604 at commit

[GitHub] [spark] iRakson commented on pull request #26901: [SPARK-29152][CORE][2.4] Executor Plugin shutdown when dynamic allocation is enabled

2020-06-14 Thread GitBox
iRakson commented on pull request #26901: URL: https://github.com/apache/spark/pull/26901#issuecomment-643876875 @dongjoon-hyun Its behaviour is pretty confusing. But yeah, if this is breaking the branch again, then we should not keep it. Yes, this patch failed twice, so we must move on.

[GitHub] [spark] maropu commented on a change in pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
maropu commented on a change in pull request #28830: URL: https://github.com/apache/spark/pull/28830#discussion_r439907906 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -2541,7 +2542,20 @@ class Dataset[T] private[sql]( def

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28830: URL: https://github.com/apache/spark/pull/28830#discussion_r439907916 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ## @@ -2548,6 +2548,21 @@ class DataFrameSuite extends QueryTest

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28830: [SPARK-31990][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #28830: URL: https://github.com/apache/spark/pull/28830#discussion_r439907052 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -2541,7 +2542,20 @@ class Dataset[T] private[sql]( def

[GitHub] [spark] HeartSaVioR commented on a change in pull request #28830: [SPARK-31990][SQL][SS] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
HeartSaVioR commented on a change in pull request #28830: URL: https://github.com/apache/spark/pull/28830#discussion_r439905543 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -2541,7 +2542,20 @@ class Dataset[T] private[sql]( def

[GitHub] [spark] cloud-fan commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-06-14 Thread GitBox
cloud-fan commented on pull request #28593: URL: https://github.com/apache/spark/pull/28593#issuecomment-643873707 Why are there empty golden files generated in `sql/hive/src/test/resources/golden`?

[GitHub] [spark] iRakson commented on pull request #28752: [SPARK-31983] Fix Sorting for duration column and make Status column sortable

2020-06-14 Thread GitBox
iRakson commented on pull request #28752: URL: https://github.com/apache/spark/pull/28752#issuecomment-643873658 Thank You. @srowen @sarutak.

[GitHub] [spark] iRakson commented on pull request #28823: [SPARK-31983][WEBUI][3.0] Fix sorting for duration column in structured streaming tab

2020-06-14 Thread GitBox
iRakson commented on pull request #28823: URL: https://github.com/apache/spark/pull/28823#issuecomment-643873542 Thank You. @srowen @sarutak :)

[GitHub] [spark] maropu commented on a change in pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
maropu commented on a change in pull request #28807: URL: https://github.com/apache/spark/pull/28807#discussion_r439905202 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala ## @@ -388,12 +391,24 @@ class

[GitHub] [spark] AmplabJenkins commented on pull request #28828: [SPARK-24634][SS][FOLLOWUP] Rename the variable from "numLateInputs" to "numDropppedRowsByWatermark"

2020-06-14 Thread GitBox
AmplabJenkins commented on pull request #28828: URL: https://github.com/apache/spark/pull/28828#issuecomment-643873268

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28828: [SPARK-24634][SS][FOLLOWUP] Rename the variable from "numLateInputs" to "numDropppedRowsByWatermark"

2020-06-14 Thread GitBox
AmplabJenkins removed a comment on pull request #28828: URL: https://github.com/apache/spark/pull/28828#issuecomment-643873268

[GitHub] [spark] maropu commented on a change in pull request #28807: [SPARK-26905][SQL] Follow the SQL:2016 reserved keywords

2020-06-14 Thread GitBox
maropu commented on a change in pull request #28807: URL: https://github.com/apache/spark/pull/28807#discussion_r439905098 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala ## @@ -388,12 +391,24 @@ class

[GitHub] [spark] HeartSaVioR commented on a change in pull request #28830: [SPARK-31990][SQL] Preserves the input order of colNames in dropDuplicates

2020-06-14 Thread GitBox
HeartSaVioR commented on a change in pull request #28830: URL: https://github.com/apache/spark/pull/28830#discussion_r439904837 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -2541,7 +2542,20 @@ class Dataset[T] private[sql]( def

[GitHub] [spark] SparkQA removed a comment on pull request #28828: [SPARK-24634][SS][FOLLOWUP] Rename the variable from "numLateInputs" to "numDropppedRowsByWatermark"

2020-06-14 Thread GitBox
SparkQA removed a comment on pull request #28828: URL: https://github.com/apache/spark/pull/28828#issuecomment-643827509 **[Test build #124015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124015/testReport)** for PR 28828 at commit
