date:20200615

[GitHub] [spark] SparkQA removed a comment on pull request #28781: [SPARK-31953][SS] Add Spark Structured Streaming History Server Support

2020-06-15 Thread GitBox

SparkQA removed a comment on pull request #28781: URL: https://github.com/apache/spark/pull/28781#issuecomment-643862941 **[Test build #124023 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124023/testReport)** for PR 28781 at commit

[GitHub] [spark] MaxGekk commented on pull request #28829: [SPARK-31992][SQL] Benchmark the EXCEPTION rebase mode

2020-06-15 Thread GitBox

MaxGekk commented on pull request #28829: URL: https://github.com/apache/spark/pull/28829#issuecomment-643930188 @cloud-fan Please review the PR. This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] SparkQA commented on pull request #28781: [SPARK-31953][SS] Add Spark Structured Streaming History Server Support

2020-06-15 Thread GitBox

SparkQA commented on pull request #28781: URL: https://github.com/apache/spark/pull/28781#issuecomment-643929926 **[Test build #124023 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124023/testReport)** for PR 28781 at commit

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-15 Thread GitBox

dongjoon-hyun edited a comment on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643929047 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] dongjoon-hyun commented on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-15 Thread GitBox

dongjoon-hyun commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643929047 The last commit tries to preserve the previous behavior (whatever it was) since Apache Spark 2.2.0, although there is no guarantee whether it is safe or not. We will

[GitHub] [spark] HyukjinKwon commented on pull request #28830: [SPARK-31990][SS] Use toSet.toSeq in Dataset.dropDuplicates

2020-06-15 Thread GitBox

HyukjinKwon commented on pull request #28830: URL: https://github.com/apache/spark/pull/28830#issuecomment-643927909 I am okay to revert it for now but I couldn't fully follow why we expect an explicit order from a set. Has it been ever guaranteed somewhere? Using `distinct`, we can
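
The question in the comment above is whether any ordering can be expected once a sequence has passed through a set. A minimal sketch of that distinction, written in plain Python rather than the Scala code under review; the column names and both helper functions are hypothetical and are not taken from the PR:

```python
def dedupe_keep_order(names):
    """Drop repeated names, keeping each one at its first position."""
    # dict preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(names))


def dedupe_via_set(names):
    """Drop repeated names by going through a set; the order is unspecified."""
    return list(set(names))


# Hypothetical column list with repeats.
columns = ["value", "key", "value", "ts", "key"]

print(dedupe_keep_order(columns))  # ['value', 'key', 'ts']

# Same elements either way, but only the first helper promises an order.
print(sorted(dedupe_via_set(columns)) == sorted(dedupe_keep_order(columns)))  # True
```

Code that depends on the position of the surviving names can rely only on the order-preserving form; the set-based form may happen to return the same order on one runtime and a different one on another.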

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28784: [SPARK-31957][SQL][test-maven] Cleanup hive scratch dir for the developer api startWithContext

2020-06-15 Thread GitBox

AmplabJenkins removed a comment on pull request #28784: URL: https://github.com/apache/spark/pull/28784#issuecomment-643926776 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28784: [SPARK-31957][SQL][test-maven] Cleanup hive scratch dir for the developer api startWithContext

2020-06-15 Thread GitBox

AmplabJenkins commented on pull request #28784: URL: https://github.com/apache/spark/pull/28784#issuecomment-643926776 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #28784: [SPARK-31957][SQL][test-maven] Cleanup hive scratch dir for the developer api startWithContext

2020-06-15 Thread GitBox

SparkQA commented on pull request #28784: URL: https://github.com/apache/spark/pull/28784#issuecomment-643926432 **[Test build #124035 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124035/testReport)** for PR 28784 at commit

[GitHub] [spark] dilipbiswal commented on pull request #28032: [SPARK-31264][SQL] Repartition by dynamic partition columns before insert partition table

2020-06-15 Thread GitBox

dilipbiswal commented on pull request #28032: URL: https://github.com/apache/spark/pull/28032#issuecomment-643926524 @wangyum Thanks for your response. If the incoming data is not evenly distributed by the repartitioning key, wouldn't this strategy create issues when there is skew in the

[GitHub] [spark] yaooqinn commented on pull request #28784: [SPARK-31957][SQL][test-maven] Cleanup hive scratch dir for the developer api startWithContext

2020-06-15 Thread GitBox

yaooqinn commented on pull request #28784: URL: https://github.com/apache/spark/pull/28784#issuecomment-643926043 retest this please This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] yaooqinn commented on pull request #28784: [SPARK-31957][SQL] Cleanup hive scratch dir for the developer api startWithContext

2020-06-15 Thread GitBox

yaooqinn commented on pull request #28784: URL: https://github.com/apache/spark/pull/28784#issuecomment-643925671 Thanks @HyukjinKwon and @juliuszsompolski, I was waiting for https://github.com/apache/spark/pull/28797 to be merged and then ping you guys. Now it's been done. The

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28828: [SPARK-24634][SS][FOLLOWUP] Rename the variable from "numLateInputs" to "numDropppedRowsByWatermark"

2020-06-15 Thread GitBox

dongjoon-hyun commented on a change in pull request #28828: URL: https://github.com/apache/spark/pull/28828#discussion_r439949163 ## File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala ## @@ -43,7 +43,7 @@ class StateOperatorProgress private[sql](

[GitHub] [spark] cloud-fan commented on pull request #28797: [SPARK-31926][SQL][TEST-HIVE1.2][test-maven] Fix concurrency issue for ThriftCLIService to getPortNumber

2020-06-15 Thread GitBox

cloud-fan commented on pull request #28797: URL: https://github.com/apache/spark/pull/28797#issuecomment-643923811 thanks, merging to master/3.0! This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] cloud-fan closed pull request #28797: [SPARK-31926][SQL][TEST-HIVE1.2][test-maven] Fix concurrency issue for ThriftCLIService to getPortNumber

2020-06-15 Thread GitBox

cloud-fan closed pull request #28797: URL: https://github.com/apache/spark/pull/28797 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] MaxGekk commented on pull request #28809: [SPARK-31959][SQL][3.0] Fix Gregorian-Julian micros rebasing while switching standard time zone offset

2020-06-15 Thread GitBox

MaxGekk commented on pull request #28809: URL: https://github.com/apache/spark/pull/28809#issuecomment-643922594 I am going to skip the test checks if JDK tzdb is outdated and Asia/Hong_Kong doesn't have timestamps overlapping in 1945 at all.

[GitHub] [spark] MaxGekk commented on pull request #28809: [SPARK-31959][SQL][3.0] Fix Gregorian-Julian micros rebasing while switching standard time zone offset

2020-06-15 Thread GitBox

MaxGekk commented on pull request #28809: URL: https://github.com/apache/spark/pull/28809#issuecomment-643920058 > It might be Amplab Jenkins host issue (Java version or environment). It uses JDK w/ outdated time zone database (not clear from log which version): ```

