[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user KevinZwx commented on the issue: https://github.com/apache/spark/pull/16970 I'm a little confused by the behavior of dropDuplicates with a watermark. According to my understanding of the guide documentation, if I have the following code, I expect it to still deduplicate by uuid but use the timestamp column and watermark to expire state. `.withWatermark("timestamp", "1 day") .dropDuplicates("uuid", "timestamp")` But in fact I found that the program probably uses uuid and timestamp as a combined key to deduplicate elements, because the result count is much larger than with dropDuplicates("uuid") and closer to the result with no deduplication. Is this the expected behavior? If so, how can I achieve what I want? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
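The count difference described in this comment can be reproduced with a small plain-Python model. This is only a sketch, not Spark code: the sample events and the `dedup_count` helper are hypothetical, and it assumes (as the comment suspects, but the thread does not confirm) that every column passed to dropDuplicates becomes part of the deduplication key.

```python
# Plain-Python illustration, not Spark: events are hypothetical (uuid, timestamp) pairs.
events = [
    ("a", 1),
    ("a", 2),  # same uuid as the first row, different timestamp
    ("a", 2),  # exact repeat of the previous row
    ("b", 1),
    ("b", 3),  # same uuid as the previous row, different timestamp
]


def dedup_count(rows, key):
    """Count the rows that survive when only the first row per key is kept."""
    seen = set()
    kept = 0
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            kept += 1
    return kept


# Key on uuid only: one surviving row per uuid.
by_uuid = dedup_count(events, key=lambda r: r[0])

# Key on (uuid, timestamp): only exact repeats of both values are dropped.
by_uuid_and_ts = dedup_count(events, key=lambda r: (r[0], r[1]))

print(by_uuid, by_uuid_and_ts, len(events))
```

Under that assumption, the combined key removes only rows that repeat both values, so its count sits much closer to the raw row count than the uuid-only count does.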
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16970 @zsxwing Thanks, I missed it.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/16970 @uncleGen I think `requiredChildDistribution = ClusteredDistribution(keyExpressions) :: Nil` (please see [here](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala#L344-L345)) takes care of it.
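As a rough intuition for why clustering the input by the key expressions is enough, here is a toy model in plain Python. It is not Spark's implementation: `partition_for`, the sample rows, and the use of `zlib.crc32` as the hash are all assumptions made for illustration. The point is only that if every row with the same key is routed to the same partition, each partition can deduplicate locally with no aggregation step across partitions.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Route a key to a partition with a deterministic hash (crc32 is a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical (key, value) rows; "a" and "b" each appear twice.
rows = [("a", 1), ("b", 1), ("a", 2), ("c", 5), ("b", 9)]
num_partitions = 3

# Step 1: cluster rows by key, so all rows sharing a key share a partition.
partitions = [[] for _ in range(num_partitions)]
for row in rows:
    partitions[partition_for(row[0], num_partitions)].append(row)

# Step 2: deduplicate inside each partition independently, keeping the first row per key.
deduped = []
for part in partitions:
    seen = set()
    for row in part:
        if row[0] not in seen:
            seen.add(row[0])
            deduped.append(row)

print(sorted(deduped))
```

Because a key can never appear in two partitions in this model, the per-partition results can simply be concatenated.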
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16970 One question: without aggregation, how are duplicates dropped across partitions?
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test PASSed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73297/ Test PASSed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73297 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73297/testReport)** for PR 16970 at commit [`d0b7b77`](https://github.com/apache/spark/commit/d0b7b77e345b275d58ba5582f6acde86a80cb3da). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73297 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73297/testReport)** for PR 16970 at commit [`d0b7b77`](https://github.com/apache/spark/commit/d0b7b77e345b275d58ba5582f6acde86a80cb3da).
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73285/ Test PASSed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test PASSed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73285 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73285/testReport)** for PR 16970 at commit [`7a7c0c7`](https://github.com/apache/spark/commit/7a7c0c781c236f8421304ab17403f7347eededcb). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class Deduplicate(` * `case class StreamingDeduplicateExec(`
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73285/testReport)** for PR 16970 at commit [`7a7c0c7`](https://github.com/apache/spark/commit/7a7c0c781c236f8421304ab17403f7347eededcb).
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16970 retest this please
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73265/ Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73265/testReport)** for PR 16970 at commit [`7a7c0c7`](https://github.com/apache/spark/commit/7a7c0c781c236f8421304ab17403f7347eededcb).
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test PASSed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73247/ Test PASSed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73247 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73247/testReport)** for PR 16970 at commit [`78dfdfe`](https://github.com/apache/spark/commit/78dfdfe20b6c7f788e5d289ecc63c325679ccd44). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class Deduplication(`
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16970 @tdas I created https://issues.apache.org/jira/browse/SPARK-19690 to track the issue when joining a batch DataFrame with a streaming DataFrame. I will fix it in a separate PR to unblock this one as it touches many files.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73247 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73247/testReport)** for PR 16970 at commit [`78dfdfe`](https://github.com/apache/spark/commit/78dfdfe20b6c7f788e5d289ecc63c325679ccd44).
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73236/ Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73236 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73236/testReport)** for PR 16970 at commit [`b2e9cb0`](https://github.com/apache/spark/commit/b2e9cb03f9f5dd9467fdebfd7a6f69639ae36f2b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73236 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73236/testReport)** for PR 16970 at commit [`b2e9cb0`](https://github.com/apache/spark/commit/b2e9cb03f9f5dd9467fdebfd7a6f69639ae36f2b).
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73225/ Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73225 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73225/testReport)** for PR 16970 at commit [`0e72217`](https://github.com/apache/spark/commit/0e7221718ea825f70de594c68081db75b5f841ea). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73225 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73225/testReport)** for PR 16970 at commit [`0e72217`](https://github.com/apache/spark/commit/0e7221718ea825f70de594c68081db75b5f841ea).
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user tdas commented on the issue: https://github.com/apache/spark/pull/16970 overall looks good. just a bunch of nits.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73076/ Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73076 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73076/testReport)** for PR 16970 at commit [`ba58e2a`](https://github.com/apache/spark/commit/ba58e2a6315260abe16bdb09bc81efa20afffab2). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class StreamingDeduplicationExec(`
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73076/testReport)** for PR 16970 at commit [`ba58e2a`](https://github.com/apache/spark/commit/ba58e2a6315260abe16bdb09bc81efa20afffab2).
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73064/ Test FAILed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73064 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73064/testReport)** for PR 16970 at commit [`5a6af8b`](https://github.com/apache/spark/commit/5a6af8b5fb452f6878c6446f074a049c79c95623). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16970 aw man. I should always refresh before starting a review
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16970 @brkyvz looks like you were looking at my old changes. I pushed a new commit and updated the PR description to reflect the latest supported queries.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73064 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73064/testReport)** for PR 16970 at commit [`5a6af8b`](https://github.com/apache/spark/commit/5a6af8b5fb452f6878c6446f074a049c79c95623).
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73028/ Test PASSed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16970 Merged build finished. Test PASSed.
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73028 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73028/testReport)** for PR 16970 at commit [`63a7f4c`](https://github.com/apache/spark/commit/63a7f4c62b2da32351d008f9719d513e14562e56). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class Deduplication(` * `trait WatermarkSupport extends SparkPlan ` * `case class DeduplicationExec(`
[GitHub] spark issue #16970: [SPARK-19497][SS]Implement streaming deduplication
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16970 **[Test build #73028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73028/testReport)** for PR 16970 at commit [`63a7f4c`](https://github.com/apache/spark/commit/63a7f4c62b2da32351d008f9719d513e14562e56).