[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10127 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161567164 @zsxwing Please check this. I think this problem was caused by #9707.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161608484 **[Test build #2162 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2162/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class CrossValidator @Since("1.2.0") (@Since("1.4.0") override val uid: String)` * `class ParamGridBuilder @Since("1.2.0") ` * `class TrainValidationSplit @Since("1.5.0") (@Since("1.5.0") override val uid: String)`
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/10127

[SPARK-12122][STREAMING] Prevent batches from being submitted twice after recovering StreamingContext from checkpoint

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark SPARK-12122

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10127.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10127

commit d904b25a7037e2b12693158f29e069f13aa0fa78
Author: Tathagata Das
Date: 2015-12-03T09:30:27Z

    Remove duplicate
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161591828 **[Test build #47134 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47134/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161569412 **[Test build #47134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47134/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78).
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161567552 **[Test build #2161 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2161/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78).
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161567545 **[Test build #2162 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2162/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78).
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161592110 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47134/ Test PASSed.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161592106 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161590894 **[Test build #2161 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2161/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161768001 **[Test build #2165 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2165/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78).
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161767836 **[Test build #2164 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2164/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78).
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161768758 **[Test build #47163 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47163/consoleFull)** for PR 10127 at commit [`fe69fbf`](https://github.com/apache/spark/commit/fe69fbfe185e12e70d46003a40988fff57cd24b7).
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/10127#discussion_r46604122

--- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala ---
@@ -220,7 +220,8 @@ class JobGenerator(jobScheduler: JobScheduler) extends Logging {
     logInfo("Batches pending processing (" + pendingTimes.size + " batches): " + pendingTimes.mkString(", "))
     // Reschedule jobs for these times
-    val timesToReschedule = (pendingTimes ++ downTimes).distinct.sorted(Time.ordering)
+    val timesToReschedule = (pendingTimes ++ downTimes).filter { _ != restartTime }
+      .distinct.sorted(Time.ordering)
--- End diff --

Explained offline: The restart time is always checkpointTime+1 (assuming batch duration = 1). However, pending times can already have batches >= checkpointTime+1. This can cause `timesToReschedule` to have batches >= checkpointTime+1, which will be explicitly submitted, and then resubmitted through the timer.
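The double submission described in this comment can be illustrated with a small standalone sketch. It is not the Spark code: `Time` is a hypothetical stand-in case class rather than `org.apache.spark.streaming.Time`, the batch duration of 1000 ms and all timestamps are made-up values, and the sketch assumes that the timer generates the batch at `restartTime` after recovery. Only the two expressions for `timesToReschedule` are taken from the diff.

```scala
// Sketch only: a simplified model of the rescheduling step, not Spark's JobGenerator.
object RescheduleSketch {
  // Hypothetical stand-in for Spark's Time: a batch timestamp in milliseconds.
  final case class Time(millis: Long)

  // Plays the role of Time.ordering in the diff.
  implicit val timeOrdering: Ordering[Time] = Ordering.by((t: Time) => t.millis)

  // Expression before the patch: restartTime may remain in the result.
  def unfiltered(pendingTimes: Seq[Time], downTimes: Seq[Time]): Seq[Time] =
    (pendingTimes ++ downTimes).distinct.sorted

  // Expression after the patch: restartTime is dropped and left to the timer.
  def filtered(pendingTimes: Seq[Time], downTimes: Seq[Time], restartTime: Time): Seq[Time] =
    (pendingTimes ++ downTimes).filter { _ != restartTime }.distinct.sorted

  def main(args: Array[String]): Unit = {
    val batchMillis = 1000L                      // assumed batch duration
    val checkpointTime = Time(5000)              // made-up checkpoint time
    val restartTime = Time(checkpointTime.millis + batchMillis)

    // Made-up inputs: the pending batches already include restartTime.
    val pendingTimes = Seq(Time(5000), Time(6000))
    val downTimes = Seq(Time(4000), Time(5000))

    // Without the filter, restartTime is rescheduled explicitly and would
    // then be submitted again when the timer fires for the same batch.
    println(unfiltered(pendingTimes, downTimes).contains(restartTime))
    println(filtered(pendingTimes, downTimes, restartTime).contains(restartTime))
  }
}
```

In this model the only difference between the two expressions is whether `restartTime` survives into the explicitly rescheduled set, which is the duplicate the patch removes.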
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161783905 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47163/ Test PASSed.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161783902 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161783301 **[Test build #2164 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2164/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161768648 LGTM
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161783330 **[Test build #2165 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2165/consoleFull)** for PR 10127 at commit [`d904b25`](https://github.com/apache/spark/commit/d904b25a7037e2b12693158f29e069f13aa0fa78). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10127#issuecomment-161783734 **[Test build #47163 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47163/consoleFull)** for PR 10127 at commit [`fe69fbf`](https://github.com/apache/spark/commit/fe69fbfe185e12e70d46003a40988fff57cd24b7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12122][STREAMING] Prevent batches from ...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/10127#discussion_r46592510

--- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala ---
@@ -220,7 +220,8 @@ class JobGenerator(jobScheduler: JobScheduler) extends Logging {
     logInfo("Batches pending processing (" + pendingTimes.size + " batches): " + pendingTimes.mkString(", "))
     // Reschedule jobs for these times
-    val timesToReschedule = (pendingTimes ++ downTimes).distinct.sorted(Time.ordering)
+    val timesToReschedule = (pendingTimes ++ downTimes).filter { _ != restartTime }
+      .distinct.sorted(Time.ordering)
--- End diff --

Could you clarify why `pendingTimes` may contain `restartTime`?