[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-03-01 Thread jeanlyn
Github user jeanlyn closed the pull request at: https://github.com/apache/spark/pull/11440

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-03-01 Thread jeanlyn
Github user jeanlyn commented on the pull request: https://github.com/apache/spark/pull/11440#issuecomment-190994425 Thanks @jerryshao @srowen @zsxwing for the suggestions. I am closing this PR.

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-03-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/11440#discussion_r54610922 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala --- @@ -221,8 +221,12 @@ class JobGenerator(jobScheduler:

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-03-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/11440#discussion_r54551767 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala --- @@ -221,8 +221,12 @@ class JobGenerator(jobScheduler:

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-03-01 Thread jeanlyn
Github user jeanlyn commented on the pull request: https://github.com/apache/spark/pull/11440#issuecomment-190613342 My bad. I will try to figure out a way to fix the case where window operations appear with the config set to true.

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-03-01 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/11440#issuecomment-190610110 But how do you define "much longer": by the number of batches or by time? IMHO we cannot build a patch on assumptions. We should add some defensive code to

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-03-01 Thread jeanlyn
Github user jeanlyn commented on the pull request: https://github.com/apache/spark/pull/11440#issuecomment-190608101 @jerryshao Thanks for the explanation. I see what you mean. It only happens in the beginning, and if the stop time is much longer than the window time, I think it's

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-02-29 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/11440#issuecomment-190570144 For example, suppose your sliding duration is 1, window duration is 4, batch duration is 1, and the down time is 3. If you skip these 3 batches, IIUC the result
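
To make the numbers in this example concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark code. The record count of 10 per batch, the batch times 1 to 10, the down time at batches 5 to 7, and the helper name `window_sums` are all assumptions made for illustration. It also assumes a skipped batch contributes nothing to any window that covers it; the actual behavior depends on how Spark Streaming recovers.

```python
def window_sums(batches, window, end_times):
    """For each end time t, sum the record counts of the `window` batch
    times ending at t. Batch times missing from `batches` count as 0."""
    return {
        t: sum(batches.get(s, 0) for s in range(t - window + 1, t + 1))
        for t in end_times
    }

# Hypothetical data: 10 records in every batch, batch duration = 1.
all_batches = {t: 10 for t in range(1, 11)}

# Assumed down time of 3 batches (times 5, 6, 7), as in the example above.
down_time = {5, 6, 7}
skipped = {t: n for t, n in all_batches.items() if t not in down_time}

window, slide = 4, 1
end_times = range(8, 11, slide)  # windows evaluated after recovery

full = window_sums(all_batches, window, end_times)
partial = window_sums(skipped, window, end_times)

for t in end_times:
    print(t, full[t], partial[t])
```

In this model the first three windows after recovery come out short (10, 20, and 30 records instead of 40 each), and they only become complete again once the window no longer overlaps the skipped batches.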

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-02-29 Thread jeanlyn
Github user jeanlyn commented on the pull request: https://github.com/apache/spark/pull/11440#issuecomment-190568465 Thanks @jerryshao for the suggestion! > Jobs generated in the down time can be used for WAL replay, did you test when these down jobs are removed, the behavior of WAL

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-02-29 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/11440#issuecomment-190531231 Also, for some windowing operations, I think removing the down-time jobs may lead to inconsistent windowing aggregation results.

[GitHub] spark pull request: [SPARK-13586][STREAMING]add config to skip gen...

2016-02-29 Thread jerryshao
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/11440#issuecomment-190530543 Jobs generated during the down time can be used for WAL replay. Did you test whether WAL replay still behaves correctly when these down-time jobs are removed?