[ https://issues.apache.org/jira/browse/SPARK-30774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036684#comment-17036684 ]
Hyukjin Kwon commented on SPARK-30774:
--------------------------------------

Can you share reproducible code that shows the comment is wrong?

> The default checkpointing interval is not as claimed in the comment.
> --------------------------------------------------------------------
>
>                 Key: SPARK-30774
>                 URL: https://issues.apache.org/jira/browse/SPARK-30774
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.4.5
>            Reporter: Kyle Krueger
>            Priority: Minor
>
> [https://github.com/apache/spark/blob/71737861531180bbda9aec8d241b1428fe91cab2/streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala#L199-L203]
> -The checkpoint duration is set to be the window duration, maybe the idea in
> the old comment wanting to set to the higher of 10s or window-size is no
> longer relevant.-
> -I propose we either adapt the comment to just say that we set
> the checkpoint duration to the window size and clean up how that value is
> set, or we change the code to do as the comment remarks.-
>
> So, the original statement I made was wrong. This code is still broken
> though. Consider the case where the window duration is 3s: the result would
> be a checkpoint size of 12s. That doesn't correspond to the rule implied by
> the comment and is thus unexpected behaviour.
> This code does however result in the checkpoint size being a multiple of the
> slide duration, which is safe as far as I know.
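
The 3s -> 12s case in the quoted issue can be checked with a little arithmetic. The sketch below is not Spark code. It assumes the linked lines round 10 seconds up to the next multiple of the slide duration, which is inferred from the numbers in the report, and it compares that with the "larger of the slide duration or 10 seconds" rule the reporter attributes to the code comment. The function names are hypothetical.

```python
def assumed_default_interval_ms(slide_ms: int, floor_ms: int = 10_000) -> int:
    """Assumed behaviour: smallest multiple of slide_ms that is >= floor_ms."""
    multiples = -(-floor_ms // slide_ms)  # integer ceiling of floor_ms / slide_ms
    return slide_ms * multiples


def commented_rule_ms(slide_ms: int, floor_ms: int = 10_000) -> int:
    """Rule as described in the issue: the larger of slide_ms and floor_ms."""
    return max(slide_ms, floor_ms)


if __name__ == "__main__":
    # Compare the two rules for a few slide durations (in seconds).
    for slide_s in (3, 5, 10, 15):
        slide_ms = slide_s * 1000
        print(
            slide_s,
            assumed_default_interval_ms(slide_ms) // 1000,
            commented_rule_ms(slide_ms) // 1000,
        )
```

Under this assumption the two rules differ only when the slide duration is below 10 seconds and does not divide 10 seconds evenly, as in the 3s example. In every case the assumed result stays a multiple of the slide duration.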