[ 
https://issues.apache.org/jira/browse/SPARK-30774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036684#comment-17036684
 ] 

Hyukjin Kwon commented on SPARK-30774:
--------------------------------------

Can you share reproducible code that shows the comment is wrong?

> The default checkpointing interval is not as claimed in the comment.
> --------------------------------------------------------------------
>
>                 Key: SPARK-30774
>                 URL: https://issues.apache.org/jira/browse/SPARK-30774
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.4.5
>            Reporter: Kyle Krueger
>            Priority: Minor
>
> [https://github.com/apache/spark/blob/71737861531180bbda9aec8d241b1428fe91cab2/streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala#L199-L203]
> -The checkpoint duration is set to the window duration; perhaps the old 
> comment's idea of using the higher of 10s or the window size is no longer 
> relevant.-
> -I propose that we either update the comment to say that the checkpoint 
> duration is set to the window size and clean up how that value is set, or 
> change the code to do what the comment says.-
>  
> My original statement above was wrong, but the code is still broken. 
> Consider the case where the window duration is 3s: the resulting checkpoint 
> interval is 12s. That does not match the rule implied by the comment (the 
> higher of 10s or the window size, which would give 10s) and is therefore 
> unexpected behaviour.
> The code does, however, always produce a checkpoint interval that is a 
> multiple of the slide duration, which is safe as far as I know.
>  
>  
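The arithmetic in the quoted description can be sketched in plain Python. This
is not Spark code: the function names are hypothetical, and observed_rule
assumes the interval is the slide duration multiplied by ceil(10s / slide
duration), an assumption that is consistent with the 3s -> 12s example and with
the "multiple of the slide duration" remark above.

```python
def commented_rule(slide_s: int) -> int:
    """Rule implied by the code comment: the larger of the slide duration and 10s."""
    return max(slide_s, 10)


def observed_rule(slide_s: int) -> int:
    """Assumed actual behaviour: smallest multiple of the slide duration that is >= 10s."""
    multiples = (10 + slide_s - 1) // slide_s  # integer ceil(10 / slide_s)
    return slide_s * multiples


if __name__ == "__main__":
    # Compare both rules for a few slide durations (in seconds).
    for slide in (3, 5, 7, 10, 15):
        print(slide, commented_rule(slide), observed_rule(slide))
```

Under this assumption the two rules agree whenever the slide duration divides
10s evenly or is at least 10s, and differ otherwise (for example 3s or 7s).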


