Github user koeninger commented on the pull request:

    https://github.com/apache/spark/pull/4805#issuecomment-77882744

As it stands now, no offsets are stored by Spark unless you're checkpointing. Does it really make sense to have an option to automatically store offsets in Kafka, but not store them in the checkpoint? Failure recovery in that case depends on user-provided starting offsets (or starting at the beginning/end of the log). If someone has the sophistication to read offsets from Kafka in order to provide them as a starting point, they probably have the sophistication to save offsets to Kafka themselves in the job.

If offsets are only being sent to Kafka when they are also stored in the checkpoint, then does sending offsets to Kafka in compute() also make sense? Yes, you can lag behind, but those offsets are in the queue to get processed at least once.

I'm not 100% sure of the answer to this; it's more a question of desired behavior, but that's why I brought it up.

On Mon, Mar 9, 2015 at 12:14 AM, Saisai Shao <notificati...@github.com> wrote:

> Hi @koeninger <https://github.com/koeninger>, would you please review
> this again? Thanks a lot, and I appreciate your time.
>
> Here I still keep using the HashMap for the Time -> offset mapping;
> since checkpoint data is only updated when checkpointing is enabled, I
> hope this will also work without checkpointing enabled.
>
> And I still use StreamingListener to update the offsets, for the reason
> mentioned before.
>
> Besides, I updated the configuration name; I'm not sure whether it is suitable.
>
> Thanks a lot.
>
> —
> Reply to this email directly or view it on GitHub
> <https://github.com/apache/spark/pull/4805#issuecomment-77801344>.
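To illustrate the point about users saving offsets themselves: the following is a hedged sketch, not code from this PR. It assumes the direct-stream API (`HasOffsetRanges` from spark-streaming-kafka) and a stream `stream` already created with `KafkaUtils.createDirectStream`; `saveOffsets` is a hypothetical user-supplied helper, not a Spark API.

```scala
// Sketch: a job committing its own offsets after processing each batch.
// `stream` is assumed to be a direct stream from KafkaUtils.createDirectStream.
import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

// Hypothetical helper: persist (topic, partition, untilOffset) to Kafka,
// ZooKeeper, or the same store as the job's output. Not part of Spark.
def saveOffsets(ranges: Array[OffsetRange]): Unit = {
  // user-defined storage logic goes here
}

stream.foreachRDD { rdd =>
  // The cast is only valid on the RDD produced directly by the stream,
  // before any transformation that discards the offset information.
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // Process the batch first, then record the offsets: if the job fails
  // between the two steps, the batch is replayed (at-least-once).
  rdd.foreachPartition { partition =>
    partition.foreach { record => /* write results */ }
  }
  saveOffsets(offsetRanges)
}
```

Committing after the output write (rather than in compute(), before processing) is what keeps the offset store from getting ahead of the actual results; either way the semantics remain at-least-once, as noted above.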