[ https://issues.apache.org/jira/browse/SAMZA-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Branislav Cogic updated SAMZA-856: ---------------------------------- Attachment: SAMZA-856.0.patch > Optionally automatically commit based on number of processed messages > --------------------------------------------------------------------- > > Key: SAMZA-856 > URL: https://issues.apache.org/jira/browse/SAMZA-856 > Project: Samza > Issue Type: Improvement > Components: container > Reporter: Elias Levy > Assignee: Branislav Cogic > Attachments: SAMZA-856.0.patch > > > Currently Samza support automatic checkpoint commits based on time via the > task.commit.ms property. The number of messages processed during any time > window will vary with the throughput of the system. Thus, the current > automatic checkpointing can't guarantee a maximum number of messages being > reprocessed when recovering after a failure. > I propose the addition of an option that would automatically commit > checkpoints after a configurable number of messages have been processed. > The messages could be counted per container, per task, or per stream. > Properties could be named task.commit.msg.container.cnt, > task.commit.msg.task.cnt and/or task.commit.msg.stream.cnt. > Alternatively, a per stream count limit could use different values for > different streams. E.g. task.commit.msg.stream.<some_stream>.cnt=1000, > task.commit.msg.stream.<some_other_stream>.cnt=200. > A message count auto commit would be orthogonal to the existing time based > auto commit and they could be used at the same time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)