[ 
https://issues.apache.org/jira/browse/SAMZA-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branislav Cogic updated SAMZA-856:
----------------------------------
    Attachment: SAMZA-856.0.patch

> Optionally automatically commit based on number of processed messages
> ---------------------------------------------------------------------
>
>                 Key: SAMZA-856
>                 URL: https://issues.apache.org/jira/browse/SAMZA-856
>             Project: Samza
>          Issue Type: Improvement
>          Components: container
>            Reporter: Elias Levy
>            Assignee: Branislav Cogic
>         Attachments: SAMZA-856.0.patch
>
>
> Currently Samza supports automatic checkpoint commits based on time via the 
> task.commit.ms property.  The number of messages processed during any time 
> window will vary with the throughput of the system.  Thus, the current 
> automatic checkpointing can't guarantee a maximum number of messages being 
> reprocessed when recovering after a failure.
> I propose the addition of an option that would automatically commit 
> checkpoints after a configurable number of messages have been processed.  
> The messages could be counted per container, per task, or per stream. 
> Properties could be named task.commit.msg.container.cnt, 
> task.commit.msg.task.cnt and/or task.commit.msg.stream.cnt.
> Alternatively, a per-stream count limit could use different values for 
> different streams. E.g. task.commit.msg.stream.<some_stream>.cnt=1000, 
> task.commit.msg.stream.<some_other_stream>.cnt=200.
> A message-count auto commit would be orthogonal to the existing time-based 
> auto commit, and the two could be used at the same time.
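
To illustrate the proposal above, here is a minimal sketch of how a
count-based trigger could sit alongside the existing time-based one. It is
not taken from the attached patch and is not Samza's API: the class
CommitPolicy, its method names, and the convention that a non-positive
message limit or a negative interval disables that trigger are all
assumptions made for this example. The clock is injected so the behaviour can
be checked without waiting on real time.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch: decides when a checkpoint commit is due, based on
 * the number of messages processed since the last commit, the time elapsed
 * since the last commit, or both. Not part of Samza.
 */
final class CommitPolicy {
    private final long maxMessages;  // assumed: <= 0 disables the count-based trigger
    private final long commitMs;     // assumed: < 0 disables the time-based trigger
    private final LongSupplier clock; // returns the current time in milliseconds

    private long messagesSinceCommit = 0;
    private long lastCommitMs;

    CommitPolicy(long maxMessages, long commitMs, LongSupplier clock) {
        this.maxMessages = maxMessages;
        this.commitMs = commitMs;
        this.clock = clock;
        this.lastCommitMs = clock.getAsLong();
    }

    /** Records one processed message and reports whether a commit is now due. */
    boolean onMessageProcessed() {
        messagesSinceCommit++;
        return shouldCommit();
    }

    /** True if either the message limit or the time interval has been reached. */
    boolean shouldCommit() {
        boolean countDue = maxMessages > 0 && messagesSinceCommit >= maxMessages;
        boolean timeDue = commitMs >= 0
                && clock.getAsLong() - lastCommitMs >= commitMs;
        return countDue || timeDue;
    }

    /** Resets both triggers; call after the checkpoint has been written. */
    void onCommit() {
        messagesSinceCommit = 0;
        lastCommitMs = clock.getAsLong();
    }
}
```

Because the two triggers are evaluated independently and combined with a
logical OR, a limit of N messages bounds the number of messages counted by
this policy between commits at N, regardless of throughput, while the time
interval still forces a commit on slow streams. A per-task or per-stream
variant would keep one such counter per task or per stream instead of one per
container.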



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
