[ https://issues.apache.org/jira/browse/SAMZA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini updated SAMZA-245:
----------------------------------

    Attachment: SAMZA-245-1.patch

Attaching re-based patch. All tests pass. RB is available at:

https://reviews.apache.org/r/23588/

# Rebased the original patch.
# Fixed the tests so they compile and pass again.
# Found a bug in the original patch, where dropping messages that fail 
deserialization could lead to never consuming from the stream again, and fixed 
it.

One thing to discuss here is refreshThreshold. We need a way to trigger 
polling, so I introduced this setting, which is a global lower bound that 
defines when we'll start polling systems for more messages. The problem is that 
this concept conflicts with the TieredPriorityChooser (see SAMZA-342), where we 
might want to consume real-time messages immediately, even while processing 
batch messages. In that case, with this patch, the batch messages would fill 
the buffer and keep the real-time streams from being polled until all of the 
batch messages are processed. One workaround is to raise refreshThreshold to a 
very high number, so that you're always polling, but perhaps there's a better 
solution, such as stream-specific polling thresholds?
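
To make the concern concrete, here is a rough toy model of the behavior 
described above. It is not Samza code and does not reflect what the patch 
actually does: BufferedPoller, stream_thresholds, fetch_size and the helper 
function are all hypothetical names, and the model assumes the threshold is 
compared against the number of buffered messages. It only illustrates how a 
single global threshold can delay a high-priority stream, and how a per-stream 
threshold might avoid that.

```python
from collections import deque


class BufferedPoller:
    """Toy model of a buffered consumer with a polling threshold (not Samza code)."""

    def __init__(self, upstream, priority, refresh_threshold,
                 stream_thresholds=None, fetch_size=50):
        self.upstream = upstream                # stream -> deque of unfetched messages
        self.priority = priority                # stream names, highest priority first
        self.refresh_threshold = refresh_threshold          # global lower bound
        self.stream_thresholds = stream_thresholds or {}    # hypothetical overrides
        self.fetch_size = fetch_size
        self.buffers = {stream: deque() for stream in upstream}

    def _should_poll(self, stream):
        # A stream with its own threshold only looks at its own buffer.
        if stream in self.stream_thresholds:
            return len(self.buffers[stream]) <= self.stream_thresholds[stream]
        # Otherwise fall back to the global bound on all buffered messages.
        total = sum(len(buf) for buf in self.buffers.values())
        return total <= self.refresh_threshold

    def refresh(self):
        # Decide up front, so fetching one stream does not change the decision
        # for the next stream within the same refresh.
        to_poll = [s for s in self.upstream if self._should_poll(s)]
        for stream in to_poll:
            source = self.upstream[stream]
            for _ in range(min(self.fetch_size, len(source))):
                self.buffers[stream].append(source.popleft())

    def choose(self):
        """Refresh if allowed, then return (stream, message) by tiered priority."""
        self.refresh()
        for stream in self.priority:
            if self.buffers[stream]:
                return stream, self.buffers[stream].popleft()
        return None


def batch_processed_before_realtime(stream_thresholds=None):
    """Count batch messages handled before a late real-time message is chosen."""
    upstream = {"batch": deque(range(100)), "realtime": deque()}
    poller = BufferedPoller(upstream, ["realtime", "batch"],
                            refresh_threshold=5,
                            stream_thresholds=stream_thresholds)
    poller.choose()                         # fills the buffer with batch messages
    upstream["realtime"].append("urgent")   # real-time message arrives afterwards
    processed = 1
    while True:
        chosen = poller.choose()
        if chosen is None:
            return None                     # everything drained without a match
        if chosen[0] == "realtime":
            return processed
        processed += 1
```

In this model, batch_processed_before_realtime() returns 45, because the 
real-time stream is not polled until the buffer drains down to the global 
threshold. Passing {"realtime": 0} as a per-stream threshold returns 1, since 
the real-time stream is polled whenever its own buffer is empty, regardless of 
how many batch messages are buffered.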

> Improve SystemConsumers performance
> -----------------------------------
>
>                 Key: SAMZA-245
>                 URL: https://issues.apache.org/jira/browse/SAMZA-245
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>             Fix For: 0.8.0
>
>         Attachments: SAMZA-245-1.patch, SAMZA-245.0.patch
>
>
> As part of SAMZA-220, a more radical patch was proposed. This patch appears 
> to improve SystemConsumers' performance pretty significantly, while also 
> reducing its complexity. The decision was made to move this change into the 
> 0.8.0 release, rather than the 0.7.0 release, since it's a fairly risky 
> change.
> This ticket is to explore updating SystemConsumers to eliminate almost all 
> loops in order to increase performance in the Samza container.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
