[
https://issues.apache.org/jira/browse/SAMZA-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900544#comment-13900544
]
Jakob Homan commented on SAMZA-146:
-----------------------------------
In our production jobs that are consuming on the order of ~2500 topic
partitions across four containers, we see performance drop significantly (by an
order of magnitude). This is particularly evident where most of the topics are
relatively low volume compared to a few high volume topics. Specifically,
{code:title=SystemConsumers.java|borderStyle=solid}
def choose = {
// ... contents elided
refresh
envelopeFromChooser
}
private def refresh {
debug("Refreshing chooser with new messages.")
// Poll every system for new messages.
consumers.keys.foreach(poll(_))
// Update the chooser.
neededByChooser.foreach(systemStreamPartition =>
// If we have messages for a stream that the chooser needs, then update.
if (fetchMap(systemStreamPartition).intValue < maxMsgsPerStreamPartition)
{
chooser.update(unprocessedMessages(systemStreamPartition).dequeue)
updateFetchMap(systemStreamPartition)
neededByChooser -= systemStreamPartition
})
}
{code}
The call for each of the consumers to poll for new messages it expensive
relative to the single call to choose (and subsequent delivery of a messages to
process). Instrumenting these calls shows that more than 99% of the calls to
poll are returning back nothing.
> Throughput degrades unreasonably as the number of topic-partitions increases
> ----------------------------------------------------------------------------
>
> Key: SAMZA-146
> URL: https://issues.apache.org/jira/browse/SAMZA-146
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.7.0
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Fix For: 0.7.0
>
>
> Currently we poll each stream partition in need of messages before each call
> to process. For jobs with a large number of partitions, and particularly
> when those partitions are low volume, this dramatically impacts performance.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)