[ 
https://issues.apache.org/jira/browse/SAMZA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945816#comment-13945816
 ] 

Chris Riccomini commented on SAMZA-203:
---------------------------------------

I defaulted fetchThreshold to 50,000 because 50,000 messages * 200 
bytes/message = 10,000,000 bytes, or about 9.54 MB. If we assume a processor 
tops out at 30 MB/s (which is the case during changelog restore), then 
9.54 MB / 30 MB/s gives the BrokerProxy about 318 ms (0.318 s) to complete a 
fetch request before the queue is fully drained and the consumer polls and 
gets no messages back.
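
The same back-of-the-envelope estimate, written out as a small sketch so the 
inputs are easy to change. The class and method names are made up for 
illustration and are not part of Samza; the 200 bytes/message and 30 MB/s 
figures are the assumptions stated above, and 1 MB is taken as 1,048,576 bytes.

```java
public class FetchThresholdEstimate {
    // 1 MB is taken as 2^20 bytes here (an assumption about the units above).
    static final double BYTES_PER_MB = 1024.0 * 1024.0;

    /** Size in MB of a queue holding `messages` messages of `bytesPerMessage` bytes each. */
    public static double queueMegabytes(long messages, long bytesPerMessage) {
        return (messages * bytesPerMessage) / BYTES_PER_MB;
    }

    /** Milliseconds a consumer draining at `mbPerSecond` needs to empty a queue of `queueMb`. */
    public static double drainMillis(double queueMb, double mbPerSecond) {
        return queueMb / mbPerSecond * 1000.0;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: threshold of 50,000 messages, 200 bytes each, 30 MB/s drain rate.
        double mb = queueMegabytes(50_000, 200);
        double ms = drainMillis(mb, 30.0);
        System.out.printf("queue: %.2f MB, drain time: %.0f ms%n", mb, ms);
    }
}
```

If messages are larger than 200 bytes or the processor is slower than 30 MB/s, 
the drain time grows, so the estimate is the tight case for the proxy.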

> Bad performance in BrokerProxy when restoring changelogs
> --------------------------------------------------------
>
>                 Key: SAMZA-203
>                 URL: https://issues.apache.org/jira/browse/SAMZA-203
>             Project: Samza
>          Issue Type: Bug
>          Components: kv
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>             Fix For: 0.7.0
>
>         Attachments: SAMZA-203.0.patch, SAMZA-203.1.patch
>
>
> As part of SAMZA-126, we introduced a Thread.sleep call in BrokerProxy's 
> fetchMessages method. The goal was to skip fetch requests on SimpleConsumer 
> when the topicAndPartitionsToFetch variable was empty. Since we had no 
> topic/partitions to fetch, we slowed the thread down by calling 
> Thread.sleep(sleepMSWhileNoTopicPartitions), which defaults to 1000ms.
>
> We now see that we are only getting about 1mb/s when restoring changelogs. 
> This is very slow. Upon investigation, it appears that the BrokerProxy thread 
> is sleeping 90% of the time during restore, and the main SamzaContainer 
> thread is polling for more messages about 60% of the time.
>
> The reason for the poor restore performance is that the BrokerProxy sleeps 
> for 1 second every time the message queue for the restore topic is not empty. 
> Effectively, the proxy starts throttling the reads. If I comment out the 
> Thread.sleep line in the BrokerProxy, I get about 64mb/s network usage on my 
> loopback (one broker running locally), 10mb/s disk read, and 70mb/s disk 
> write on my MacBook Air SSD--the write appears to be the bottleneck (since 
> we're writing all the values to the LevelDB store). This is much much faster 
> than before.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
