Re: Samza job throughput much lower than Kafka throughput

2015-05-22 Thread George Li
"container-name":"samza-container-0", "source":"samza-container-0", "job-name":"container-performance", "samza-version":"0.9.0", "version":"0.0.1" } }{ "metrics":{

Re: Samza job throughput much lower than Kafka throughput

2015-05-21 Thread Guozhang Wang
write to it? > > Thanks, > > George > > > > From: Roger Hoover > To: "dev@samza.apache.org" , > Date: 21/05/2015 12:04 PM > Subject:Re: Samza job throughput much lower than Kafka throughput > > > > Oops. Sent too s

Re: Samza job throughput much lower than Kafka throughput

2015-05-21 Thread George Li
000) >> > outputSystemStream = Option(config.get("task.outputs", >> > null)).map(Util.getSystemStreamFromNames(_)) >> > println("init!!") >> > } >> > >> > def process(envelope: IncomingMessageEnvelope, colle

Re: Samza job throughput much lower than Kafka throughput

2015-05-21 Thread George Li
@samza.apache.org, Date: 21/05/2015 12:31 AM Subject:Re: Samza job throughput much lower than Kafka throughput Hi George, Is there any reason you need to set the following configs? systems.kafka.consumer.fetch.wait.max.ms= 1 This setting will basically disable long pooling of the

Re: Samza job throughput much lower than Kafka throughput

2015-05-21 Thread Roger Hoover
outputSystemStream = Option(config.get("task.outputs", >> > null)).map(Util.getSystemStreamFromNames(_)) >> > println("init!!") >> > } >> > >> > def process(envelope: IncomingMessageEnvelope, collector: >> > MessageCollector, coord

Re: Samza job throughput much lower than Kafka throughput

2015-05-21 Thread Roger Hoover
stem.currentTimeMillis > > } > > > > if (outputSystemStream.isDefined) { > > collector.send(new OutgoingMessageEnvelope(outputSystemStream.get, > > envelope.getKey, envelope.getMessage)) > > } > > > > messagesProcessed +

Re: Samza job throughput much lower than Kafka throughput

2015-05-20 Thread Guozhang Wang
gt; } > > messagesProcessed += 1 > > if (messagesProcessed % logInterval == 0) { > val seconds = (System.currentTimeMillis - startTime) / 1000 > println("Processed %s messages in %s seconds." format > (messagesProcessed, seconds)) > }

Re: Samza job throughput much lower than Kafka throughput

2015-05-20 Thread George Li
1000 println("Processed %s messages in %s seconds." format (messagesProcessed, seconds)) } if (messagesProcessed >= maxMessages) { coordinator.shutdown(RequestScope.ALL_TASKS_IN_CONTAINER) } } } From: Yi Pan To: dev@samza.apache.org, Date: 20/

Re: Samza job throughput much lower than Kafka throughput

2015-05-20 Thread Yi Pan
Hi, George, Could you share w/ us the code and configuration of your sample test job? Thanks! -Yi On Wed, May 20, 2015 at 1:19 PM, George Li wrote: > Hi, > > We are evaluating Samza's performance, and our sample job with > TestPerformanceTask is much slower than a program reading directly from

Samza job throughput much lower than Kafka throughput

2015-05-20 Thread George Li
Hi, We are evaluating Samza's performance, and our sample job with TestPerformanceTask is much slower than a program reading directly from Kafka. Scenario: * Cluster: 1 master node for Zookeeper and yarn. 3 Kafka broker nodes 3 yarn worker nodes * Kafka: Topic has only 1 partition. Average me