"container-name":"samza-container-0",
"source":"samza-container-0",
"job-name":"container-performance",
"samza-version":"0.9.0",
"version":"0.0.1"
}
}{
"metrics":{
write to it?
>
> Thanks,
>
> George
>
>
>
> From: Roger Hoover
> To: "dev@samza.apache.org" ,
> Date: 21/05/2015 12:04 PM
> Subject:Re: Samza job throughput much lower than Kafka throughput
>
>
>
> Oops. Sent too s
000)
>> > outputSystemStream = Option(config.get("task.outputs",
>> > null)).map(Util.getSystemStreamFromNames(_))
>> > println("init!!")
>> > }
>> >
>> > def process(envelope: IncomingMessageEnvelope, colle
@samza.apache.org,
Date: 21/05/2015 12:31 AM
Subject:Re: Samza job throughput much lower than Kafka throughput
Hi George,
Is there any reason you need to set the following configs?
systems.kafka.consumer.fetch.wait.max.ms= 1
This setting will basically disable long pooling of the
outputSystemStream = Option(config.get("task.outputs",
>> > null)).map(Util.getSystemStreamFromNames(_))
>> > println("init!!")
>> > }
>> >
>> > def process(envelope: IncomingMessageEnvelope, collector:
>> > MessageCollector, coord
stem.currentTimeMillis
> > }
> >
> > if (outputSystemStream.isDefined) {
> > collector.send(new OutgoingMessageEnvelope(outputSystemStream.get,
> > envelope.getKey, envelope.getMessage))
> > }
> >
> > messagesProcessed +
gt; }
>
> messagesProcessed += 1
>
> if (messagesProcessed % logInterval == 0) {
> val seconds = (System.currentTimeMillis - startTime) / 1000
> println("Processed %s messages in %s seconds." format
> (messagesProcessed, seconds))
> }
1000
println("Processed %s messages in %s seconds." format
(messagesProcessed, seconds))
}
if (messagesProcessed >= maxMessages) {
coordinator.shutdown(RequestScope.ALL_TASKS_IN_CONTAINER)
}
}
}
From: Yi Pan
To: dev@samza.apache.org,
Date: 20/
Hi, George,
Could you share w/ us the code and configuration of your sample test job?
Thanks!
-Yi
On Wed, May 20, 2015 at 1:19 PM, George Li wrote:
> Hi,
>
> We are evaluating Samza's performance, and our sample job with
> TestPerformanceTask is much slower than a program reading directly from
Hi,
We are evaluating Samza's performance, and our sample job with
TestPerformanceTask is much slower than a program reading directly from
Kafka.
Scenario:
* Cluster:
1 master node for Zookeeper and yarn.
3 Kafka broker nodes
3 yarn worker nodes
* Kafka:
Topic has only 1 partition. Average me
10 matches
Mail list logo