Re: Flink streaming throughput

2016-03-15 Thread おぎばやしひろのり
Milinda, Thanks. I will try. Regards, Hironori 2016/03/16 1:31 "Milinda Pathirage" : > Hi Hironori, > > [1] and [2] describe the process of measuring Kafka performance. I think > the perf test code is under org.apache.kafka.tools package in 0.9, so you > may have to change commands in [2] to re

Re: Flink streaming throughput

2016-03-15 Thread Milinda Pathirage
Hi Hironori, [1] and [2] describe the process of measuring Kafka performance. I think the perf test code is under the org.apache.kafka.tools package in 0.9, so you may have to change the commands in [2] to reflect that. Thanks Milinda [1] https://engineering.linkedin.com/kafka/benchmarking-apache-kafka

Re: Flink streaming throughput

2016-03-15 Thread おぎばやしひろのり
Robert, Thank you for your response. I would like to try kafka-console-consumer, but I have no idea how to measure the consuming throughput. Is there a standard way? I will also try running the Kafka brokers on physical servers. Regarding version, I have upgraded to Flink 1.0.0 and replaced FlinkKaf

Re: Flink streaming throughput

2016-03-11 Thread Robert Metzger
Hi Hironori, can you check with the kafka-console-consumer how many messages you can read in one minute? Maybe the broker's disk I/O is limited because everything is running in virtual machines (potentially sharing one hard disk?). I'm also not sure if running a Kafka 0.8 consumer against a 0.9 broke
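
One way to get the "messages per minute" number Robert asks for is to pipe the console consumer's output into a small line counter. The following is only a minimal sketch, assuming the consumer prints one message per line; the function names and the 60-second default are illustrative and not part of Kafka or Flink.

```python
import sys
import time
from typing import Callable, Iterable, Tuple


def measure_throughput(
    lines: Iterable[str],
    duration_sec: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, float]:
    """Count messages (assumed one per line) for up to duration_sec seconds.

    Returns (message_count, messages_per_second).
    """
    start = clock()
    count = 0
    for _ in lines:
        count += 1
        # Stop once the measurement window has elapsed.
        if clock() - start >= duration_sec:
            break
    elapsed = clock() - start
    rate = count / elapsed if elapsed > 0 else 0.0
    return count, rate


def main() -> None:
    # Hypothetical entry point: read messages from standard input.
    count, rate = measure_throughput(sys.stdin)
    print(f"{count} messages, {rate:.0f} msg/sec")
```

Calling main() from a script and piping kafka-console-consumer output into it would print the count and rate once the window closes or the input ends. Note that the window is only checked when a line arrives, so an idle topic keeps the counter waiting.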

Re: Flink streaming throughput

2016-03-11 Thread おぎばやしひろのり
Aljoscha, Thank you for your response. I tried the case with no JSON parsing and no sink (DiscardingSink). The throughput was 8,228 msg/sec, slightly better than the JSON + Elasticsearch case. I also tried using socketTextStream instead of FlinkKafkaConsumer; in that case, the result was 60,000 msg/sec with jus

Re: Flink streaming throughput

2016-03-08 Thread Aljoscha Krettek
Hi, Another interesting test would be a combination of 3) and 2), i.e. no JSON parsing and no sink. This would show what the raw throughput can be before being slowed down by writing to Elasticsearch. Also, .print() is not feasible for production since it just prints every element to the st

Re: Flink streaming throughput

2016-03-08 Thread おぎばやしひろのり
Stephan, Sorry for the delay in my response. I tried the 3 cases you suggested. This time, I set parallelism to 1 for simplicity. 0) base performance (same as the first e-mail): 1,480msg/sec 1) Disable checkpointing : almost same as 0) 2) No ES sink. just print() : 1,510msg/sec 3) JSON to TSV : 8,000
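
To put the figures above in proportion, here is a small worked calculation of each case relative to the base run. It is a sketch under stated assumptions: case 1) is left out because no number is given for it, and the last figure is cut off in this archive after "8,000", so treating it as 8,000 msg/sec is an assumption.

```python
# Throughput figures quoted in the message, in msg/sec.
# The "3) JSON to TSV" value is truncated in the archive; 8000 msg/sec is assumed.
BASELINE = "0) base"
results = {
    "0) base": 1480,
    "2) no ES sink, print()": 1510,
    "3) JSON to TSV": 8000,
}


def speedups(measurements, baseline_key):
    """Return each measurement divided by the baseline measurement."""
    base = measurements[baseline_key]
    return {name: rate / base for name, rate in measurements.items()}


def report(measurements, baseline_key):
    """Print one line per case with its ratio to the baseline."""
    for name, ratio in speedups(measurements, baseline_key).items():
        print(f"{name}: {ratio:.2f}x")
```

Under that assumption, dropping the Elasticsearch sink changes throughput by about 2%, while replacing JSON with TSV gives roughly a 5.4x increase, which points at parsing rather than the sink.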

Re: Flink streaming throughput

2016-02-26 Thread おぎばやしひろのり
Stephan, Thank you for your quick response. I will try and post the result later. Regards, Hironori 2016-02-26 19:45 GMT+09:00 Stephan Ewen : > Hi! > > I would try and dig bit by bit into what the bottleneck is: > > 1) Disable the checkpointing, see what difference that makes > 2) Use a dummy

Re: Flink streaming throughput

2016-02-26 Thread Stephan Ewen
Hi! I would try and dig bit by bit into what the bottleneck is: 1) Disable the checkpointing, see what difference that makes 2) Use a dummy sink (discarding) rather than Elasticsearch, to see if that is limiting 3) Check the JSON parsing. Many JSON libraries are very CPU intensive and easily
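
Step 3) can be checked in isolation by timing the parser alone, outside the streaming job. The sketch below illustrates the idea with Python's standard json module against a plain tab split. The record layout (host, status, bytes) and all function names are hypothetical, and Python timings say nothing about the JVM JSON library used in the actual job; only the approach carries over.

```python
import json
import time

FIELDS = ("host", "status", "bytes")  # hypothetical record layout


def parse_json(line):
    """Parse one JSON object and return its fields as a tuple of strings."""
    record = json.loads(line)
    return tuple(str(record[name]) for name in FIELDS)


def parse_tsv(line):
    """Split one tab-separated line into a tuple of strings."""
    return tuple(line.rstrip("\n").split("\t"))


def time_parser(parser, lines):
    """Return the wall-clock seconds spent parsing every line once."""
    start = time.perf_counter()
    for line in lines:
        parser(line)
    return time.perf_counter() - start


def make_samples(n):
    """Build n equivalent records encoded as JSON lines and as TSV lines."""
    rows = [("web%d" % (i % 10), "200", str(i)) for i in range(n)]
    json_lines = [json.dumps(dict(zip(FIELDS, row))) for row in rows]
    tsv_lines = ["\t".join(row) for row in rows]
    return json_lines, tsv_lines
```

Calling time_parser(parse_json, json_lines) and time_parser(parse_tsv, tsv_lines) on the output of make_samples gives two durations to compare; if the parser alone cannot keep up with the target message rate on one core, the job cannot either.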

Flink streaming throughput

2016-02-26 Thread おぎばやしひろのり
Hello, I started evaluating Flink and tried a simple performance test. The result was just about 4000 messages/sec with 300% CPU usage. I think this is quite low and am wondering whether it is a reasonable result. If someone could check it, it would be great. Here are the details: [servers] - 3 Kafka broker