Re: Flume benchmarks
I've got a pretty well-resourced pre-production environment that might do the trick quite nicely.

-- Chris Horrocks

On Thu, Oct 13, 2016 at 4:55 pm, Lior Zeno <'liorz...@gmail.com'> wrote:

> I think that we can come up with an initial version with little effort.
> The simplest scenario I can think of is running a Flume instance (with a
> SeqGen source and a Null sink) for one minute, and then reporting the
> average events per second.
Re: Flume benchmarks
You may want to take a look at

- https://cwiki.apache.org/confluence/display/FLUME/Performance+Measurements+-+round+2

and the older

- https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements

when coming up with a list of configurations to benchmark.

-roshan

On 10/13/16, 9:12 AM, "Balazs Donat Bessenyei" wrote:

>I have just proposed enabling Travis on a different thread. That should
>help with this. (Having a separate machine would be best, but I don't know
>how we could get one. I'll do the homework for this.)
Re: Flume benchmarks
I have just proposed enabling Travis on a different thread. That should help with this. (Having a separate machine would be best, but I don't know how we could get one. I'll do the homework for this.)

On Oct 13, 2016 5:57 PM, "Lior Zeno" wrote:

> Maybe getting an isolated environment? The CI environment might be shared
> among multiple users, adding too much noise to the performance test.
Re: Flume benchmarks
Maybe getting an isolated environment? The CI environment might be shared among multiple users, adding too much noise to the performance test.

On Thu, Oct 13, 2016 at 6:53 PM, Balazs Donat Bessenyei <bes...@cloudera.com> wrote:

> +1
>
> I think this is a good idea!
>
> How can I help with setting it up?
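
One way to quantify the noise concern above is to repeat the same benchmark several times and look at the spread of the results. The following is only a minimal sketch: the function name `summarize_runs` and the sample throughput values are hypothetical and are not measurements from any Flume run.

```python
from statistics import mean, median, stdev


def summarize_runs(events_per_second):
    """Summarize repeated runs of one benchmark (hypothetical helper).

    Returns the median, the mean, and the coefficient of variation
    (sample standard deviation divided by the mean). A large coefficient
    of variation suggests the environment is too noisy to compare nights.
    """
    if len(events_per_second) < 2:
        raise ValueError("need at least two runs to estimate spread")
    avg = mean(events_per_second)
    return {
        "median": median(events_per_second),
        "mean": avg,
        "cv": stdev(events_per_second) / avg,
    }


# Illustrative values only, not real measurements.
runs = [100_000.0, 110_000.0, 90_000.0]
summary = summarize_runs(runs)
print(summary)
```

Reporting the median rather than a single run makes a nightly number less sensitive to one disturbed run on a shared machine.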
Re: Flume benchmarks
I think that we can come up with an initial version with little effort. The simplest scenario I can think of is running a Flume instance (with a SeqGen source and a Null sink) for one minute, and then reporting the average events per second.

On Thu, Oct 13, 2016 at 6:43 PM, Attila Simon wrote:

> Good idea! What would be required to set up something similar for Flume?
> I.e., the initial time cost for setting up the infrastructure and the
> periodic time cost to add new use-cases.
>
> Cheers,
> Attila
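
The "average events per second" figure from the scenario above could be computed from two readings of a Flume counter taken one minute apart. This is a rough sketch under stated assumptions: it assumes the agent is started with Flume's JSON metrics reporting enabled, that the channel between the SeqGen source and the Null sink is named `c1` (a hypothetical name), and that its `EventTakeSuccessCount` counter is a fair proxy for events delivered to the sink. The payloads below are hand-written examples, not real output.

```python
import json


def taken_events(metrics_json, channel_key):
    """Read a channel's EventTakeSuccessCount from a JSON metrics payload."""
    metrics = json.loads(metrics_json)
    # Counter values are reported as strings, so convert explicitly.
    return int(metrics[channel_key]["EventTakeSuccessCount"])


def average_events_per_second(start_count, end_count, elapsed_seconds):
    """Average throughput between two counter readings."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (end_count - start_count) / elapsed_seconds


# Hand-written example payloads (assumed shape, not real measurements).
START = '{"CHANNEL.c1": {"EventTakeSuccessCount": "0"}}'
END = '{"CHANNEL.c1": {"EventTakeSuccessCount": "6000000"}}'

rate = average_events_per_second(
    taken_events(START, "CHANNEL.c1"),
    taken_events(END, "CHANNEL.c1"),
    60.0,  # the one-minute run proposed above
)
print(f"{rate:.0f} events/s")
```

Taking the difference of two readings, rather than the final count alone, keeps agent start-up time out of the measured window.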
Re: Flume benchmarks
+1

I think this is a good idea!

How can I help with setting it up?

On Oct 13, 2016 5:20 PM, "Lior Zeno" wrote:

> Hi All,
>
> Monitoring Flume's performance over time is an important step in every
> production-level application. Benchmarking Flume on a nightly basis has
> the following advantages:
>
> * Better understanding of Flume's bottlenecks.
> * Allow users to compare the performance of different solutions, such as
> Logstash and Fluentd.
> * Better understanding of the influence of recent commits on performance.
>
> Logstash already conducts various performance tests, more details in this
> link:
> http://logstash-benchmarks.elastic.co/
>
> I propose adding a few micro-benchmarks showing Flume's TPS vs date (of
> course, in the ideal case where the input and/or output do not bottleneck
> the system), e.g. using the SeqGen source.
>
> Thoughts?
>
> Thanks
Re: Flume benchmarks
Good idea! What would be required to set up something similar for Flume? I.e., the initial time cost for setting up the infrastructure and the periodic time cost to add new use-cases.

Cheers,
Attila

On Thu, Oct 13, 2016 at 5:19 PM, Lior Zeno wrote:

> Hi All,
>
> Monitoring Flume's performance over time is an important step in every
> production-level application. Benchmarking Flume on a nightly basis has
> the following advantages:
>
> * Better understanding of Flume's bottlenecks.
> * Allow users to compare the performance of different solutions, such as
> Logstash and Fluentd.
> * Better understanding of the influence of recent commits on performance.
>
> Logstash already conducts various performance tests, more details in this
> link:
> http://logstash-benchmarks.elastic.co/
>
> I propose adding a few micro-benchmarks showing Flume's TPS vs date (of
> course, in the ideal case where the input and/or output do not bottleneck
> the system), e.g. using the SeqGen source.
>
> Thoughts?
>
> Thanks
Flume benchmarks
Hi All,

Monitoring Flume's performance over time is an important step in every production-level application. Benchmarking Flume on a nightly basis has the following advantages:

* Better understanding of Flume's bottlenecks.
* Allows users to compare Flume's performance with that of other solutions, such as Logstash and Fluentd.
* Better understanding of the influence of recent commits on performance.

Logstash already conducts various performance tests; more details are at this link:
http://logstash-benchmarks.elastic.co/

I propose adding a few micro-benchmarks showing Flume's TPS vs. date (of course, in the ideal case where the input and/or output do not bottleneck the system), e.g. using the SeqGen source.

Thoughts?

Thanks
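
To illustrate how a "TPS vs. date" series could expose the influence of recent commits, here is a small sketch that compares the latest nightly result against the median of the preceding nights. Everything in it is an assumption made for illustration: the function names, the 7-night window, the 10% tolerance, and the sample figures, which are not real Flume results.

```python
from datetime import date
from statistics import median


def regression_ratio(history, latest_tps, window=7):
    """Latest TPS divided by the median TPS of the last `window` nights."""
    recent = [tps for _, tps in history[-window:]]
    if not recent:
        raise ValueError("history is empty")
    return latest_tps / median(recent)


def is_regression(history, latest_tps, tolerance=0.10, window=7):
    """True when the latest result is more than `tolerance` below the baseline."""
    return regression_ratio(history, latest_tps, window) < 1.0 - tolerance


# Illustrative (date, events per second) pairs, not real measurements.
history = [
    (date(2016, 10, 10), 100_000.0),
    (date(2016, 10, 11), 102_000.0),
    (date(2016, 10, 12), 98_000.0),
]

print(is_regression(history, 85_000.0))  # 15% below the median baseline
print(is_regression(history, 97_000.0))  # 3% below the median baseline
```

Using a median baseline over several nights, instead of only the previous night, reduces false alarms from a single noisy run.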