Just want to make sure I understand the setup:

1. 9 Hadoop servers that were fed the data
2. 1 server that generated the syslog data, which was spread across
the 6 Flume agent servers
3. 6 Flume agent servers that collected data in memory and flushed to the
9 Hadoop servers

Is that right?
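
To make item 2 concrete, here is a rough sketch of how I picture the one
generator box fanning syslog lines out over the six agents in round-robin
order. This is only my guess at the shape of it, not Mike's flume-load-gen
code: the hostnames, port 5140, the fixed timestamp, and all function names
are made up for illustration.

```python
import itertools
import socket

# Hypothetical agent addresses; the thread gives no hostnames or ports.
AGENTS = [("flume-agent-%d.example.com" % i, 5140) for i in range(1, 7)]


def syslog_line(seq, facility=1, severity=6, host="loadgen",
                tag="loadtest", timestamp="May  8 01:49:00"):
    """Build one RFC 3164-style line; PRI is facility * 8 + severity."""
    pri = facility * 8 + severity
    return "<%d>%s %s %s: event %d\n" % (pri, timestamp, host, tag, seq)


def assign(num_events, agents=AGENTS):
    """Pair each generated line with an agent, cycling through the agents."""
    lines = (syslog_line(i) for i in range(num_events))
    return zip(itertools.cycle(agents), lines)


def send(num_events, agents=AGENTS):
    """Open one TCP connection per agent and write the assigned lines."""
    conns = {agent: socket.create_connection(agent) for agent in agents}
    try:
        for agent, line in assign(num_events, agents):
            conns[agent].sendall(line.encode("ascii"))
    finally:
        for conn in conns.values():
            conn.close()
```

Since each line carries a sequence number, counting and de-duplicating them
on the HDFS side would be one way to check that nothing was lost, if that
matches what the tests did.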

On Tue, May 8, 2012 at 1:49 AM, Jarek Jarcec Cecho <jar...@apache.org> wrote:

> Thanks Mike,
> this is indeed very helpful!
>
> Jarcec
>
> On Mon, May 07, 2012 at 06:55:49PM -0700, Mike Percy wrote:
> > Hi folks,
> > Will McQueen and I have been doing some Flume NG stress and performance
> > testing, and we wanted to share some of our recent findings. The focus of
> > the most recent tests has been on the syslog TCP source, memory channel,
> > and HDFS sink.
> >
> > I wrote some software to generate load in syslog format over TCP and to
> > automate some of the analysis. The first thing we wanted to verify is that
> > no data was lost during these tests (a.k.a. correctness), with a close
> > second priority being of course throughput (performance). I used Pig and
> > AvroStorage from piggybank in the data integrity analysis, and committed
> > the compiled (0.11 trunk) piggybank jar so the load analysis scripts would
> > be relatively easy to use. It seems to be compatible with Pig 0.8.1. I am a
> > little wary of having to maintain that type of thing at the Apache org
> > level so for now I have checked all the code in on GitHub under an ASL 2.0
> > license:
> >
> > https://github.com/mpercy/flume-load-gen
> >
> > I have created a Wiki page with the performance metrics we have come up
> > with so far. The executive summary is that at the time of this writing, we
> > have observed Flume NG on a single machine processing events at a
> > throughput rate of 70,000+ events/sec with no data loss.
> >
> > https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements
> >
> > I have put more details on the wiki page itself. Please let me know if
> > you want me to add more detail. I'll be looking into improving the
> > performance of these components going forward; however, we wanted to post
> > these results to set a public performance baseline of Flume NG.
> >
> > If others have done performance testing, we would love to see your
> > results if you can post the details.
> >
> > Regards,
> > Mike
> >
>
