Thanks, Brock, for your thoughts. A few related questions:

1) Is there an out-of-the-box Flume source that can monitor an RDBMS and pick up new rows from it, similar to tail -f on a file?
2) For systems that do not want to persist data to secondary storage, does Flume provide an API for direct integration into the application generating the data? I assume the answer is yes; in that case, is the application itself considered a Flume agent, or does it generate data in a form that another Flume agent can consume?

Thanks & Regards,
MK

KARTHIKEYAN M
Ericsson India Global Services Pvt. Ltd.

-----Original Message-----
From: Brock Noland [mailto:[email protected]]
Sent: Thursday, April 19, 2012 8:50 PM
To: [email protected]
Subject: Re: Flume scalability & performance

One mistake below of consequence.

On Thu, Apr 19, 2012 at 2:44 PM, Brock Noland <[email protected]> wrote:
> Hi,
>
> On Thu, Apr 19, 2012 at 10:04 AM, M. Karthikeyan
> <[email protected]> wrote:
>> I'm trying to choose between Flume and JMS for a data collection
>> framework in our multi-node network.
>> I have the following questions:
>> 1) From a scalability point of view, how does Flume compare with JMS?
>> Are there any numbers that can be referred to?
>> 2) My typical payload for a single message is 2 KB. I expect traffic
>> of approx. 50 million messages/day. The messages are usually of the
>> one-sender, one-receiver type. I require a reasonable level of reliability
>> (at least the store-and-forward mode in Flume & durable/persistent
>> messages in JMS). With these considerations, which will give better
>> performance: Flume or JMS?
>
> All of this is extremely dependent on the implementation of JMS you
> use. JMS is a specification; there are many implementations. Looking
> at your numbers and assuming all the messages come in 8 hours
> (representing peak load), that is about 4 MB/second.
>
> Both Flume and most JMS implementations should be able to handle this
> throughput. The advantage of Flume is really configuration. Purchasing
> and configuring a JMS server and then writing code to interact with
> the JMS server is, IMHO, going to be less work than installing and
> configuring Flume.

I meant to say setting up all that JMS infrastructure is going to be
*more* work than Flume.

Brock
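
For reference, the "about 4 MB/second" estimate in the quoted reply can be reproduced with a quick calculation. This is only a rough sketch: it assumes 1 KB = 1000 bytes and 1 MB = 1,000,000 bytes, takes the 8-hour peak window from the reply, and the function name is made up for illustration.

```python
SECONDS_PER_HOUR = 3600


def peak_load(messages_per_day, payload_bytes, peak_hours):
    """Return (messages/second, MB/second) if a full day's traffic
    arrives within `peak_hours` hours.

    Assumes decimal units: 1 MB = 1,000,000 bytes.
    """
    peak_seconds = peak_hours * SECONDS_PER_HOUR
    msgs_per_s = messages_per_day / peak_seconds
    mb_per_s = msgs_per_s * payload_bytes / 1_000_000
    return msgs_per_s, mb_per_s


# Figures from the thread: 50 million messages/day at 2 KB each,
# all assumed to arrive within an 8-hour peak window.
msgs_per_s, mb_per_s = peak_load(50_000_000, 2_000, 8)
print(f"{msgs_per_s:.0f} messages/s, {mb_per_s:.2f} MB/s")
```

Under these assumptions the result is a little under 3.5 MB/second, which matches the rounded figure quoted above.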
