Have you looked at the Cassandra stats to make sure that neither Cassandra nor disk I/O is the bottleneck?
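As a rough sanity check on the figures in the quoted message below, here is a minimal back-of-envelope sketch. It uses only the numbers you gave (1000 entries per second, 150 bytes each, 1 minute of writes, at least 10 minutes to drain). The function and constant names are made up for illustration, and the sketch ignores Flume and network protocol overhead.

```python
# Figures taken from the quoted message; names here are illustrative only.
ENTRIES_PER_SEC = 1000      # log entries generated per second
ENTRY_BYTES = 150           # size of each entry in bytes
BURST_SECONDS = 60          # entries were generated for 1 minute
DRAIN_SECONDS = 10 * 60     # "at least 10 minutes", so this is a lower bound


def payload_rate_bytes(entries_per_sec, entry_bytes):
    """Raw log payload produced per second, before any protocol overhead."""
    return entries_per_sec * entry_bytes


def effective_entries_per_sec(entries_per_sec, burst_seconds, drain_seconds):
    """Average rate at which the generated burst was actually processed."""
    total_entries = entries_per_sec * burst_seconds
    return total_entries / drain_seconds


payload = payload_rate_bytes(ENTRIES_PER_SEC, ENTRY_BYTES)
drain_rate = effective_entries_per_sec(ENTRIES_PER_SEC, BURST_SECONDS, DRAIN_SECONDS)

print("payload bytes/sec:", payload)
print("payload Mbit/sec:", payload * 8 / 1_000_000)
print("effective entries/sec:", drain_rate)
```

With these figures the raw payload is about 150 KB/s (1.2 Mbit/s), while the drain rate works out to roughly 100 entries per second at most, about a tenth of the rate at which the entries are generated. That gap is what needs explaining, whichever stage turns out to be responsible.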
On Thu, Apr 19, 2012 at 12:44 AM, Trevor Francis <trevor.fran...@tgrahamcapital.com> wrote:

> I am new to Flume and just set up a distributed scenario. I have one agent,
> one collector, and one master, all on different hardware.
>
> I am currently having the agent taildir a log directory and I am sinking
> the data into Cassandra. I am generating 1000 log entries per second, with
> entries 150 bytes in size.
>
> I have VNstat running on the agent box and the ethernet bandwidth never
> goes over 2mb/sec leaving the agent to the collector. My goal was to get
> the 1000 log entries per second to write in near realtime to the collector,
> which then inserts them into Cassandra.
>
> Right now Cassandra is yawning under the load and I am not sure where the
> data is being throttled. I can create log entries for 1 minute, then turn
> off additional writes. The agent takes at least 10 minutes to process
> through all of those entries.
>
> Thoughts?
>
> Trevor Francis