I am new to Flume and have just set up a distributed scenario. I have one
agent, one collector, and one master, all on different hardware.

The agent currently uses tailDir on a log directory, and the data is sunk
into Cassandra. I am generating 1000 log entries per second, each 150 bytes
in size.

I have vnStat running on the agent box, and the Ethernet bandwidth leaving
the agent for the collector never goes over 2mb/sec. My goal is to get the
1000 log entries per second written in near real time to the collector, which
then inserts them into Cassandra.

Right now Cassandra is yawning under the load and I am not sure where the data
is being throttled. If I create log entries for 1 minute and then turn off
additional writes, the agent takes at least 10 minutes to process through all
of those entries.
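
To put rough numbers on that, here is a back-of-the-envelope sketch in Python.
The only inputs are the figures quoted above. The function and variable names
are made up for illustration, the byte counts cover log payload only (no
protocol overhead), and I am assuming the 10 minutes is counted over the whole
1-minute backlog.

```python
# Figures taken from the description above; nothing here is measured.
ENTRIES_PER_SEC = 1000     # log entries generated per second
ENTRY_BYTES = 150          # size of one entry, payload only
GENERATE_SECS = 60         # entries are written for 1 minute
DRAIN_SECS = 10 * 60       # agent needs at least 10 minutes to catch up


def offered_bytes_per_sec(entries_per_sec, entry_bytes):
    """Payload rate the agent would have to ship to keep up in real time."""
    return entries_per_sec * entry_bytes


def drain_entries_per_sec(entries_per_sec, generate_secs, drain_secs):
    """Average rate at which the backlog is actually being worked off."""
    backlog_entries = entries_per_sec * generate_secs
    return backlog_entries / drain_secs


offered = offered_bytes_per_sec(ENTRIES_PER_SEC, ENTRY_BYTES)
drain = drain_entries_per_sec(ENTRIES_PER_SEC, GENERATE_SECS, DRAIN_SECS)

print(f"offered load : {offered / 1000:.0f} KB/s "
      f"({offered * 8 / 1e6:.1f} Mbit/s)")
print(f"drain rate   : at most {drain:.0f} entries/s "
      f"({drain * ENTRY_BYTES / 1000:.0f} KB/s)")
```

Under those assumptions the offered load is 150 KB/s (1.2 Mbit/s), while the
agent is draining at no more than about 100 entries per second, roughly a
tenth of the rate at which the entries are generated.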

Thoughts?

Trevor Francis

