Hi,

Is anybody still looking into this question?
Should I log it in JIRA so that somebody can look into it later?

Thanks,
Jan

On 11/18/2013 11:28 AM, Jan Van Besien wrote:
> Hi,
>
> Sorry it took me a while to answer this. I compiled a small test case
> using only off-the-shelf Flume components that shows what is going on.
>
> The setup is a single agent with an HTTP source, a null sink and a file
> channel. I am using the default configuration as much as possible.
>
> The test goes as follows:
>
> - start the agent without a sink
> - run a script that sends HTTP requests in multiple threads to the HTTP
>   source (the script simply calls the URL http://localhost:8080/?key=value
>   over and over again, where value is a random string of 100 chars).
> - this script does about 100 requests per second on my machine. I leave
>   it running for a while, so that the file channel contains about 20000
>   events.
> - add the null sink to the configuration (around 11:14:33 in the log).
> - observe the logging of the null sink. You'll see in the log file that
>   it takes more than 10 seconds per 1000 events (until about event 5000,
>   around 11:15:33).
> - stop the HTTP request generating script (i.e. no more writing to the
>   file channel).
> - observe the logging of the null sink: events 5000 until 20000 are all
>   processed within a few seconds.
>
> In the attachment:
> - the flume log
> - thread dumps taken while the ingest was running and the null sink was
>   enabled
> - the config (agent1.conf)
>
> I also tried with more sinks (4), see agent2.conf. The results are the
> same.
>
> Thanks for looking into this,
> Jan
>
>
> On 11/14/2013 05:08 PM, Brock Noland wrote:
>> On Thu, Nov 14, 2013 at 2:50 AM, Jan Van Besien
>> <[email protected]> wrote:
>>
>>     On 11/13/2013 03:04 PM, Brock Noland wrote:
>>     > The file channel uses a WAL which sits on disk. Each time an
>>     > event is committed an fsync is called to ensure that data is
>>     > durable. Without this fsync there is no durability guarantee.
>>     > More details here:
>>     > https://blogs.apache.org/flume/entry/apache_flume_filechannel
>>
>>     Yes indeed. I was just not expecting the performance impact to be
>>     that big.
>>
>>     > The issue is that when the source is committing one-by-one it's
>>     > consuming the disk doing an fsync for each event. I would find a
>>     > way to batch up the requests so they are not written one-by-one
>>     > or use multiple disks for the file channel.
>>
>>     I am already using multiple disks for the channel (4).
>>
>> Can you share your configuration?
>>
>>     Batching the requests is indeed what I am doing to prevent the file
>>     channel from being the bottleneck (using a flume agent with a memory
>>     channel in front of the agent with the file channel), but it
>>     inherently means that I lose end-to-end durability because events
>>     are buffered in memory before being flushed to disk.
>>
>> I would be curious to know, though, whether doubling the sinks would
>> give more time to readers. Could you take three or four thread dumps of
>> the JVM while it's in this state and share them?
>>
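
For reference, a minimal single-agent configuration along the lines of the
test described above would look roughly like this. This is only a sketch:
the component names, port and directories are placeholders, and the
agent1.conf attached to the earlier mail remains the authoritative version.

  agent1.sources = http1
  agent1.channels = ch1
  agent1.sinks = null1

  # HTTP source accepting the generated requests on port 8080
  agent1.sources.http1.type = http
  agent1.sources.http1.port = 8080
  agent1.sources.http1.channels = ch1

  # File channel; the four data directories correspond to the four disks
  # mentioned in the earlier thread (paths are placeholders)
  agent1.channels.ch1.type = file
  agent1.channels.ch1.checkpointDir = /disk1/flume/checkpoint
  agent1.channels.ch1.dataDirs = /disk1/flume/data,/disk2/flume/data,/disk3/flume/data,/disk4/flume/data

  # Null sink that simply discards events (added at ~11:14:33 in the test)
  agent1.sinks.null1.type = null
  agent1.sinks.null1.channel = ch1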

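For completeness, the batching workaround mentioned in the quoted thread (a
memory-channel agent in front of the file-channel agent) would look roughly
like the sketch below. Again, hostnames, ports, capacities and the batch
size are arbitrary placeholders, not the exact values I use.

  # Tier 1: accepts HTTP traffic, buffers in memory, forwards in batches
  # over Avro
  front.sources = http1
  front.channels = mem1
  front.sinks = avro1

  front.sources.http1.type = http
  front.sources.http1.port = 8080
  front.sources.http1.channels = mem1

  front.channels.mem1.type = memory
  front.channels.mem1.capacity = 10000

  front.sinks.avro1.type = avro
  front.sinks.avro1.hostname = localhost
  front.sinks.avro1.port = 4545
  front.sinks.avro1.batch-size = 100
  front.sinks.avro1.channel = mem1

  # Tier 2: receives batches over Avro, so the file channel commits (and
  # fsyncs) once per batch instead of once per event
  back.sources = avro1
  back.channels = file1
  back.sinks = null1

  back.sources.avro1.type = avro
  back.sources.avro1.bind = 0.0.0.0
  back.sources.avro1.port = 4545
  back.sources.avro1.channels = file1

  back.channels.file1.type = file

  back.sinks.null1.type = null
  back.sinks.null1.channel = file1

As discussed in the thread, this trades durability for throughput: events
sitting in the memory channel of the first agent are lost if that agent
dies before they are flushed to the file channel.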