Mark:

If I remember correctly, this comes from the WAL getting into a bad state (I'm assuming you're running E2E). Events keep piling up in the WAL and it never gets drained properly. I don't have the details in front of me, but I vaguely remember this being related to downstream ACK issues (or it could be a remnant thereof). Try looking at the logs on the errant host to confirm that events are being retransmitted and that ACKs are not arriving.
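
To see how fast the WAL is growing on the errant host compared with a healthy one, a small script along these lines can total up the files in the agent's writing and sending directories. This is only a sketch: `summarize_dir` is a made-up helper, not part of Flume, and the directory paths are whatever your agent is configured to use, so they are passed in as arguments rather than assumed.

```python
import os
import sys


def summarize_dir(path):
    """Return (file_count, total_bytes) for regular files directly under path."""
    count = 0
    total = 0
    for entry in os.scandir(path):
        # Only count regular files; skip subdirectories and symlinks.
        if entry.is_file(follow_symlinks=False):
            count += 1
            total += entry.stat(follow_symlinks=False).st_size
    return count, total


if __name__ == "__main__":
    # Hypothetical usage: pass the agent's "writing" and "sending" directories
    # on the command line; their real locations depend on your configuration.
    for wal_dir in sys.argv[1:]:
        files, size = summarize_dir(wal_dir)
        print(f"{wal_dir}: {files} files, {size / 1e6:.1f} MB")
```

Running it a few times, about 10 seconds apart, on the bad box and on one of the good ones should show whether new ~1GB files keep appearing only on the errant host.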
Let us know.

On Thu, Dec 1, 2011 at 10:53 AM, Mark Lewandowski <mark.e.lewandow...@gmail.com> wrote:

> Hi all,
>
> I've got 9 identical boxes configured in my flume cluster. 8 of them are
> working correctly, with exactly the same config (source: tail, sink:
> AutoE2EChain). 1 box is trying to send way too much data through my
> collector. I'm tailing a log file that's currently ~200k, but when I start
> the flume agent and look in the writing and sending directories, I see that
> flume is trying to send files of ~1GB. A new 1GB file is created in this
> directory about every 10s.
>
> Any ideas?
>
> -Mark

--
Eric Sammer
twitter: esammer
data: www.cloudera.com