GitHub user turcsanyip opened a pull request:
https://github.com/apache/flume/pull/240
FLUME-3101 Add maxBatchCount config property to Taildir Source.
If there are multiple files in the path(s) that need to be tailed and one of
them is written to at a high frequency, then Taildir can read a full batch of
batchSize events from that file every time. This can lead to an endless loop
in which Taildir only reads data from the busy file, while the other files
are not processed.
Another problem is that in this case TaildirSource also becomes unresponsive
to stop requests.
This commit handles this situation by introducing a new config property
called maxBatchCount. It controls the number of batches read consecutively
from the same file. After reading maxBatchCount batches from a file, Taildir
switches to another file or takes a break in the processing.
This change is based on hunshenshi's patch.
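The effect of the cap can be illustrated with a small, self-contained
simulation. This is only a sketch of the idea described above, not the code
from the patch: the class MaxBatchCountSketch, the TailedFile stand-in and the
processOnce method are hypothetical names, and files are modelled as
in-memory queues rather than real tailed files.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MaxBatchCountSketch {

    /** Hypothetical stand-in for a tailed file: a queue of pending events. */
    static final class TailedFile {
        final String name;
        final Deque<String> pending = new ArrayDeque<>();

        TailedFile(String name, int eventCount) {
            this.name = name;
            for (int i = 0; i < eventCount; i++) {
                pending.add(name + "-" + i);
            }
        }

        /** Reads up to batchSize events; a short batch means the file is drained. */
        List<String> readBatch(int batchSize) {
            List<String> batch = new ArrayList<>();
            while (batch.size() < batchSize && !pending.isEmpty()) {
                batch.add(pending.poll());
            }
            return batch;
        }
    }

    /**
     * One pass over all files (assumed shape of the loop, not the real source).
     * Returns how many batches were read from each file in this pass.
     */
    static Map<String, Integer> processOnce(List<TailedFile> files,
                                            int batchSize, long maxBatchCount) {
        Map<String, Integer> batchesPerFile = new LinkedHashMap<>();
        for (TailedFile file : files) {
            long fullBatches = 0;
            int batchesRead = 0;
            while (true) {
                List<String> batch = file.readBatch(batchSize);
                if (batch.isEmpty()) {
                    break; // nothing left to read
                }
                batchesRead++;
                if (batch.size() < batchSize) {
                    break; // partial batch: caught up with the file
                }
                if (++fullBatches >= maxBatchCount) {
                    break; // cap reached: give the other files a turn
                }
            }
            batchesPerFile.put(file.name, batchesRead);
        }
        return batchesPerFile;
    }

    public static void main(String[] args) {
        List<TailedFile> files = new ArrayList<>();
        files.add(new TailedFile("busy", 100));
        files.add(new TailedFile("quiet", 3));
        // batchSize = 10, maxBatchCount = 2
        System.out.println(processOnce(files, 10, 2)); // prints {busy=2, quiet=1}
    }
}
```

In the simulation the busy file is finite, so the loop would end eventually
even without the cap. With a file that keeps growing, every read returns a
full batch and only the maxBatchCount check lets the loop move on to the next
file.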
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/turcsanyip/flume FLUME-3101
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flume/pull/240.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #240
----
commit 8ecb0ed1931d84e00962b996f89c6a5985b9d7c7
Author: turcsanyi <turcsanyi@...>
Date: 2018-11-21T15:06:04Z
FLUME-3101 Add maxBatchCount config property to Taildir Source.
----
---