[ https://issues.apache.org/jira/browse/FLUME-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410624#comment-13410624 ]
Patrick Wendell edited comment on FLUME-1361 at 7/10/12 5:52 PM:
-----------------------------------------------------------------
Hey Juhani,
Yep - you've got it. The ideal setup for a FileChannel would be either:
1) Using a dedicated disk for Flume and flushing to disk on every event.
or
2) Using a shared disk for Flume and batching disk syncs to prevent excess
seeking.
The first case is similar to using a WAL: frequent seeks, but on a dedicated
disk, so you can still get high throughput. If you try to use FileChannel with
a shared disk and you are syncing on every event, throughput is going to be bad.
So I'd expect adding batching to give better throughput, and it sounds like it
does.
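To make the sync-per-event versus sync-per-batch difference concrete, here is a minimal sketch. It is not Flume code: it uses the JDK's java.nio.channels.FileChannel (unrelated to Flume's FileChannel despite the name), and the class name, method name and batchSize parameter are made up for illustration. It only counts how many times the file is forced to disk for a given batch size.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from Flume.
class SyncBatchingSketch {

    /**
     * Appends every event to a temporary file, forcing the file to disk once
     * per batchSize events, and returns the number of syncs issued.
     */
    static int writeAll(List<String> events, int batchSize) throws IOException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        Path log = Files.createTempFile("sync-batching-sketch", ".log");
        int syncs = 0;
        int pending = 0;
        try (FileChannel out = FileChannel.open(log, StandardOpenOption.WRITE)) {
            for (String event : events) {
                ByteBuffer buf =
                        ByteBuffer.wrap((event + "\n").getBytes(StandardCharsets.UTF_8));
                while (buf.hasRemaining()) {
                    out.write(buf);
                }
                pending++;
                if (pending == batchSize) {
                    out.force(false); // the expensive step: flush file content to disk
                    syncs++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                out.force(false); // sync the final partial batch
                syncs++;
            }
        } finally {
            Files.deleteIfExists(log);
        }
        return syncs;
    }

    public static void main(String[] args) throws IOException {
        List<String> events = Collections.nCopies(10, "event");
        System.out.println("batchSize=1 -> syncs: " + writeAll(events, 1));
        System.out.println("batchSize=4 -> syncs: " + writeAll(events, 4));
    }
}
```

With 10 events this issues 10 syncs at batchSize 1 and 3 syncs at batchSize 4. On a shared disk each sync is where the extra seeking would come in, so fewer syncs per event is the whole point of batching.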
One question is whether batching should happen as part of the source or whether
it should be a first-class feature of the channel, since people will have this
problem with other types of sources (e.g. syslog source) whenever they want to
do durable writes at high throughput.
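As a rough sketch of what the channel-side option could look like, the code below puts a small batching front in front of a durable channel, so that any source can hand over single events. Everything here is hypothetical (DurableChannel, BatchingFront, commitBatch); it is not Flume's API, and it is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these types exist in Flume.
class ChannelSideBatchingSketch {

    /** Stand-in for a durable channel: one call is one transaction, i.e. one sync. */
    interface DurableChannel {
        void commitBatch(List<String> events);
    }

    /** Accepts single events from any source and commits them batchSize at a time. */
    static final class BatchingFront {
        private final DurableChannel channel;
        private final int batchSize;
        private final List<String> buffer = new ArrayList<>();

        BatchingFront(DurableChannel channel, int batchSize) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            this.channel = channel;
            this.batchSize = batchSize;
        }

        /** Buffers one event; commits as soon as a full batch has accumulated. */
        void put(String event) {
            buffer.add(event);
            if (buffer.size() >= batchSize) {
                flush();
            }
        }

        /** Commits whatever is buffered, e.g. on shutdown or on a timer. */
        void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            channel.commitBatch(new ArrayList<>(buffer)); // hand over a copy
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        BatchingFront front = new BatchingFront(
                batch -> System.out.println("committed " + batch.size() + " events"), 3);
        for (int i = 0; i < 7; i++) {
            front.put("event-" + i);
        }
        front.flush(); // commit the trailing partial batch
    }
}
```

One trade-off to keep in mind with this approach: an event that has been put but not yet flushed is not durable, so the source cannot treat put() returning as an acknowledgement unless the front blocks until the batch commits.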
> Add event batching to ExecSource
> --------------------------------
>
> Key: FLUME-1361
> URL: https://issues.apache.org/jira/browse/FLUME-1361
> Project: Flume
> Issue Type: Improvement
> Reporter: Juhani Connolly
> Assignee: Juhani Connolly
>
> Add a configuration option for the number of items to send to the channel in
> a single transaction.
> This will help a lot with FileChannel, which needs to fsync on every commit.