[
https://issues.apache.org/jira/browse/FLUME-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957347#comment-13957347
]
Hari Shreedharan commented on FLUME-2352:
-----------------------------------------
Do you have any specific data to prove that adding events as one batch helps? I
believe compression codecs actually hold events in a buffer and compress only
when a flush is called or the buffer is full.
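
One way to gather such data is a small standalone comparison. The sketch below
is only an illustration and makes several assumptions: it uses the JDK's
java.util.zip.GZIPOutputStream rather than a Hadoop codec or any Flume class,
the class and method names (BatchCompressionSketch, compressOneByOne,
compressAsBatch) are hypothetical, and the events are synthetic. It writes the
same events to a gzip stream either one write() call per event or as a single
concatenated write, without flushing in between, so the two timings can be
compared.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class BatchCompressionSketch {

    // Hypothetical stand-in for Flume events: short newline-terminated records.
    public static List<byte[]> makeEvents(int count) {
        List<byte[]> events = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            events.add(("event-" + i + "\n").getBytes(StandardCharsets.UTF_8));
        }
        return events;
    }

    // One write() call per event; nothing is flushed until the stream closes.
    public static byte[] compressOneByOne(List<byte[]> events) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            for (byte[] event : events) {
                gzip.write(event);
            }
        }
        return sink.toByteArray();
    }

    // Concatenate all events first, then hand them over in a single write().
    public static byte[] compressAsBatch(List<byte[]> events) throws IOException {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (byte[] event : events) {
            joined.write(event);
        }
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(joined.toByteArray());
        }
        return sink.toByteArray();
    }

    // Helper used to check that both variants encode the same data.
    public static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> events = makeEvents(200000);

        long start = System.nanoTime();
        byte[] single = compressOneByOne(events);
        long oneByOneMillis = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        byte[] batch = compressAsBatch(events);
        long batchMillis = (System.nanoTime() - start) / 1_000_000;

        System.out.println("one by one: " + oneByOneMillis + " ms, " + single.length + " bytes");
        System.out.println("batch:      " + batchMillis + " ms, " + batch.length + " bytes");
    }
}
```

A single run like this is not a rigorous benchmark (no JIT warm-up, no
repetition), and it says nothing about HDFS or the codec actually configured
in the sink, so any numbers from it would only be a starting point for
measurements against the real HDFSCompressedDataStream.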
> HDFSCompressedDataStream should support appendBatch
> ---------------------------------------------------
>
> Key: FLUME-2352
> URL: https://issues.apache.org/jira/browse/FLUME-2352
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.5.0
> Reporter: chenshangan
> Assignee: chenshangan
> Fix For: v1.5.0
>
> Attachments: FLUME-2352.patch
>
>
> Compressing events in a batch is much more efficient than compressing them
> one by one.
> I set hdfs.batchSize to 200000. When I use appendBatch() in BucketWriter, the
> append operation takes less than 1 second, while appending one by one can
> take 10 seconds.