[ 
https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062513#comment-15062513
 ] 

Roshan Naik commented on FLUME-2801:
------------------------------------

Thanks [~iijima_satoshi] for the review. I'm running the tests and will commit soon.

> Performance improvement on TailDir source
> -----------------------------------------
>
>                 Key: FLUME-2801
>                 URL: https://issues.apache.org/jira/browse/FLUME-2801
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: v1.7.0
>            Reporter: Jun Seok Hong
>            Assignee: Jun Seok Hong
>             Fix For: v1.7.0
>
>         Attachments: FLUME-2801-1.patch, FLUME-2801-2.patch, FLUME-2801.patch
>
>
> This is a proposal for improving the performance of the new tailing source (FLUME-2498).
> Taildir source reads a file 1 byte at a time, so its performance is very low compared 
> to tailing with the exec source.
> I tested lots of ways to improve performance and implemented the best one.
> Changes:
> * Read the file in 8K blocks instead of 1 byte at a time.
> * Use byte[] for handling data instead of 
> ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance.
> * Don't convert byte[] to String and vice versa.
> Simple file reading test results:
> {quote}
>  File size: 100 MB
>  Line size: 500 bytes
> Estimated time to read the file:
> |Reading 1 byte at a time (using the code in Taildir)|32544 ms|
> |Reading 8K blocks|431 ms|
> {quote}
> When tested in Flume, it catches up with the performance of tailing with the exec source 
> (a 30x performance boost).
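
For illustration, the block-reading change described above could look roughly like the sketch below. It is not the code from the attached patches: the class name BlockLineReader, the method readLines, the helper concat, and the choice to leave a trailing partial line unread are all assumptions made for this example. It reads the file in 8K blocks, splits on '\n', and keeps each line as a raw byte[] without converting to String.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and structure are not taken from the FLUME-2801 patches.
class BlockLineReader {
    static final int BLOCK_SIZE = 8192; // 8K block, as in the proposal

    // Appends src[off, off+len) to head and returns the result as a new array.
    static byte[] concat(byte[] head, byte[] src, int off, int len) {
        byte[] out = Arrays.copyOf(head, head.length + len);
        System.arraycopy(src, off, out, head.length, len);
        return out;
    }

    // Returns every complete '\n'-terminated line found at or after pos, as raw bytes.
    // Bytes after the last newline are not returned (assumption: a tailing
    // reader would pick them up on a later pass, once the line is complete).
    static List<byte[]> readLines(Path file, long pos, int blockSize) throws IOException {
        List<byte[]> lines = new ArrayList<>();
        byte[] pending = new byte[0];        // start of a line that spans blocks
        byte[] block = new byte[blockSize];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(pos);
            int n;
            while ((n = raf.read(block, 0, block.length)) != -1) {
                int start = 0;
                for (int i = 0; i < n; i++) {
                    if (block[i] == '\n') {
                        lines.add(concat(pending, block, start, i - start));
                        pending = new byte[0];
                        start = i + 1;
                    }
                }
                // Carry the unfinished tail of this block over to the next one.
                pending = concat(pending, block, start, n - start);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("taildir-sketch", ".log");
        try {
            Files.write(tmp, "first\nsecond\npartial".getBytes(StandardCharsets.UTF_8));
            for (byte[] line : readLines(tmp, 0L, BLOCK_SIZE)) {
                // Conversion to String happens only here, for display.
                System.out.println(new String(line, StandardCharsets.UTF_8));
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The point of the sketch is that the per-byte work happens on an in-memory array, so the number of read calls drops from one per byte to one per block.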



