[ https://issues.apache.org/jira/browse/FLUME-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062513#comment-15062513 ]
Roshan Naik commented on FLUME-2801:
------------------------------------

Thanks [~iijima_satoshi] for the review. I'm running tests and will commit soon.

> Performance improvement on TailDir source
> -----------------------------------------
>
> Key: FLUME-2801
> URL: https://issues.apache.org/jira/browse/FLUME-2801
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: v1.7.0
> Reporter: Jun Seok Hong
> Assignee: Jun Seok Hong
> Fix For: v1.7.0
>
> Attachments: FLUME-2801-1.patch, FLUME-2801-2.patch, FLUME-2801.patch
>
>
> This is a proposal for a performance improvement to the new tailing source (FLUME-2498).
> The Taildir source reads a file 1 byte at a time, so its performance is very low compared
> to tailing with the exec source.
> I tested many ways to improve performance and implemented the best one.
> Changes:
> * Read the file in 8K blocks instead of 1 byte at a time.
> * Use byte[] for handling data instead of
> ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance.
> * Don't convert byte[] to String and vice versa.
> Simple file reading test results:
> {quote}
> File size: 100 MB,
> Line size: 500 bytes
> Estimated time to read the file:
> |Reading 1 byte (using the code in Taildir)|32544 ms|
> |Reading 8K block|431 ms|
> {quote}
> In testing on Flume, it catches up with the performance of tailing with the exec source
> (30x performance boost).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
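
For readers who want to see the two read strategies from the issue description side by side, here is a minimal sketch. It is not the code from the attached patches. The class name `BlockReadSketch`, both method names, and the use of `ByteArrayOutputStream` to carry a partial line across blocks are assumptions made for illustration. Only the 8K block size and the idea of keeping line data as byte[] without String conversion come from the description above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the FLUME-2801 patches.
class BlockReadSketch {
    // 8K block size, as named in the issue description.
    static final int BLOCK_SIZE = 8192;

    // Baseline: one unbuffered read() call per byte.
    static List<byte[]> readLinesByByte(Path file) throws IOException {
        List<byte[]> lines = new ArrayList<>();
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            int b;
            while ((b = raf.read()) != -1) {
                if (b == '\n') {
                    lines.add(line.toByteArray());
                    line.reset();
                } else {
                    line.write(b);
                }
            }
        }
        // Simplification: a trailing line without '\n' is emitted as well.
        // A real tailing source would more likely wait for the newline.
        if (line.size() > 0) {
            lines.add(line.toByteArray());
        }
        return lines;
    }

    // Block variant: one read call per 8K block, newline scan done in memory.
    static List<byte[]> readLinesByBlock(Path file) throws IOException {
        List<byte[]> lines = new ArrayList<>();
        // Holds the part of a line that spans a block boundary.
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        byte[] block = new byte[BLOCK_SIZE];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            int n;
            while ((n = raf.read(block, 0, block.length)) != -1) {
                int start = 0;
                for (int i = 0; i < n; i++) {
                    if (block[i] == '\n') {
                        pending.write(block, start, i - start);
                        lines.add(pending.toByteArray());
                        pending.reset();
                        start = i + 1;
                    }
                }
                // Keep the unterminated remainder for the next block.
                pending.write(block, start, n - start);
            }
        }
        // Same simplification as above for a trailing partial line.
        if (pending.size() > 0) {
            lines.add(pending.toByteArray());
        }
        return lines;
    }
}
```

Both methods return the same lines as raw byte[] values, so they can be timed against each other on the same file. The difference is only in how many read calls reach the file: one per byte in the first method, one per 8K block in the second.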