Gopal V created HIVE-4423:
-----------------------------
Summary: Improve RCFile::sync(long) 10x
Key: HIVE-4423
URL: https://issues.apache.org/jira/browse/HIVE-4423
Project: Hive
Issue Type: Improvement
Environment: Ubuntu LXC (1 SSD, 1 disk, 32 gigs of RAM)
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
Fix For: 0.11.0
RCFile::sync(long) takes approximately 1 second every time it is called because
of the byte-at-a-time inner loops in the function.
From what was observed with HDFS-4710, single-byte reads are an order of
magnitude slower than larger 512-byte buffer reads.
Even when disk I/O is buffered to this size, each single-byte read still pays
the overhead of the synchronized read() methods in the BlockReaderLocal and
RemoteBlockReader classes.
Replacing the readByte() calls in RCFile.sync(long) with a readFully() call
into a 512-byte buffer should speed this function up by more than 10x.
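
A minimal sketch of the proposed approach, not the actual patch: scan for a
fixed byte marker by reading 512-byte chunks instead of one byte per call.
The class name SyncScanSketch, the methods findMarker and readUpTo, and the
plain InputStream are hypothetical stand-ins for illustration; the real
RCFile.sync(long) works on an HDFS stream and its own sync marker format. A
read loop is used here in place of readFully() so that a short final chunk
at end of stream does not throw.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SyncScanSketch {
    // Chunk size taken from the issue description (512-byte buffer reads).
    static final int CHUNK = 512;

    /**
     * Returns the offset (relative to where the stream started) of the first
     * occurrence of marker, or -1 if the stream ends without a match.
     */
    static long findMarker(InputStream in, byte[] marker) throws IOException {
        int m = marker.length;
        if (m == 0) {
            throw new IllegalArgumentException("marker must not be empty");
        }
        // Room for one chunk plus the tail carried over from the previous
        // chunk, so a marker straddling a chunk boundary is still found.
        byte[] buf = new byte[CHUNK + m - 1];
        int carried = 0; // bytes kept from the previous iteration
        long base = 0;   // stream offset corresponding to buf[0]
        while (true) {
            int n = readUpTo(in, buf, carried, CHUNK);
            if (n == 0) {
                return -1; // end of stream, no marker found
            }
            int filled = carried + n;
            for (int i = 0; i + m <= filled; i++) {
                if (matchesAt(buf, i, marker)) {
                    return base + i;
                }
            }
            // Keep the last (m - 1) bytes: they may begin a marker that
            // completes in the next chunk.
            int keep = Math.min(m - 1, filled);
            System.arraycopy(buf, filled - keep, buf, 0, keep);
            base += filled - keep;
            carried = keep;
        }
    }

    // Reads up to len bytes into buf at off; returns the count read (0 at EOF).
    static int readUpTo(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        int total = 0;
        while (total < len) {
            int r = in.read(buf, off + total, len - total);
            if (r < 0) {
                break;
            }
            total += r;
        }
        return total;
    }

    static boolean matchesAt(byte[] buf, int pos, byte[] marker) {
        for (int j = 0; j < marker.length; j++) {
            if (buf[pos + j] != marker[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] marker = new byte[16];
        for (int i = 0; i < marker.length; i++) {
            marker[i] = (byte) (i + 1);
        }
        byte[] data = new byte[2000];
        System.arraycopy(marker, 0, data, 1000, marker.length);
        System.out.println(findMarker(new ByteArrayInputStream(data), marker));
    }
}
```

With this shape, the stream's synchronized read() is entered once per
512-byte chunk rather than once per byte, which is the saving this issue
is after.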