[ https://issues.apache.org/jira/browse/HIVE-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071330#comment-13071330 ]
John Sichi commented on HIVE-2065:
----------------------------------
This one has been sitting in the Patch Available queue for a while. Are there
any issues that still need to be resolved?
> RCFile issues
> -------------
>
> Key: HIVE-2065
> URL: https://issues.apache.org/jira/browse/HIVE-2065
> Project: Hive
> Issue Type: Bug
> Reporter: Krishna Kumar
> Assignee: Krishna Kumar
> Priority: Minor
> Attachments: HIVE.2065.patch.0.txt, HIVE.2065.patch.1.txt,
> Slide1.png, proposal.png
>
>
> Some potential issues with RCFile:
> 1. Remove unwanted synchronized modifiers on the methods of RCFile. As per
> Yongqiang He, the class is not meant to be thread-safe (and it is not). Might
> as well get rid of the confusing and performance-impacting lock acquisitions.
> 2. Record length is overstated for compressed files. IIUC, the key compression
> happens after we have written the record length, so the record length is
> computed from the uncompressed key length:
> {code}
> int keyLength = key.getSize();
> if (keyLength < 0) {
>   throw new IOException("negative length keys not allowed: " + key);
> }
> out.writeInt(keyLength + valueLength); // total record length
> out.writeInt(keyLength); // key portion length
> if (!isCompressed()) {
>   out.writeInt(keyLength);
>   key.write(out); // key
> } else {
>   keyCompressionBuffer.reset();
>   keyDeflateFilter.resetState();
>   key.write(keyDeflateOut);
>   keyDeflateOut.flush();
>   keyDeflateFilter.finish();
>   int compressedKeyLen = keyCompressionBuffer.getLength();
>   out.writeInt(compressedKeyLen);
>   out.write(keyCompressionBuffer.getData(), 0, compressedKeyLen);
> }
> {code}
> 3. For sequence file compatibility, the field that immediately follows the
> record length should be the compressed key length, not the uncompressed key
> length.
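
To make points 2 and 3 concrete, here is a minimal standalone sketch of one possible ordering: compress the key first, then write lengths that reflect what is actually stored. It is not taken from the attached patches and does not use RCFile's own classes. The class name RecordHeaderSketch, the methods compressKey and writeRecordHeader, and the choice to keep the uncompressed key length as a third field are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical illustration only; not RCFile code.
class RecordHeaderSketch {

  /** Deflates the raw key bytes into an in-memory buffer and returns the result. */
  static byte[] compressKey(byte[] rawKey) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // Closing the deflater stream finishes compression and flushes into the buffer.
    try (DeflaterOutputStream deflateOut = new DeflaterOutputStream(buffer)) {
      deflateOut.write(rawKey);
    }
    return buffer.toByteArray();
  }

  /**
   * Writes a record header whose lengths describe the stored (compressed) key.
   * The field order is an assumption: record length, compressed key length,
   * uncompressed key length, then the compressed key bytes.
   */
  static void writeRecordHeader(DataOutputStream out, byte[] rawKey, int valueLength)
      throws IOException {
    // Compress before any length is written, so the lengths can be accurate.
    byte[] compressedKey = compressKey(rawKey);
    int compressedKeyLen = compressedKey.length;

    out.writeInt(compressedKeyLen + valueLength); // total record length as stored
    out.writeInt(compressedKeyLen);               // key portion length as stored
    out.writeInt(rawKey.length);                  // uncompressed key length
    out.write(compressedKey, 0, compressedKeyLen);
  }
}
```

With this ordering, a reader that only understands "record length, key length, key bytes" sees lengths that match the stored key bytes, which is the compatibility concern raised in point 3. Whether the extra uncompressed-length field should itself be counted in the record length is left open here.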