[
https://issues.apache.org/jira/browse/HIVE-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016443#comment-13016443
]
[email protected] commented on HIVE-2065:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/529/
-----------------------------------------------------------
(Updated 2011-04-06 17:13:30.910168)
Review request for hive and Yongqiang He.
Changes
-------
Updated patch: it addresses two of the three issues but leaves sequence file
compliance unaddressed.
Summary
-------
Patch for HIVE-2065
This addresses bug HIVE-2065.
https://issues.apache.org/jira/browse/HIVE-2065
Diffs (updated)
-----
build-common.xml 9f21a69
data/files/test_v6dot0_compressed.rc PRE-CREATION
data/files/test_v6dot0_uncompressed.rc PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java eb5305b
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java 20d1f4e
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileKeyBufferWrapper.java f7eacdc
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java bb1e3c9
ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java 8bb6f3a
ql/src/test/results/clientpositive/alter_merge.q.out 25f36c0
ql/src/test/results/clientpositive/alter_merge_stats.q.out 243f7cc
ql/src/test/results/clientpositive/partition_wise_fileformat.q.out cee2e72
ql/src/test/results/clientpositive/partition_wise_fileformat3.q.out 067ab43
ql/src/test/results/clientpositive/sample10.q.out 50406c3
Diff: https://reviews.apache.org/r/529/diff
Testing
-------
Tests added, existing tests updated
Thanks,
Krishna
> RCFile issues
> -------------
>
> Key: HIVE-2065
> URL: https://issues.apache.org/jira/browse/HIVE-2065
> Project: Hive
> Issue Type: Bug
> Reporter: Krishna Kumar
> Assignee: Krishna Kumar
> Priority: Minor
> Attachments: HIVE.2065.patch.0.txt, HIVE.2065.patch.1.txt,
> Slide1.png, proposal.png
>
>
> Some potential issues with RCFile
> 1. Remove unwanted synchronized modifiers on the methods of RCFile. Per
> Yongqiang He, the class is not meant to be thread-safe (and it is not). Might
> as well get rid of the confusing and performance-impacting lock acquisitions.
> 2. Record length is overstated for compressed files. IIUC, the key is
> compressed only after the record length has been written.
> {code}
>   int keyLength = key.getSize();
>   if (keyLength < 0) {
>     throw new IOException("negative length keys not allowed: " + key);
>   }
>   out.writeInt(keyLength + valueLength); // total record length
>   out.writeInt(keyLength); // key portion length
>   if (!isCompressed()) {
>     out.writeInt(keyLength);
>     key.write(out); // key
>   } else {
>     keyCompressionBuffer.reset();
>     keyDeflateFilter.resetState();
>     key.write(keyDeflateOut);
>     keyDeflateOut.flush();
>     keyDeflateFilter.finish();
>     int compressedKeyLen = keyCompressionBuffer.getLength();
>     out.writeInt(compressedKeyLen);
>     out.write(keyCompressionBuffer.getData(), 0, compressedKeyLen);
>   }
> {code}
> 3. For sequence file compatibility, the field immediately following the
> record length should be the compressed key length, not the uncompressed key
> length.
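
For illustration only, here is a minimal sketch of the ordering that items 2
and 3 seem to ask for: compress the key first, then derive the record length
from the compressed size and write the compressed key length right after it.
The class RecordHeaderSketch and its methods compress and writeHeader are
hypothetical, java.util.zip stands in for the Hadoop compression codec, and
the field order is one reading of the issue, not the layout used by the
patch under review.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper, not part of RCFile.
final class RecordHeaderSketch {

    // Deflates the key bytes completely, so the compressed size is known
    // before any length field is written.
    static byte[] compress(byte[] key) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflate = new DeflaterOutputStream(buf)) {
            deflate.write(key);
        }
        return buf.toByteArray();
    }

    // Assumed field order: record length, compressed key length,
    // uncompressed key length, compressed key bytes.
    static byte[] writeHeader(byte[] key, int valueLength) throws IOException {
        byte[] compressedKey = compress(key);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(compressedKey.length + valueLength); // record length from the compressed size
        out.writeInt(compressedKey.length);               // compressed key length comes next
        out.writeInt(key.length);                         // uncompressed key length
        out.write(compressedKey);                         // compressed key bytes
        out.flush();
        return bytes.toByteArray();
    }
}
```

Because the key is deflated before the first writeInt, the record length no
longer counts the uncompressed key size.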