[ https://issues.apache.org/jira/browse/HIVE-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016443#comment-13016443 ]
jirapos...@reviews.apache.org commented on HIVE-2065:
-----------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/529/
-----------------------------------------------------------

(Updated 2011-04-06 17:13:30.910168)

Review request for hive and Yongqiang He.

Changes
-------

Updated patch where sequence file compliance is not addressed but the other two issues are.

Summary
-------

Patch for HIVE-2065

This addresses bug HIVE-2065.
    https://issues.apache.org/jira/browse/HIVE-2065

Diffs (updated)
-----

  build-common.xml 9f21a69
  data/files/test_v6dot0_compressed.rc PRE-CREATION
  data/files/test_v6dot0_uncompressed.rc PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java eb5305b
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java 20d1f4e
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileKeyBufferWrapper.java f7eacdc
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java bb1e3c9
  ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java 8bb6f3a
  ql/src/test/results/clientpositive/alter_merge.q.out 25f36c0
  ql/src/test/results/clientpositive/alter_merge_stats.q.out 243f7cc
  ql/src/test/results/clientpositive/partition_wise_fileformat.q.out cee2e72
  ql/src/test/results/clientpositive/partition_wise_fileformat3.q.out 067ab43
  ql/src/test/results/clientpositive/sample10.q.out 50406c3

Diff: https://reviews.apache.org/r/529/diff

Testing
-------

Tests added, existing tests updated

Thanks,

Krishna

> RCFile issues
> -------------
>
>                 Key: HIVE-2065
>                 URL: https://issues.apache.org/jira/browse/HIVE-2065
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>            Priority: Minor
>         Attachments: HIVE.2065.patch.0.txt, HIVE.2065.patch.1.txt,
> Slide1.png, proposal.png
>
>
> Some potential issues with RCFile
> 1. Remove unwanted synchronized modifiers on the methods of RCFile. As per
> Yongqiang He, the class is not meant to be thread-safe (and it is not). Might
> as well get rid of the confusing and performance-impacting lock acquisitions.
> 2. Record Length overstated for compressed files. IIUC, the key compression
> happens after we have written the record length.
> {code}
>       int keyLength = key.getSize();
>       if (keyLength < 0) {
>         throw new IOException("negative length keys not allowed: " + key);
>       }
>       out.writeInt(keyLength + valueLength); // total record length
>       out.writeInt(keyLength); // key portion length
>       if (!isCompressed()) {
>         out.writeInt(keyLength);
>         key.write(out); // key
>       } else {
>         keyCompressionBuffer.reset();
>         keyDeflateFilter.resetState();
>         key.write(keyDeflateOut);
>         keyDeflateOut.flush();
>         keyDeflateFilter.finish();
>         int compressedKeyLen = keyCompressionBuffer.getLength();
>         out.writeInt(compressedKeyLen);
>         out.write(keyCompressionBuffer.getData(), 0, compressedKeyLen);
>       }
> {code}
> 3. For sequence file compatibility, the compressed key length should be the
> next field to record length, not the uncompressed key length.
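
For readers following issue 2, here is a minimal standalone sketch of the ordering it implies: compress the key first, then write the record length from the size actually stored. This is not the attached patch. The class and method names (RecordLengthSketch, compress, writeRecord) are hypothetical, java.util.zip stands in for the Hadoop codec used in RCFile, the value is treated as opaque bytes already in on-disk form, and the sketch assumes the record-length field is meant to count the stored key bytes plus the value bytes. The field order follows the quoted snippet, so issue 3 is not addressed here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.DeflaterOutputStream;

public class RecordLengthSketch {

    /** Deflates the key bytes; a stand-in for the codec in the quoted code. */
    public static byte[] compress(byte[] key) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflate = new DeflaterOutputStream(buf)) {
            deflate.write(key);
        }
        return buf.toByteArray();
    }

    /**
     * Writes one record. The key is compressed BEFORE any length field is
     * written, so the total record length reflects the bytes that follow.
     */
    public static byte[] writeRecord(byte[] key, byte[] value, boolean compressed)
            throws IOException {
        byte[] keyOnDisk = compressed ? compress(key) : key;

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(keyOnDisk.length + value.length); // total record length, as stored
        out.writeInt(key.length);                      // key portion length (uncompressed)
        out.writeInt(keyOnDisk.length);                // key length as stored
        out.write(keyOnDisk);                          // key
        out.write(value);                              // value (assumed already encoded)
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] key = new byte[4096]; // all zeros, so it compresses well
        byte[] value = new byte[16];
        byte[] record = writeRecord(key, value, true);

        // DataOutputStream and ByteBuffer both default to big-endian.
        int recordLength = ByteBuffer.wrap(record).getInt();
        System.out.println("record length field      = " + recordLength);
        System.out.println("bytes after three ints   = " + (record.length - 12));
        System.out.println("uncompressed key + value = " + (key.length + value.length));
    }
}
```

With the ordering in the quoted snippet, the first field would instead be the third number printed, which is larger than the bytes that actually follow whenever the key compresses.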