-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50816/
-----------------------------------------------------------

Review request for hive, Ashutosh Chauhan and Gopal V.


Repository: hive-git


Description
-------

HIVE-7239 Fix bug in HiveIndexedInputFormat implementation that causes incorrect query results when the input is backed by Sequence/RC files

For sequence files, it is crucial that splits are calculated around the boundaries enforced by the input sequence file. By default, however, Hadoop creates input splits based on configuration parameters, which may not match the boundaries of the input sequence file. Hive provides HiveIndexedInputFormat, which adds extra logic to recalculate the boundaries of each split according to the sequence file's boundaries. Even so, we noticed "over"-reporting in query results for data backed by sequence files.

We have sample data on which we experimented and fixed this bug. We verified the fix by comparing the query output for input in sequence file, RC file, and regular formats.

https://issues.apache.org/jira/browse/HIVE-7239


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java 33cc5c3
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java 5247ece
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexResult.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/index/SplitFilter.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/index/MockHiveInputSplits.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/index/MockIndexResult.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/index/MockInputFile.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/index/SplitFilterTestCase.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/index/TestHiveInputSplitComparator.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/index/TestSplitFilter.java PRE-CREATION

Diff: https://reviews.apache.org/r/50816/diff/


Testing
-------

Manually tested on a cluster.

HiveQA:
Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/674/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/674/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-674/


Thanks,

Illya Yalovyy