Alessandro Solimando created HIVE-26150:
-------------------------------------------

             Summary: OrcRawRecordMerger reads each row twice
                 Key: HIVE-26150
                 URL: https://issues.apache.org/jira/browse/HIVE-26150
             Project: Hive
          Issue Type: Bug
          Components: ORC, Transactions
    Affects Versions: 4.0.0-alpha-2
            Reporter: Alessandro Solimando


OrcRawRecordMerger reads each row twice. The issue does not surface because the merger is only used with the parameter "collapseEvents" set to true, which filters out one of the two rows.

collapseEvents true and false should produce the same result: in the current ACID implementation each event has a distinct rowid, so two identical rows cannot exist. Duplicates appear only because of this bug.

To reproduce the issue, it is sufficient to set the second parameter to false [here|https://github.com/apache/hive/blob/61d4ff2be48b20df9fd24692c372ee9c2606babe/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L2103-L2106] and run the tests in TestOrcRawRecordMerger; two tests will fail.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)