[ https://issues.apache.org/jira/browse/IMPALA-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zoltán Borók-Nagy resolved IMPALA-9952. --------------------------------------- Fix Version/s: Impala 4.0 Resolution: Fixed > Invalid offset index in Parquet file > ------------------------------------- > > Key: IMPALA-9952 > URL: https://issues.apache.org/jira/browse/IMPALA-9952 > Project: IMPALA > Issue Type: Bug > Components: Backend > Affects Versions: Impala 3.4.0 > Reporter: guojingfeng > Assignee: Zoltán Borók-Nagy > Priority: Major > Labels: Parquet > Fix For: Impala 4.0 > > > When reading parquet file in impala 3.4, encountered the following error: > {code:java} > I0714 16:11:48.307806 1075820 runtime-state.cc:207] > 8c43203adb2d4fc8:0478df9b0000018b] Error from query > 8c43203adb2d4fc8:0478df9b00000000: Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq. > I0714 16:11:48.834901 1075838 status.cc:126] > 8c43203adb2d4fc8:0478df9b000002c0] Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq. > @ 0xbf4ef9 > @ 0x1748c41 > @ 0x174e170 > @ 0x1750e58 > @ 0x17519f0 > @ 0x1748559 > @ 0x1510b41 > @ 0x1512c8f > @ 0x137488a > @ 0x1375759 > @ 0x1b48a19 > @ 0x7f34509f5e24 > @ 0x7f344d5ed35c > I0714 16:11:48.835763 1075838 runtime-state.cc:207] > 8c43203adb2d4fc8:0478df9b000002c0] Error from query > 8c43203adb2d4fc8:0478df9b00000000: Invalid offset index in Parquet file > hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq. > I0714 16:11:48.893784 1075820 status.cc:126] > 8c43203adb2d4fc8:0478df9b0000018b] Top level rows aren't in sync during page > filtering in file > hdfs://path/4844de7af4545a39-e8ebc7da0000005f_2015704758_data.0.parq. > @ 0xbf4ef9 > @ 0x1749104 > @ 0x17494cc > @ 0x1751aee > @ 0x1748559 > @ 0x1510b41 > @ 0x1512c8f > @ 0x137488a > @ 0x1375759 > @ 0x1b48a19 > @ 0x7f34509f5e24 > @ 0x7f344d5ed35c > {code} > Corresponding source code: > {code:java} > Status HdfsParquetScanner::CheckPageFiltering() { > if (candidate_ranges_.empty() || scalar_readers_.empty()) return > Status::OK(); int64_t current_row = scalar_readers_[0]->LastProcessedRow(); > for (int i = 1; i < scalar_readers_.size(); ++i) { > if (current_row != scalar_readers_[i]->LastProcessedRow()) { > DCHECK(false); > return Status(Substitute( > "Top level rows aren't in sync during page filtering in file $0.", > filename())); > } > } > return Status::OK(); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org