Deneche A. Hakim created DRILL-4376:
---------------------------------------

Summary: Wrong results when doing a count(*) on part of directories with metadata cache
Key: DRILL-4376
URL: https://issues.apache.org/jira/browse/DRILL-4376
Project: Apache Drill
Issue Type: Bug
Components: Metadata
Affects Versions: 1.4.0
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
Priority: Critical
Fix For: 1.6.0

First create some parquet tables in multiple subfolders:
{noformat}
create table dfs.tmp.`test/201501` as select employee_id, full_name from cp.`employee.json` limit 2;
create table dfs.tmp.`test/201502` as select employee_id, full_name from cp.`employee.json` limit 2;
create table dfs.tmp.`test/201601` as select employee_id, full_name from cp.`employee.json` limit 2;
create table dfs.tmp.`test/201602` as select employee_id, full_name from cp.`employee.json` limit 2;
{noformat}

Running the following query gives the expected count:
{noformat}
select count(*) from dfs.tmp.`test/20160*`;
+---------+
| EXPR$0  |
+---------+
| 4       |
+---------+
{noformat}

But once you create the metadata cache files, the query no longer returns the correct results:
{noformat}
refresh table metadata dfs.tmp.`test`;
select count(*) from dfs.tmp.`test/20160*`;
+---------+
| EXPR$0  |
+---------+
| 2       |
+---------+
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)