[ https://issues.apache.org/jira/browse/DRILL-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Parth Chandra updated DRILL-1906:
---------------------------------
    Assignee: Steven Phillips  (was: Parth Chandra)

> Parquet reader error when reading a subdirectory
> ------------------------------------------------
>
>                 Key: DRILL-1906
>                 URL: https://issues.apache.org/jira/browse/DRILL-1906
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>            Reporter: Aman Sinha
>            Assignee: Steven Phillips
>             Fix For: 0.8.0
>
>
> I am not sure if this is a regression but on current master branch, Drill is
> unable to read subdirectories if there are parquet files in the parent
> directory and subdirectory. It's trying to read the footer for the
> subdirectory itself instead of recursing below. JSON works fine.
> For example, here's my directory structure:
> {code}
> ls -lR /tmp/foo1
> -rw-r--r--  1 asinha  wheel  132 Dec 20 11:10 0_0_0.parquet
> drwxr-xr-x  3 asinha  wheel  102 Dec 20 09:54 foo2
> /tmp/foo1/foo2:
> -rw-r--r--  1 asinha  wheel  132 Dec 16 16:14 0_0_0.parquet
> {code}
> Here's the failure and stack trace:
> {code}
> 0: jdbc:drill:zk=local> select * from foo1;
> Query failed: Query failed: Unexpected exception during fragment
> initialization: Internal error: Error while applying rule DrillTableRule,
> args [rel#660:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[dfs, tmp, foo1])]
> <skip>
> Caused by: java.io.IOException: Could not read footer: java.io.IOException:
> Could not read footer for file
> DeprecatedRawLocalFileStatus{path=file:/tmp/foo1/foo2; isDirectory=true;
> modification_time=1419098040000; access_time=0; owner=; group=; permission=rwxrwxrwx;
> isSymlink=false}
>     at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:195) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.8.0-SNAPSHOT]
>     at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:208) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.8.0-SNAPSHOT]
>     at parquet.hadoop.ParquetFileReader.readFooters(ParquetFileReader.java:224) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.8.0-SNAPSHOT]
>     at org.apache.drill.exec.store.parquet.ParquetGroupScan.readFooter(ParquetGroupScan.java:208) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
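
For illustration only, here is a minimal Java sketch of the behavior the report expects: descend into subdirectories and hand only regular files to a footer reader, so that a directory such as foo2 is never treated as a Parquet file. It uses java.nio.file rather than the Hadoop FileSystem API that Drill works with, and the names ParquetFileLister and listParquetFiles are hypothetical. This is not the fix that was applied in Drill.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Drill or parquet-hadoop.
public class ParquetFileLister {

    // Walks the tree under root and returns only regular files whose name
    // ends in ".parquet". Directories are traversed but never returned.
    static List<Path> listParquetFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".parquet"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Recreate the layout from the report in a temporary directory:
        // foo1/0_0_0.parquet and foo1/foo2/0_0_0.parquet (empty placeholder files).
        Path root = Files.createTempDirectory("foo1");
        Path sub = Files.createDirectory(root.resolve("foo2"));
        Files.createFile(root.resolve("0_0_0.parquet"));
        Files.createFile(sub.resolve("0_0_0.parquet"));

        // Print each candidate file relative to the root directory.
        for (Path p : listParquetFiles(root)) {
            System.out.println(root.relativize(p));
        }
    }
}
```

With the directory layout from the report, the list contains the two 0_0_0.parquet files and not the foo2 directory entry itself.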