Paul Rogers created DRILL-5145:
----------------------------------

             Summary: Planner causes exception due to Drill CTAS Parquet file
                 Key: DRILL-5145
                 URL: https://issues.apache.org/jira/browse/DRILL-5145
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.9.0
            Reporter: Paul Rogers

The {{TestWindowFrame}} unit test prints an (ignored) exception to the console about a badly formatted {{created_by}} value in a Drill-created Parquet file. The query continues and the test succeeds, so the exception must be ignored within the planner.

It is not clear which specific test produces the error; it may occur when the Drillbit is shutting down after the test completes. I am assuming that the file comes from CTAS, since no Parquet files appear to be included in the test resources, but this is not certain.

{code}
WARNING: org.apache.parquet.CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-mr
org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-mr using format: (.+) version ((.*) )?\(build ?(.*)\)
	at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
	at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:66)
	at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:264)
	at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:568)
	at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:545)
	at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:455)
	at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:412)
	at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:391)
	at org.apache.drill.exec.store.parquet.Metadata.access$0(Metadata.java:389)
	at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:326)
	at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:1)
	at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:56)
	at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:122)
	at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:288)
	at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:267)
	at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:252)
	at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:122)
	at org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:733)
	at org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:230)
	at org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:190)
	at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:169)
	at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:1)
	at org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
	at org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
	at org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
	at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:89)
	at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:69)
	at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:62)
	at org.apache.drill.exec.planner.logical.DrillScanRule.onMatch(DrillScanRule.java:37)
	at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
	at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
	at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:404)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:343)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:240)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:290)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:168)
	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:122)
	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:96)
	at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1019)
	at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Dec 20, 2016 11:14:49 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 50B for [c0] BINARY: 3 values, 21B raw, 23B comp, 1 pages, encodings: [RLE, BIT_PACKED, PLAIN]
Dec 20, 2016 11:14:49 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 45B for [c1] BINARY: 3 values, 16B raw, 18B comp, 1 pages, encodings: [RLE, BIT_PACKED, PLAIN]
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
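
For reference, the parse failure reported in the warning above can be reproduced in isolation. The sketch below is not Parquet or Drill code: it only applies the format string quoted in the exception message to the {{created_by}} value from the log, on the assumption that the parser requires the whole string to match. The class name {{CreatedByCheck}}, the helper {{parses}}, and the well-formed sample string are made up for illustration.

```java
import java.util.regex.Pattern;

public class CreatedByCheck {
    // Format string copied from the exception message in the log above.
    static final Pattern FORMAT =
        Pattern.compile("(.+) version ((.*) )?\\(build ?(.*)\\)");

    // Hypothetical helper: true if the whole created_by string fits the format.
    // Assumes a full-string match is required.
    static boolean parses(String createdBy) {
        return FORMAT.matcher(createdBy).matches();
    }

    public static void main(String[] args) {
        // Value reported in the warning: no " version " or "(build ...)" part.
        System.out.println(parses("parquet-mr")); // prints false
        // Illustrative well-formed value (not taken from the log).
        System.out.println(parses("parquet-mr version 1.8.1 (build abc123)")); // prints true
    }
}
```

If this reading is right, the bare string {{parquet-mr}} can never satisfy the format, which would be consistent with the file having been written with an incomplete {{created_by}} value.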