Paul Rogers created DRILL-5145:
----------------------------------

             Summary: Planner causes exception due to Drill CTAS Parquet file
                 Key: DRILL-5145
                 URL: https://issues.apache.org/jira/browse/DRILL-5145
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.9.0
            Reporter: Paul Rogers


The {{TestWindowFrame}} unit test prints an (ignored) exception to the 
console about a malformed {{created_by}} field in a Drill-created Parquet 
file. The query continues and the test succeeds, so the exception is 
evidently caught and ignored within the planner.

It is not entirely clear which specific test produces the error; it may be 
occurring when the Drillbit is shutting down after the test completes.

Here I am assuming that the file comes from a CTAS statement, since no 
Parquet files appear to be included in the test resources; this assessment 
is not certain, however.

{code}
WARNING: org.apache.parquet.CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-mr
org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-mr using format: (.+) version ((.*) )?\(build ?(.*)\)
        at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
        at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:66)
        at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:264)
        at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:568)
        at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:545)
        at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:455)
        at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:412)
        at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:391)
        at org.apache.drill.exec.store.parquet.Metadata.access$0(Metadata.java:389)
        at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:326)
        at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:1)
        at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:56)
        at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:122)
        at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:288)
        at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:267)
        at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:252)
        at org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:122)
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:733)
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:230)
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:190)
        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:169)
        at org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:1)
        at org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
        at org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
        at org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
        at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:89)
        at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:69)
        at org.apache.drill.exec.planner.logical.DrillScanRel.<init>(DrillScanRel.java:62)
        at org.apache.drill.exec.planner.logical.DrillScanRule.onMatch(DrillScanRule.java:37)
        at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
        at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
        at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303)
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:404)
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:343)
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:240)
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:290)
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:168)
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:122)
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:96)
        at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1019)
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:264)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Dec 20, 2016 11:14:49 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 50B for [c0] BINARY: 3 values, 21B raw, 23B comp, 1 pages, encodings: [RLE, BIT_PACKED, PLAIN]
Dec 20, 2016 11:14:49 PM INFO: org.apache.parquet.hadoop.ColumnChunkPageWriteStore: written 45B for [c1] BINARY: 3 values, 16B raw, 18B comp, 1 pages, encodings: [RLE, BIT_PACKED, PLAIN]
{code}
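To illustrate the failure mode: the {{created_by}} value is the bare string {{parquet-mr}}, with no version or build suffix, so it cannot match the pattern quoted in the exception message. A minimal sketch (the class name {{CreatedByCheck}} and the sample strings are hypothetical; the regex is copied verbatim from the log above):

{code:java}
import java.util.regex.Pattern;

public class CreatedByCheck {
    // Pattern quoted in the VersionParseException above.
    static final Pattern FORMAT =
        Pattern.compile("(.+) version ((.*) )?\\(build ?(.*)\\)");

    public static void main(String[] args) {
        // A fully-formed created_by string matches:
        System.out.println(FORMAT.matcher(
            "parquet-mr version 1.8.1 (build abc123)").matches()); // true
        // The bare "parquet-mr" in the log does not, which is what
        // triggers the (ignored) VersionParseException:
        System.out.println(FORMAT.matcher("parquet-mr").matches()); // false
    }
}
{code}

This suggests the writer that produced the file recorded only {{parquet-mr}} as its {{created_by}}, and parquet-mr then (harmlessly but noisily) refuses to trust the file's statistics.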



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
