James Turton created DRILL-8388:
-----------------------------------

             Summary: Zero-record Parquet writer fragments result in query cancellation and 0 byte Parquet files
                 Key: DRILL-8388
                 URL: https://issues.apache.org/jira/browse/DRILL-8388
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Writer
    Affects Versions: 1.20.3
            Reporter: James Turton
            Assignee: James Turton
             Fix For: 1.21.0

I'll refine this ticket as I discover more, but at present I believe this bug can be reproduced as follows.

# The Drill writer format is set to Parquet.
# A CTAS statement is issued over JDBC (the bug does not appear to manifest for the same query received over REST).
# The CTAS statement spawns multiple Parquet writer fragments. It may also be necessary that these fragments are distributed over more than one Drillbit (unconfirmed on a single Drillbit).
# Some of the Parquet writer fragments receive batches containing zero records.
# The query completes but ends in the cancelled state.
# Invalid, zero-byte Parquet files are written to the writer destination by the writer fragments that received zero records.
# A subsequent query against the destination fails because it encounters zero-byte Parquet files.
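
To check whether a writer destination has been affected, the zero-byte files described above can be located with a small scan. The following is a minimal sketch, not part of Drill: the class name `ZeroByteParquetFinder` and its method are hypothetical, and it assumes the destination is a local or mounted directory reachable through `java.nio.file` (a destination on HDFS or an object store would need that store's own client). A valid Parquet file begins and ends with the 4-byte magic `PAR1`, so a file of length zero can never be valid.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for diagnosing this ticket; not a Drill API.
public class ZeroByteParquetFinder {

    /** Returns, in sorted order, every regular *.parquet file under dir whose size is 0. */
    public static List<Path> findZeroByteParquet(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".parquet"))
                .filter(p -> {
                    try {
                        return Files.size(p) == 0L;
                    } catch (IOException e) {
                        // Lambdas cannot throw checked exceptions; rethrow unchecked.
                        throw new UncheckedIOException(e);
                    }
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument is the CTAS destination directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findZeroByteParquet(dir)) {
            System.out.println(p);
        }
    }
}
```

Run against the CTAS destination directory, this prints one path per zero-byte Parquet file. An empty result suggests the writer fragments that received zero records did not leave invalid files behind.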