James Turton created DRILL-8388:
-----------------------------------
Summary: Zero-record Parquet writer fragments result in query
cancellation and 0 byte Parquet files
Key: DRILL-8388
URL: https://issues.apache.org/jira/browse/DRILL-8388
Project: Apache Drill
Issue Type: Bug
Components: Storage - Writer
Affects Versions: 1.20.3
Reporter: James Turton
Assignee: James Turton
Fix For: 1.21.0
I'll refine this ticket as I discover more, but at present I believe
this bug can be reproduced as follows.
# The Drill writer format is set to Parquet.
# A CTAS statement is issued over JDBC (the bug does not appear to manifest
for the same query received over REST).
# The CTAS statement spawns multiple Parquet writer fragments. It may also be
necessary that these fragments are distributed over more than one Drillbit
(unconfirmed on a single Drillbit).
# Some of the Parquet writer fragments receive batches containing zero records.
# The query completes but ends in the cancelled state.
# Invalid, zero-byte Parquet files are written to the writer destination by
the writer fragments that received zero records.
# A subsequent query against the destination fails because it encounters
zero-byte Parquet files.
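
To check a writer destination for the invalid files described above, a minimal sketch along the following lines may help. It is not part of Drill. It assumes the destination is a directory on a local filesystem and that the written files carry a `.parquet` extension; the function name `find_invalid_parquet_files` is hypothetical. It relies on the fact that a plain (non-encrypted) Parquet file starts and ends with the 4-byte magic `PAR1`, so any file shorter than 12 bytes, including a zero-byte one, cannot be valid.

```python
import os

# Magic bytes at the start and end of a plain (non-encrypted) Parquet file.
PARQUET_MAGIC = b"PAR1"

# Leading magic (4) + footer length field (4) + trailing magic (4).
MIN_PARQUET_SIZE = 12


def find_invalid_parquet_files(root):
    """Return sorted paths of .parquet files under root that are too
    short to be valid (including zero bytes) or lack the PAR1 magic.

    Assumes a local filesystem and a ".parquet" file extension.
    """
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".parquet"):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) < MIN_PARQUET_SIZE:
                bad.append(path)
                continue
            with open(path, "rb") as f:
                head = f.read(4)
                f.seek(-4, os.SEEK_END)
                tail = f.read(4)
            if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
                bad.append(path)
    return sorted(bad)
```

Running this against the CTAS output directory after the query ends should list the files left behind by the writer fragments that received zero records. Files written with an encrypted footer use a different magic and would also be flagged by this check.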
--
This message was sent by Atlassian Jira
(v8.20.10#820010)