[
https://issues.apache.org/jira/browse/DRILL-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
James Turton updated DRILL-8388:
--------------------------------
Summary: Zero-record Parquet writer fragments result in query cancellation
and zero-byte Parquet files (was: Zero-record Parquet writer fragments result
in query cancellation and 0 byte Parquet files)
> Zero-record Parquet writer fragments result in query cancellation and
> zero-byte Parquet files
> ---------------------------------------------------------------------------------------------
>
> Key: DRILL-8388
> URL: https://issues.apache.org/jira/browse/DRILL-8388
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Writer
> Affects Versions: 1.20.3
> Reporter: James Turton
> Assignee: James Turton
> Priority: Major
> Fix For: 1.21.0
>
>
> I'll refine this ticket as I discover more, but I currently believe this
> bug can be reproduced as follows.
> # The Drill writer format is set to Parquet.
> # A CTAS statement is issued over JDBC (the bug does not appear to manifest
> for the same query received over REST).
> # The CTAS statement spawns multiple Parquet writer fragments. It may also
> be necessary that these fragments are distributed over more than one Drillbit
> (unconfirmed on a single Drillbit).
> # Some of the Parquet writer fragments receive batches containing zero
> records.
> # The query completes but ends in the cancelled state.
> # Invalid, zero-byte Parquet files are written to the writer destination by
> the writer fragments that received zero records.
> # A subsequent query against the destination fails because it encounters
> zero-byte Parquet files.
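
One way to check for the zero-byte files described in the steps above is to scan the writer destination for them. The following is only a minimal sketch under stated assumptions: the destination is reachable as a local filesystem path, the output files carry a `.parquet` extension, and the helper name `find_zero_byte_parquet` is hypothetical (it is not part of Drill).

```python
from pathlib import Path


def find_zero_byte_parquet(dest_dir):
    """Return the sorted paths of zero-byte *.parquet files under dest_dir.

    Hypothetical helper, not part of Drill. Assumes a local filesystem
    path and a ".parquet" file extension.
    """
    root = Path(dest_dir)
    return sorted(
        p for p in root.rglob("*.parquet")
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    import sys

    # Usage: python find_empty.py /path/to/ctas/destination
    for path in find_zero_byte_parquet(sys.argv[1]):
        print(path)
```

A valid Parquet file begins and ends with the 4-byte magic `PAR1`, so a zero-byte file can never be read as Parquet, which is consistent with the failure of the subsequent query in the last step.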