[jira] [Commented] (DRILL-5982) CTAS creates parquet files with inconsistent nullable column

James Turton (Jira) Thu, 28 Jul 2022 05:24:06 -0700


    [ https://issues.apache.org/jira/browse/DRILL-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572416#comment-17572416 ]


James Turton commented on DRILL-5982:
-------------------------------------

Running this test with the UNION type enabled also fails, but with a different 
error. CC [~vitalii].
{code:java}
Caused by: java.lang.ClassCastException: class org.apache.drill.exec.vector.IntVector cannot be cast to class org.apache.drill.exec.vector.NullableIntVector (org.apache.drill.exec.vector.IntVector and org.apache.drill.exec.vector.NullableIntVector are in unnamed module of loader 'app')
        at org.apache.drill.exec.vector.complex.UnionVector.getIntVector(UnionVector.java:313)
        at org.apache.drill.exec.vector.complex.UnionVector.getMember(UnionVector.java:472)
        at org.apache.drill.exec.vector.complex.UnionVector$Mutator.setValueCount(UnionVector.java:885)
        at org.apache.drill.exec.vector.complex.UnionVector.setFirstType(UnionVector.java:656)
        at org.apache.drill.exec.record.SchemaUtil.coerceVector(SchemaUtil.java:122)
        at org.apache.drill.exec.record.SchemaUtil.coerceContainer(SchemaUtil.java:178)
        at org.apache.drill.exec.physical.impl.xsort.BufferedBatches.convertBatch(BufferedBatches.java:121)
        at org.apache.drill.exec.physical.impl.xsort.BufferedBatches.add(BufferedBatches.java:88)
        at org.apache.drill.exec.physical.impl.xsort.SortImpl.addBatch(SortImpl.java:309)
        at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.loadBatch(ExternalSortBatch.java:455)
        at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.load(ExternalSortBatch.java:400)
        at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:355)
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:160)
{code}
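
For anyone wanting to reproduce this from a SQL session rather than the test, 
a minimal sketch follows. It assumes that table1 and table2 already exist as 
created by the CTAS statements in the issue description, and that the window 
query from the description exercises the same sort path as the test; neither 
assumption has been verified here. The UNION type is experimental and is 
toggled with the {{exec.enable_union_type}} option.
{code:sql}
-- Assumption: dfs.tmp.table1 and dfs.tmp.table2 were created by the
-- CTAS statements in the issue description.

-- Enable the experimental UNION type for the current session only.
ALTER SESSION SET `exec.enable_union_type` = true;

-- Window query from the issue description; PARTITION BY requires a sort
-- over batches read from both Parquet files.
SELECT id, FIRST_VALUE(state) OVER (PARTITION BY id) AS state
FROM dfs.tmp.`table*`;

-- Switch the option back off (false is the default).
ALTER SESSION SET `exec.enable_union_type` = false;
{code}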
 

> CTAS creates parquet files with inconsistent nullable column
> ------------------------------------------------------------
>
>                 Key: DRILL-5982
>                 URL: https://issues.apache.org/jira/browse/DRILL-5982
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>         Environment: windows 10
>            Reporter: Raymond Wong
>            Priority: Major
>
> Create two Parquet tables with CTAS. One CTAS statement uses MySQL as its 
> data source; the other uses {{(Values(1))}}. Both files have the same 
> schema: the same column names and data types.
> The Parquet file created from the MySQL data source has nullable columns, 
> while the file created from {{(Values(1))}} has non-nullable columns.
> {quote}
> DROP TABLE dfs.tmp.table1;
> CREATE TABLE dfs.tmp.table1 AS
> SELECT 'CA' AS state, CAST(1 AS BIGINT) AS id
> FROM `mysql_dw_reporting.datawarehouse1`.DW_Qualbe_Cust_And_CustPay
> LIMIT 1;
> DROP TABLE dfs.tmp.table2;
> CREATE TABLE dfs.tmp.table2 AS
> SELECT 'NY' AS state, CAST(2 AS BIGINT) AS id
> FROM (Values(1))
> ;
> {quote}
> This inconsistency prevents SQL window functions from being applied across 
> the two Parquet tables. Querying table1 and table2 with a SQL window 
> function fails with the following error:
> {quote}
> SELECT id, FIRST_VALUE(state) OVER( PARTITION BY id ) AS state 
> FROM  dfs.tmp.`table*`
> SQL Error: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
