David Perkins created HUDI-7930:
-----------------------------------

             Summary: Merge On Read table Unsupported type in the list: 
optional binary xxx (STRING)
                 Key: HUDI-7930
                 URL: https://issues.apache.org/jira/browse/HUDI-7930
             Project: Apache Hudi
          Issue Type: Bug
          Components: compaction
    Affects Versions: 0.14.1
            Reporter: David Perkins


I have run into an issue with Merge On Read tables in Flink that contain an 
array-of-rows column. I am able to write data, but after compaction, reads 
fail with this exception:

{{java.lang.RuntimeException: Unsupported type in the list: optional binary item1 (STRING)}}

The error only occurs after a compaction has run and produced parquet files. 
I'm using Hudi 0.14.1 and Flink 1.17.2, writing to Azure ADLS. I haven't tried 
switching to Copy on Write tables yet, but will try that next.

Steps to reproduce the error.
1. Create a table with an array of rows

{{CREATE temporary TABLE TestTable (}}
{{-- Additional keys and foreign keys}}
{{rowId STRING NOT NULL,}}
{{myArray ARRAY< ROW< item1 STRING, item2 STRING > >}}
{{) WITH (}}
{{'connector' = 'hudi',}}
{{'path' = 'abfs://<container>@<storage_account>.dfs.core.windows.net/hudi/testtable',}}
{{'table.type' = 'MERGE_ON_READ',}}
{{'write.batch.size' = '1',}}
{{'hoodie.compact.inline' = 'true',}}
{{'hoodie.compact.inline.max.delta.commits' = '1',}}
{{'compaction.async.enabled' = 'false',}}
{{'compaction.delta_commits' = '1',}}
{{'hoodie.datasource.write.recordkey.field' = 'rowId'}}
{{);}}

2. Insert some data
{{insert into TestTable values}}
{{('1', ARRAY[ROW('1.item1', '1.item2')]),}}
{{('2', ARRAY[ROW('2.item1', '2.item2')]),}}
{{('3', ARRAY[ROW('3.item1', '3.item2')]),}}
{{('4', ARRAY[ROW('4.item1', '4.item2')]),}}
{{('5', ARRAY[ROW('5.item1', '5.item2')]),}}
{{('6', ARRAY[ROW('6.item1', '6.item2')]),}}
{{('7', ARRAY[ROW('7.item1', '7.item2')]),}}
{{('8', ARRAY[ROW('8.item1', '8.item2')]),}}
{{('9', ARRAY[ROW('9.item1', '9.item2')]),}}
{{('10', ARRAY[ROW('10.item1', '10.item2')])}}
{{;}}

3. Query
{{Select * from TestTable;}}
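
For comparison, a Copy on Write variant of the same table might be declared as in the sketch below. This is untested: the table name TestTableCow and the testtable_cow path suffix are hypothetical, and the compaction options are left out on the assumption that they only matter for Merge On Read tables. Whether this variant avoids the exception has not been verified.

```sql
-- Hypothetical Copy on Write counterpart of TestTable (not verified).
CREATE TEMPORARY TABLE TestTableCow (
  rowId STRING NOT NULL,
  myArray ARRAY< ROW< item1 STRING, item2 STRING > >
) WITH (
  'connector' = 'hudi',
  -- Separate path so the two table types do not share storage.
  'path' = 'abfs://<container>@<storage_account>.dfs.core.windows.net/hudi/testtable_cow',
  'table.type' = 'COPY_ON_WRITE',
  'hoodie.datasource.write.recordkey.field' = 'rowId'
);
```

Running the same insert and select from steps 2 and 3 against this table would show whether the failure is specific to reading compacted Merge On Read files.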



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
