[ https://issues.apache.org/jira/browse/HUDI-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Kudinkin updated HUDI-5392:
----------------------------------
    Story Points: 1  (was: 2)

> Fix Bootstrap files reader to configure arrays to be read in the new format
> ---------------------------------------------------------------------------
>
>                 Key: HUDI-5392
>                 URL: https://issues.apache.org/jira/browse/HUDI-5392
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: bootstrap
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>
> When writing a Bootstrap file, we use the Spark writer, which writes arrays in
> the new format, while Hudi reads them in the old (Avro-compatible) format:
> {code:java}
> // Old
> optional group tip_history (LIST) {
>   repeated group array {
>     optional double amount;
>     optional binary currency (UTF8);
>   }
> }
> // New
> optional group tip_history (LIST) {
>   repeated group list {
>     optional group element {
>       optional double amount;
>       optional binary currency (UTF8);
>     }
>   }
> }
> {code}
>
> To fix this, we need to make sure that Bootstrap files are *always* read in the
> new format (the Spark default), unlike Hudi's Parquet files.
> We also need to fix TestDataSourceForBootstrap, as it currently doesn't
> actually assert that the records are written correctly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
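
The difference between the two array layouts quoted in the description can be illustrated with a small, self-contained sketch. It uses no Parquet, Spark, or Hudi APIs: the class name ListLayoutSketch and the helpers legacyLayout, standardLayout, and groupDepth are hypothetical names introduced only for this illustration. It renders both schema shapes from the same element fields and counts how many groups enclose the first leaf field, which is the extra nesting level a reader has to expect.

```java
import java.util.List;

public class ListLayoutSketch {

    // Old (Avro-compatible) layout: the repeated group named "array" is the element itself.
    public static String legacyLayout(String field, List<String> elementFields) {
        StringBuilder sb = new StringBuilder();
        sb.append("optional group ").append(field).append(" (LIST) {\n");
        sb.append("  repeated group array {\n");
        for (String f : elementFields) {
            sb.append("    ").append(f).append(";\n");
        }
        sb.append("  }\n}");
        return sb.toString();
    }

    // New layout: a repeated group "list" wraps a separate "element" group.
    public static String standardLayout(String field, List<String> elementFields) {
        StringBuilder sb = new StringBuilder();
        sb.append("optional group ").append(field).append(" (LIST) {\n");
        sb.append("  repeated group list {\n");
        sb.append("    optional group element {\n");
        for (String f : elementFields) {
            sb.append("      ").append(f).append(";\n");
        }
        sb.append("    }\n  }\n}");
        return sb.toString();
    }

    // Counts the groups opened before the first leaf field (the first ';').
    public static int groupDepth(String schema) {
        int depth = 0;
        for (char c : schema.toCharArray()) {
            if (c == ';') {
                break;
            }
            if (c == '{') {
                depth++;
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        List<String> fields = List.of(
                "optional double amount",
                "optional binary currency (UTF8)");
        System.out.println(legacyLayout("tip_history", fields));
        System.out.println(standardLayout("tip_history", fields));
    }
}
```

A reader that assumes one layout while the file was written with the other will look for the element fields at the wrong nesting level, which is why the description asks for Bootstrap files to always be read in the new format.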