[jira] [Commented] (HUDI-2023) Validate Schema evolution in hudi
[ https://issues.apache.org/jira/browse/HUDI-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370692#comment-17370692 ]

Sagar Sumit commented on HUDI-2023:
-----------------------------------

Validated with delta streamer; the results are summarized below:

|| ||COW||MOR||
|Adding a new nullable column at root level at the end|succeeds|succeeds|
|Adding a new nullable column to inner struct (at the end)|succeeds|succeeds|
|Adding a new non-nullable column at root level at the end|fails|fails|
|Adding a new non-nullable column to inner struct (at the end)|fails|fails|

The failure after adding a new non-nullable column in the MOR case is:

{code:java}
Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key impression_598 from old file file:/tmp/hudi-deltastreamer-op/impressions_mor/user_86/08046c02-14e3-4629-899a-614518dfc545-0_53-6-148_20210628211956.parquet to new file file:/tmp/hudi-deltastreamer-op/impressions_mor/user_86/08046c02-14e3-4629-899a-614518dfc545-0_8-22-301_20210628212147.parquet
	...
	at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:320)
	at org.apache.hudi.table.action.commit.AbstractMergeHelper$UpdateHandler.consumeOneRecord(AbstractMergeHelper.java:122)
	at org.apache.hudi.table.action.commit.AbstractMergeHelper$UpdateHandler.consumeOneRecord(AbstractMergeHelper.java:112)
	at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:37)
	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:121)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more
Caused by: java.lang.RuntimeException: Null-value for required field: evolvedField
	at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:194)
	at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:165)
	at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
	at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
	at org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:89)
	at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:315)
{code}

> Validate Schema evolution in hudi
> ---------------------------------
>
>                 Key: HUDI-2023
>                 URL: https://issues.apache.org/jira/browse/HUDI-2023
>             Project: Apache Hudi
>          Issue Type: Test
>          Components: Testing
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>
> Test schema evolution in hudi and document the same

--
This message was sent by Atlassian Jira (v8.3.4#803005)
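The trace shows an old record being copied into a new base file whose schema marks evolvedField as required. The following is a minimal Python sketch of that situation, assuming that a record written before the schema change simply carries no value for the new field. The `rewrite_to_schema` helper, the dict-based schema layout, and the `record_key` and `partition` field names are hypothetical and only model the behavior; they are not Hudi or Parquet APIs.

```python
def is_nullable(field):
    """A field is nullable when its Avro-style type is a union containing "null"."""
    ftype = field["type"]
    return isinstance(ftype, list) and "null" in ftype


def rewrite_to_schema(record, schema):
    """Project an existing record onto a (possibly evolved) schema.

    Hypothetical helper: it models merging an old record into a file written
    with the new schema. A missing nullable field becomes None; a missing
    required field raises, mirroring the message seen in the trace.
    """
    out = {}
    for field in schema["fields"]:
        name = field["name"]
        value = record.get(name)
        if value is None and not is_nullable(field):
            raise RuntimeError(f"Null-value for required field: {name}")
        out[name] = value
    return out


# Hypothetical record written before the schema change.
old_record = {"record_key": "impression_598", "partition": "user_86"}

base_fields = [
    {"name": "record_key", "type": "string"},
    {"name": "partition", "type": "string"},
]

# Evolved schema, variant 1: the new column is nullable.
nullable_schema = {
    "fields": base_fields
    + [{"name": "evolvedField", "type": ["null", "string"], "default": None}]
}

# Evolved schema, variant 2: the new column is required.
required_schema = {
    "fields": base_fields + [{"name": "evolvedField", "type": "string"}]
}

merged = rewrite_to_schema(old_record, nullable_schema)

try:
    rewrite_to_schema(old_record, required_schema)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
```

In this model the nullable variant merges cleanly, while the required variant fails on the first old record that is rewritten, which is consistent with the "fails" entries in the table for non-nullable columns.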
[jira] [Commented] (HUDI-2023) Validate Schema evolution in hudi
[ https://issues.apache.org/jira/browse/HUDI-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363845#comment-17363845 ]

sivabalan narayanan commented on HUDI-2023:
-------------------------------------------

Dump of steps: https://gist.github.com/nsivabalan/33147072fabf5afa9cf2dfee1734e57a
[jira] [Commented] (HUDI-2023) Validate Schema evolution in hudi
[ https://issues.apache.org/jira/browse/HUDI-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363761#comment-17363761 ]

sivabalan narayanan commented on HUDI-2023:
-------------------------------------------

I tested both COW and MOR for a simple schema evolution: adding a new column. Here are my findings.

// "succeeds" means that the write succeeded and that a read following the write was able to read the entire dataset.

|| ||COW||MOR||
|Adding a new nullable column at root level at the end|succeeds|succeeds|
|Adding a new nullable column to inner struct (at the end)|succeeds|succeeds|
|Adding a new non-nullable column at root level at the end|fails|write succeeds, but read fails as expected|
|Adding a new non-nullable column to inner struct (at the end)|fails|write succeeds, but read fails as expected|

Validated so far with the Spark datasource. Will update once I have results with delta streamer.
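The four cases in the table differ only in where the column is added (root or inner struct) and whether it is nullable. A pre-flight check along those lines could be sketched as follows. This is an illustration under stated assumptions, not something Hudi is known to perform: the schemas are plain dicts in an Avro-like layout, and `added_fields`, `unsafe_additions`, and all field names in the example are hypothetical.

```python
def is_nullable(ftype):
    """True when the Avro-style type is a union that includes "null"."""
    return isinstance(ftype, list) and "null" in ftype


def record_type(ftype):
    """Return the nested record schema inside ftype, or None if there is none."""
    if isinstance(ftype, dict) and ftype.get("type") == "record":
        return ftype
    if isinstance(ftype, list):
        for branch in ftype:
            if isinstance(branch, dict) and branch.get("type") == "record":
                return branch
    return None


def added_fields(old, new, prefix=""):
    """Yield (path, nullable) for every field present in new but not in old."""
    old_by_name = {f["name"]: f for f in old["fields"]}
    for field in new["fields"]:
        path = prefix + field["name"]
        if field["name"] not in old_by_name:
            yield path, is_nullable(field["type"])
            continue
        # The field exists in both schemas: descend into inner structs.
        old_rec = record_type(old_by_name[field["name"]]["type"])
        new_rec = record_type(field["type"])
        if old_rec is not None and new_rec is not None:
            yield from added_fields(old_rec, new_rec, path + ".")


def unsafe_additions(old, new):
    """Paths of newly added columns that are not nullable."""
    return [path for path, nullable in added_fields(old, new) if not nullable]


# Hypothetical schemas covering all four rows of the table.
old_schema = {"type": "record", "fields": [
    {"name": "id", "type": "string"},
    {"name": "location", "type": {"type": "record", "fields": [
        {"name": "city", "type": "string"},
    ]}},
]}

new_schema = {"type": "record", "fields": [
    {"name": "id", "type": "string"},
    {"name": "location", "type": {"type": "record", "fields": [
        {"name": "city", "type": "string"},
        {"name": "zip", "type": ["null", "string"], "default": None},
        {"name": "country", "type": "string"},
    ]}},
    {"name": "note", "type": ["null", "string"], "default": None},
    {"name": "evolvedField", "type": "string"},
]}

flagged = unsafe_additions(old_schema, new_schema)
```

Here `flagged` lists the two non-nullable additions (`location.country` and `evolvedField`), the same cases that the table reports as failing on write for COW and on read for MOR.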