[ https://issues.apache.org/jira/browse/HUDI-6627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinish Reddy updated HUDI-6627:
-------------------------------
    Description:
When the source returns an empty Option in DeltaStreamer, the writer schema is null. This causes an NPE during table schema validation in the Spark write client, which fails with the exception below. We should skip this validation when the writer schema is null.
{code:java}
org.apache.hudi.exception.HoodieInsertException: Failed insert schema compability check.
    at org.apache.hudi.table.HoodieTable.validateInsertSchema(HoodieTable.java:851)
    at org.apache.hudi.client.SparkRDDWriteClient.insert(SparkRDDWriteClient.java:185)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:690)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:396)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.ingestOnce(HoodieDeltaStreamer.java:876)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
    at com.onehouse.hudi.OnehouseDeltaStreamer$MultiTableSyncService.lambda$null$1(OnehouseDeltaStreamer.java:319)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hudi.exception.HoodieException: Failed to read schema/check compatibility for base path s3a://onehouse-customer-bucket-2451e78f/data-lake/chandra_data_lake_default/xml_flatten_struct_test
    at org.apache.hudi.table.HoodieTable.validateSchema(HoodieTable.java:830)
    at org.apache.hudi.table.HoodieTable.validateInsertSchema(HoodieTable.java:849)
    ... 10 more
Caused by: java.lang.NullPointerException
    at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1158)
    at org.apache.avro.Schema$Parser.parse(Schema.java:1418)
    at org.apache.hudi.avro.HoodieAvroUtils.createHoodieWriteSchema(HoodieAvroUtils.java:302)
    at org.apache.hudi.table.HoodieTable.validateSchema(HoodieTable.java:826)
    ... 11 more
{code}

  was:
When the source returns an empty Option in DeltaStreamer, the writer schema is null. This causes an NPE during table schema validation in the Spark write client, which fails with the exception below. We should skip this validation when the writer schema is null.
{quote}org.apache.hudi.exception.HoodieInsertException: Failed insert schema compability check.
    at org.apache.hudi.table.HoodieTable.validateInsertSchema(HoodieTable.java:851)
    at org.apache.hudi.client.SparkRDDWriteClient.insert(SparkRDDWriteClient.java:185)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:690)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:396)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.ingestOnce(HoodieDeltaStreamer.java:876)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
    at com.onehouse.hudi.OnehouseDeltaStreamer$MultiTableSyncService.lambda$null$1(OnehouseDeltaStreamer.java:319)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hudi.exception.HoodieException: Failed to read schema/check compatibility for base path s3a://onehouse-customer-bucket-2451e78f/data-lake/chandra_data_lake_default/xml_flatten_struct_test
    at org.apache.hudi.table.HoodieTable.validateSchema(HoodieTable.java:830)
    at org.apache.hudi.table.HoodieTable.validateInsertSchema(HoodieTable.java:849)
    ... 10 more
Caused by: java.lang.NullPointerException
    at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1158)
    at org.apache.avro.Schema$Parser.parse(Schema.java:1418)
    at org.apache.hudi.avro.HoodieAvroUtils.createHoodieWriteSchema(HoodieAvroUtils.java:302)
    at org.apache.hudi.table.HoodieTable.validateSchema(HoodieTable.java:826)
    ... 11 more{quote}

> Spark write client fails when write schema is null
> --------------------------------------------------
>
>                 Key: HUDI-6627
>                 URL: https://issues.apache.org/jira/browse/HUDI-6627
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Vinish Reddy
>            Priority: Minor
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
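
For illustration, the skip-when-null guard proposed in the description might look roughly like the sketch below. It is not the actual Hudi change: the class, the method names, and the exact-match "compatibility" rule are all hypothetical stand-ins, and the parser is simulated with a method that, like the one in the stack trace, throws a NullPointerException on a null schema string.

```java
import java.util.Objects;

// Hypothetical sketch: none of these names are Hudi APIs.
class SchemaValidationSketch {

    // Stand-in for a schema parser that throws NullPointerException
    // when handed a null schema string.
    static String parseSchema(String schemaJson) {
        Objects.requireNonNull(schemaJson, "schema string is null");
        return schemaJson.trim();
    }

    // Returns true if validation ran and passed, false if it was skipped
    // because no writer schema was available (e.g. the source returned no data).
    static boolean validateSchema(String writerSchemaJson, String tableSchemaJson) {
        if (writerSchemaJson == null || writerSchemaJson.isEmpty()) {
            return false; // nothing to validate, so skip instead of parsing null
        }
        String writerSchema = parseSchema(writerSchemaJson);
        String tableSchema = parseSchema(tableSchemaJson);
        // Placeholder rule (exact match); a real check would compare fields and types.
        if (!writerSchema.equals(tableSchema)) {
            throw new IllegalStateException("Writer schema is incompatible with table schema");
        }
        return true;
    }

    public static void main(String[] args) {
        String tableSchema = "{\"type\":\"string\"}";
        System.out.println(validateSchema(null, tableSchema));        // prints "false"
        System.out.println(validateSchema(tableSchema, tableSchema)); // prints "true"
    }
}
```

Returning a flag rather than throwing keeps an empty batch from failing the whole sync; whether the real fix should also log the skip is left open here.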