hudi-bot opened a new issue, #15850:
URL: https://github.com/apache/hudi/issues/15850

   *Background:*
   
   We found that inserting data through Spark-Hudi can fail with *HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]*
   {code:bash}
     ......
        at org.apache.hudi.index.simple.HoodieSimpleIndex.fetchRecordLocationsForAffectedPartitions(HoodieSimpleIndex.java:142)
        at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocationInternal(HoodieSimpleIndex.java:113)
        at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocation(HoodieSimpleIndex.java:91)
        at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:51)
        at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:34)
        at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:53)
        ... 52 more
   Caused by: org.apache.hudi.exception.HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]
        at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:530)
        at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$write$11(HoodieSparkSqlWriter.scala:305)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
        at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
        at org.apache.spark.scheduler.Task.run(Task.scala:131)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1509)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20230317222153522
        at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:64)
   {code}
   *Steps to Reproduce:*
   {code:sql}
   -- 1. create a table without preCombineKey
   CREATE TABLE default.test_hudi_default (
     uuid int,
     name string,
     price double
   ) USING hudi;
   
   -- 2. config write operation to upsert
   set hoodie.datasource.write.operation=upsert;
   
   -- 3. insert data and exception occurs
   insert into default.test_hudi_default select 1, 'name1', 1.1;
   {code}
   
   *Root Cause:*
   Hudi does not support the upsert operation on a table created without a preCombineKey, but this generic exception message may confuse users, since it does not explain that the missing field is the preCombine field.
   
   *Improvement:*
   We can check the user-configured write operation and throw a more specific exception; this will help users understand immediately what went wrong.
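   One possible shape for such a check (a minimal sketch only; the class, method, and exception names below are hypothetical illustrations, not Hudi's actual API) is to validate the requested operation against the table's preCombine configuration before the write path starts tagging records:
   {code:java}
   // Hypothetical pre-write validation: fail fast with an actionable message
   // when upsert is requested but the table has no preCombine field configured.
   public final class WriteOperationValidator {

       /** Thrown when the requested operation conflicts with the table config. */
       public static class InvalidWriteOperationException extends RuntimeException {
           public InvalidWriteOperationException(String msg) {
               super(msg);
           }
       }

       /**
        * @param operation       value of hoodie.datasource.write.operation
        * @param preCombineField configured preCombine field, or null if absent
        */
       public static void validate(String operation, String preCombineField) {
           boolean needsPreCombine = "upsert".equalsIgnoreCase(operation);
           boolean hasPreCombine = preCombineField != null && !preCombineField.isEmpty();
           if (needsPreCombine && !hasPreCombine) {
               throw new InvalidWriteOperationException(
                   "Write operation 'upsert' requires a preCombine field, but this table "
                   + "was created without one. Either recreate the table with a "
                   + "preCombineField, or set hoodie.datasource.write.operation=insert.");
           }
       }
   }
   {code}
   With a check like this, the repro above would fail immediately at write setup with a message naming the missing preCombineKey, instead of surfacing a confusing "field not found in record" deep inside the index tagging path.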
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-5949
   - Type: Improvement

