hudi-bot opened a new issue, #15850:
URL: https://github.com/apache/hudi/issues/15850
*Background:*
Inserting data into a Hudi table through Spark SQL fails with *HoodieException: (Part -)
field not found in record. Acceptable fields were :[uuid, name, price]*
{code:bash}
......
at org.apache.hudi.index.simple.HoodieSimpleIndex.fetchRecordLocationsForAffectedPartitions(HoodieSimpleIndex.java:142)
at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocationInternal(HoodieSimpleIndex.java:113)
at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocation(HoodieSimpleIndex.java:91)
at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:51)
at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:34)
at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:53)
... 52 more
Caused by: org.apache.hudi.exception.HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]
at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:530)
at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$write$11(HoodieSparkSqlWriter.scala:305)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1509)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20230317222153522
at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:64)
{code}
*Steps to Reproduce:*
{code:sql}
-- 1. Create a table without a preCombineKey
CREATE TABLE default.test_hudi_default (
  uuid int,
  name string,
  price double
) USING hudi;
-- 2. Set the write operation to upsert
set hoodie.datasource.write.operation=upsert;
-- 3. Insert data; the exception occurs here
insert into default.test_hudi_default select 1, 'name1', 1.1;
{code}
*Root Cause:*
Hudi does not support upsert on a table without a preCombineKey, but the
exception message does not say so and is likely to confuse users.
*Improvement:*
We can check the user-configured write operation and provide a more specific
exception message, which would help users understand immediately what is wrong.
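As a rough illustration of the kind of check proposed here, the sketch below validates the write options up front and fails with an explicit message. It is not Hudi's actual implementation: the class `WriteOperationCheck`, its `validate` method, and the message wording are hypothetical, and the use of `hoodie.datasource.write.precombine.field` as the option carrying the preCombine field is an assumption not taken from this report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hudi: shows an early, explicit validation.
final class WriteOperationCheck {
    // This key appears in the reproduction steps of this report.
    static final String OPERATION_KEY = "hoodie.datasource.write.operation";
    // Assumed key for the preCombine field; not taken from this report.
    static final String PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field";

    // Returns the trimmed option value, or an empty string when it is missing or null.
    private static String valueOrEmpty(Map<String, String> options, String key) {
        String value = options.get(key);
        return value == null ? "" : value.trim();
    }

    // Throws a descriptive error when upsert is requested without a preCombine field.
    static void validate(Map<String, String> options) {
        String operation = valueOrEmpty(options, OPERATION_KEY);
        String preCombine = valueOrEmpty(options, PRECOMBINE_KEY);
        if ("upsert".equalsIgnoreCase(operation) && preCombine.isEmpty()) {
            throw new IllegalArgumentException(
                "Write operation 'upsert' requires a preCombine field, but none is "
                + "configured for this table. Set a preCombine field on the table "
                + "or choose another write operation.");
        }
    }

    public static void main(String[] args) {
        // Mirrors the reproduction: upsert requested, no preCombine field configured.
        Map<String, String> options = new HashMap<>();
        options.put(OPERATION_KEY, "upsert");
        try {
            validate(options);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running the check before any records are processed would surface the misconfiguration at the start of the write instead of from inside a shuffle task, which is where the current stack trace originates.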
## JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-5949
- Type: Improvement