[ https://issues.apache.org/jira/browse/HUDI-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhaojing Yu updated HUDI-4330:
------------------------------
    Fix Version/s: 0.13.0

> NPE when trying to upsert into a dataset with no Meta Fields
> ------------------------------------------------------------
>
>                 Key: HUDI-4330
>                 URL: https://issues.apache.org/jira/browse/HUDI-4330
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Alexey Kudinkin
>            Assignee: Raymond Xu
>            Priority: Critical
>             Fix For: 0.13.0
>
>
> When trying to upsert into a dataset with Meta Fields disabled, you
> will encounter an obscure NPE like the one below:
> {code:java}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 25 in stage 20.0 failed 4 times, most recent failure: Lost task 25.3 in stage 20.0 (TID 4110) (ip-172-31-20-53.us-west-2.compute.internal executor 7): java.lang.RuntimeException: org.apache.hudi.exception.HoodieIndexException: Error checking bloom filter index.
>     at org.apache.hudi.client.utils.LazyIterableIterator.next(LazyIterableIterator.java:121)
>     at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:46)
>     at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
>     at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:513)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:140)
>     at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
>     at org.apache.spark.scheduler.Task.run(Task.scala:131)
>     at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.hudi.exception.HoodieIndexException: Error checking bloom filter index.
>     at org.apache.hudi.index.bloom.HoodieBloomIndexCheckFunction$LazyKeyCheckIterator.computeNext(HoodieBloomIndexCheckFunction.java:110)
>     at org.apache.hudi.index.bloom.HoodieBloomIndexCheckFunction$LazyKeyCheckIterator.computeNext(HoodieBloomIndexCheckFunction.java:60)
>     at org.apache.hudi.client.utils.LazyIterableIterator.next(LazyIterableIterator.java:119)
>     ... 16 more
> Caused by: java.lang.NullPointerException
>     at org.apache.hudi.io.HoodieKeyLookupHandle.addKey(HoodieKeyLookupHandle.java:88)
>     at org.apache.hudi.index.bloom.HoodieBloomIndexCheckFunction$LazyKeyCheckIterator.computeNext(HoodieBloomIndexCheckFunction.java:92)
>     ... 18 more {code}
> Instead, we could be more explicit as to why this could have happened
> (meta-fields disabled -> no bloom filter created -> unable to do upserts)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
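
The "more explicit" failure suggested in the issue description (meta-fields disabled -> no bloom filter created -> unable to do upserts) could look roughly like the fail-fast check below. This is only a minimal, self-contained sketch and not Hudi code: `BloomLookupGuard`, `KeyFilter`, and `requireFilter` are hypothetical names invented for illustration, and where such a check would actually belong in the code base is left open.

```java
import java.util.Objects;

/**
 * Hypothetical sketch, not part of Hudi: fail fast with a descriptive message
 * instead of letting a missing bloom filter surface later as a bare NPE.
 */
final class BloomLookupGuard {

    /** Stand-in for a bloom filter; only the membership check matters here. */
    interface KeyFilter {
        boolean mightContain(String recordKey);
    }

    /**
     * Returns the filter when a key lookup can proceed. Otherwise throws an
     * exception that spells out the chain of causes from the issue description.
     */
    static KeyFilter requireFilter(boolean metaFieldsEnabled, KeyFilter filter, String fileId) {
        if (!metaFieldsEnabled) {
            throw new IllegalStateException(
                "Cannot check record keys against file " + fileId
                    + ": meta fields are disabled, so no bloom filter was created"
                    + " and upserts through the bloom index are not possible");
        }
        // Meta fields are enabled, so a missing filter is a different problem.
        return Objects.requireNonNull(filter, "Bloom filter missing for file " + fileId);
    }

    public static void main(String[] args) {
        KeyFilter acceptAll = key -> true;
        // Normal path: the filter is handed back and can be queried.
        System.out.println(requireFilter(true, acceptAll, "file-1").mightContain("key-1"));
        // Failure path: the message explains why the lookup cannot work.
        try {
            requireFilter(false, null, "file-2");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check of this kind, the innermost cause in the trace would be an exception naming the disabled meta fields rather than a `java.lang.NullPointerException` raised from `HoodieKeyLookupHandle.addKey`.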