[ https://issues.apache.org/jira/browse/HUDI-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sagar Sumit updated HUDI-4238: ------------------------------ Fix Version/s: 0.12.1 (was: 0.12.0) > Revisit TestCOWDataSourceStorage#testCopyOnWriteStorage > ------------------------------------------------------- > > Key: HUDI-4238 > URL: https://issues.apache.org/jira/browse/HUDI-4238 > Project: Apache Hudi > Issue Type: Task > Components: tests-ci > Reporter: Rahil Chertara > Priority: Major > Fix For: 0.12.1 > > > Within the pr (Support Hadoop 3.x Hive 3.x and Spark 3.x) > [https://github.com/apache/hudi/pull/5786,|https://github.com/apache/hudi/pull/5786] > > > The testCopyOnWriteStorage has an issue with the test case where `nation` is > added to the recordKeys. When debugging further it seems that this is due to > an issue with avro 1.10.2 being used since it adds the following to the schema > ``` > "nation":"Canada" > ``` > instead of adding > ``` > "nation": { > "bytes":"Canada" > } > ``` > This leads to the exception later for this test case since when nation is > being retrieved from the record, since `getNestedFieldVal` expects the value > to be nested as opposed to a String. > ``` > > at > org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:514) > > at > org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldValAsString(HoodieAvroUtils.java:487) > > at org.apache.hudi.keygen.KeyGenUtils.getRecordKey(KeyGenUtils.java:96) > at > org.apache.hudi.keygen.ComplexAvroKeyGenerator.getRecordKey(ComplexAvroKeyGenerator.java:47) > > at > org.apache.hudi.keygen.ComplexKeyGenerator.getRecordKey(ComplexKeyGenerator.java:53) > > at org.apache.hudi.keygen.BaseKeyGenerator.getKey(BaseKeyGenerator.java:65) > at > org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$write$10(HoodieSparkSqlWriter.scala:279) > > at scala.collection.Iterator$$anon$10.next(Iterator.scala:459) > at > org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:224) > > at > org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:352) > > at > org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1498) > > at > org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1408) > > at > org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1472) > at > org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1295) > > at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:335) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) > at > org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59) > > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52) > at org.apache.spark.scheduler.Task.run(Task.scala:131) > at > org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506) > > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:750) > ``` -- This message was sent by Atlassian Jira (v8.20.10#820010)