rubenssoto commented on issue #2563: URL: https://github.com/apache/hudi/issues/2563#issuecomment-783862335
```
Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file for key 7176859 from old file s3://dl/courier_api/customer_address/3ee388f2-fa45-437a-a279-d9e3e3369bbd-0_9-137-2635_20210223033155.parquet to new file s3://ld/courier_api/customer_address/3ee388f2-fa45-437a-a279-d9e3e3369bbd-0_9-377-7189_20210223035129.parquet with writerSchema {
  "type" : "record",
  "name" : "customer_address_record",
  "namespace" : "hoodie.customer_address",
  "fields" : [
    { "name" : "_hoodie_commit_time", "type" : [ "null", "string" ], "doc" : "", "default" : null },
    { "name" : "_hoodie_commit_seqno", "type" : [ "null", "string" ], "doc" : "", "default" : null },
    { "name" : "_hoodie_record_key", "type" : [ "null", "string" ], "doc" : "", "default" : null },
    { "name" : "_hoodie_partition_path", "type" : [ "null", "string" ], "doc" : "", "default" : null },
    { "name" : "_hoodie_file_name", "type" : [ "null", "string" ], "doc" : "", "default" : null },
    { "name" : "Op", "type" : [ "string", "null" ] },
    { "name" : "LineCreatedTimestamp", "type" : [ "string", "null" ] },
    { "name" : "created_date", "type" : [ { "type" : "long", "logicalType" : "timestamp-micros" }, "null" ] },
    { "name" : "updated_date", "type" : [ { "type" : "long", "logicalType" : "timestamp-micros" }, "null" ] },
    { "name" : "id", "type" : [ "int", "null" ] },
    { "name" : "address_type", "type" : [ "string", "null" ] },
    { "name" : "name", "type" : [ "string", "null" ] },
    { "name" : "customer_email", "type" : [ "string", "null" ] },
    { "name" : "street", "type" : [ "string", "null" ] },
    { "name" : "number", "type" : [ "string", "null" ] },
    { "name" : "address_line2", "type" : [ "string", "null" ] },
    { "name" : "city", "type" : [ "string", "null" ] },
    { "name" : "province", "type" : [ "string", "null" ] },
    { "name" : "zipcode", "type" : [ "string", "null" ] },
    { "name" : "country", "type" : [ "string", "null" ] },
    { "name" : "neighborhood", "type" : [ "string", "null" ] },
    { "name" : "latitude", "type" : [ "double", "null" ] },
    { "name" : "longitude", "type" : [ "double", "null" ] },
    { "name" : "commit_version", "type" : "long" },
    { "name" : "_hoodie_is_deleted", "type" : "boolean" }
  ]
}
	at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:256)
	at org.apache.hudi.table.action.commit.AbstractMergeHelper$UpdateHandler.consumeOneRecord(AbstractMergeHelper.java:122)
	at org.apache.hudi.table.action.commit.AbstractMergeHelper$UpdateHandler.consumeOneRecord(AbstractMergeHelper.java:112)
	at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:37)
	at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:121)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	... 3 more
Caused by: java.lang.RuntimeException: Null-value for required field: commit_version
	at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:194)
	at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:165)
	at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
	at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:299)
	at org.apache.hudi.io.storage.HoodieParquetWriter.writeAvro(HoodieParquetWriter.java:94)
	at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:251)
	... 8 more
Driver stacktrace:
	at jobs.TableProcessor.start(TableProcessor.scala:104)
	at TableProcessorWrapper$.$anonfun$main$2(TableProcessorWrapper.scala:23)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
	at scala.util.Success.$anonfun$map$1(Try.scala:255)
	at scala.util.Success.map(Try.scala:213)
	at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
	at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
	at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
ApplicationMaster host: ip-10-0-53-212.us-west-2.compute.internal
ApplicationMaster RPC port: 41723
queue: default
start time: 1614052265461
final status: FAILED
tracking URL: http://ip-10-0-49-168.us-west-2.compute.internal:20888/proxy/application_1613496813774_2805/
user: hadoop
```

@nsivabalan I hit this error. My table originally did not have the column `commit_version` (it is a column that I created). I then added `commit_version` in my script, and the new data tries to update the old records. Is this problem addressed too? Thank you so much.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
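For context, the root cause in the trace is `Null-value for required field: commit_version`: in the writer schema above, `commit_version` is declared as a plain `"type" : "long"` (non-nullable, no default), so when Hudi rewrites pre-existing records that were created before the column existed, there is no value it can supply for the field. Declaring the new column as nullable (a union with `"null"`), or backfilling a default before the upsert, avoids this. The sketch below illustrates the backfill idea in plain Java on a field map; the names and the default value are illustrative, not the Hudi API:

```java
import java.util.HashMap;
import java.util.Map;

public class BackfillSketch {
    // Illustrative default for rows written before commit_version existed.
    static final long DEFAULT_COMMIT_VERSION = 0L;

    // Upgrading an old record to the new schema must supply a value for the
    // required field when it is missing; passing a null through to the
    // Parquet/Avro writer is what produces the reported error.
    static Map<String, Object> upgrade(Map<String, Object> record) {
        Map<String, Object> out = new HashMap<>(record);
        out.putIfAbsent("commit_version", DEFAULT_COMMIT_VERSION);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> oldRow = new HashMap<>();
        oldRow.put("id", 7176859); // record written before the column was added
        System.out.println(upgrade(oldRow).get("commit_version")); // prints 0
    }
}
```

In a Spark job, the equivalent step before writing to Hudi would be something along the lines of `df.withColumn("commit_version", coalesce(col("commit_version"), lit(0L)))` (again a sketch, under the assumption the column may be absent or null in the incoming frame), or simply defining the field as nullable so schema resolution fills in a null default.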