zlinsc opened a new issue, #10187: URL: https://github.com/apache/hudi/issues/10187
### Describe the problem you faced

I ran two Flink jobs that write the same rows into a MOR table and got a `NumberFormatException`. The error shows that the job had an incorrectly formatted log file name.

My Flink Hudi options:

```scala
FlinkOptions.PATH.key() -> "hdfs://master-node:50070/tmp/order_hudi",
FlinkOptions.TABLE_TYPE.key() -> HoodieTableType.MERGE_ON_READ.name(),
FlinkOptions.OPERATION.key() -> WriteOperationType.UPSERT.value(),
HoodieIndexConfig.INDEX_TYPE.key() -> BUCKET.name,
HoodieIndexConfig.BUCKET_INDEX_ENGINE_TYPE.key() -> BucketIndexEngineType.CONSISTENT_HASHING.name(),
HoodieLayoutConfig.LAYOUT_TYPE.key() -> HoodieStorageLayout.LayoutType.BUCKET.name,
HoodieLayoutConfig.LAYOUT_PARTITIONER_CLASS_NAME.key() -> HoodieLayoutConfig.SIMPLE_BUCKET_LAYOUT_PARTITIONER_CLASS_NAME,
HoodieWriteConfig.NUM_RETRIES_ON_CONFLICT_FAILURES.key() -> "3",
HoodieWriteConfig.WRITE_CONCURRENCY_MODE.key() -> "optimistic_concurrency_control",
HoodieCleanConfig.FAILED_WRITES_CLEANER_POLICY.key() -> "LAZY",
HoodieLockConfig.LOCK_PROVIDER_CLASS_NAME.key() -> "org.apache.hudi.hive.transaction.lock.HiveMetastoreBasedLockProvider",
HoodieLockConfig.WRITE_CONFLICT_RESOLUTION_STRATEGY_CLASS_NAME.key() -> classOf[BucketIndexConcurrentFileWritesConflictResolutionStrategy].getName,
LockConfiguration.HIVE_DATABASE_NAME_PROP_KEY -> "hudi_db",
LockConfiguration.HIVE_TABLE_NAME_PROP_KEY -> "order",
LockConfiguration.HIVE_METASTORE_URI_PROP_KEY -> "thrift://slave-node:9083",
```

### Steps to reproduce the behavior:

1. Set the bucket index and OCC options on the Flink Hudi sink builder.
2. Start the Flink job twice.
3. Wait for data to be generated; the exception is thrown after a checkpoint.

### Environment Description

* Hudi version : 0.11 (master in 2022-03-23 UTC+8)
* Flink version : 1.17
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : yes

### Additional context

<img width="662" alt="image" src="https://github.com/apache/hudi/assets/5824531/61951f8a-6fc7-4f3c-8ad6-1a379b2143bf">

### Stacktrace

org.apache.flink.util.FlinkException: Global failure triggered by OperatorCoordinator for 'consistent_bucket_write: default_database.order_hudi -> Sink: clean_commits' (operator fcdc4c0d75f8e2dd2bb05e2b43035080).
    at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.failJob(OperatorCoordinatorHolder.java:600)
    at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$start$0(StreamWriteOperatorCoordinator.java:196)
    at org.apache.hudi.sink.utils.NonThrownExecutor.handleException(NonThrownExecutor.java:142)
    at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:133)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hudi.exception.HoodieException: Executor executes action [initialize instant 20231128100512025] error
    ... 6 more
Caused by: java.lang.NumberFormatException: For input string: "74e26508"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.apache.hudi.index.bucket.BucketIdentifier.bucketIdFromFileId(BucketIdentifier.java:79)
    at org.apache.hudi.client.transaction.BucketIndexConcurrentFileWritesConflictResolutionStrategy.lambda$hasConflict$0(BucketIndexConcurrentFileWritesConflictResolutionStrategy.java:44)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1580)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.apache.hudi.client.transaction.BucketIndexConcurrentFileWritesConflictResolutionStrategy.hasConflict(BucketIndexConcurrentFileWritesConflictResolutionStrategy.java:45)
    at org.apache.hudi.client.utils.TransactionUtils.lambda$resolveWriteConflictIfAny$0(TransactionUtils.java:86)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at org.apache.hudi.client.utils.TransactionUtils.resolveWriteConflictIfAny(TransactionUtils.java:83)
    at org.apache.hudi.client.BaseHoodieClient.resolveWriteConflict(BaseHoodieClient.java:202)
    at org.apache.hudi.client.BaseHoodieWriteClient.preCommit(BaseHoodieWriteClient.java:346)
    at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:232)
    at org.apache.hudi.client.HoodieFlinkWriteClient.commit(HoodieFlinkWriteClient.java:112)
    at org.apache.hudi.client.HoodieFlinkWriteClient.commit(HoodieFlinkWriteClient.java:75)
    at org.apache.hudi.client.BaseHoodieWriteClient.commit(BaseHoodieWriteClient.java:201)
    at org.apache.hudi.sink.StreamWriteOperatorCoordinator.doCommit(StreamWriteOperatorCoordinator.java:564)
    at org.apache.hudi.sink.StreamWriteOperatorCoordinator.commitInstant(StreamWriteOperatorCoordinator.java:540)
    at org.apache.hudi.sink.StreamWriteOperatorCoordinator.commitInstant(StreamWriteOperatorCoordinator.java:509)
    at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$initInstant$6(StreamWriteOperatorCoordinator.java:419)
    at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:130)
    ... 3 more
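
The `NumberFormatException` in the stack trace is raised by `Integer.parseInt` inside `BucketIdentifier.bucketIdFromFileId`, with the input `"74e26508"`. The following is a minimal Java sketch of that failure, not the Hudi source. It assumes (this is not confirmed by the report) that the bucket ID is read by parsing the first eight characters of the file ID as a decimal integer, which works for a zero-padded numeric prefix but not for a hex-looking prefix such as the one a consistent-hashing file ID might carry. The class `BucketIdParseSketch`, the helper `tryBucketId`, and both full file IDs are hypothetical; only the prefix `74e26508` comes from the report.

```java
import java.util.OptionalInt;

// Hypothetical stand-in for the parsing step seen in the stack trace.
final class BucketIdParseSketch {

    // Assumption: the bucket ID is the first 8 characters of the file ID,
    // parsed as a base-10 integer (what the Integer.parseInt frame suggests).
    static int bucketIdFromFileId(String fileId) {
        return Integer.parseInt(fileId.substring(0, 8));
    }

    // Hypothetical defensive variant: reports "no numeric bucket ID"
    // instead of letting the NumberFormatException propagate.
    static OptionalInt tryBucketId(String fileId) {
        try {
            return OptionalInt.of(bucketIdFromFileId(fileId));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        // Made-up file ID with a zero-padded numeric prefix.
        String numericPrefix = "00000007-0000-4000-8000-000000000000";
        // Made-up file ID; only the prefix "74e26508" is taken from the report.
        String hexPrefix = "74e26508-0000-4000-8000-000000000000";

        System.out.println(tryBucketId(numericPrefix)); // parses to bucket 7
        System.out.println(tryBucketId(hexPrefix));     // empty: 'e' is not a decimal digit
    }
}
```

If this reading is right, the conflict resolution strategy in the options above would need file IDs with a numeric bucket prefix, while the `CONSISTENT_HASHING` engine in the same options may be producing file IDs of a different shape.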