Re: [I] [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering [hudi]
subash-metica commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1973269182

@cbomgit - Which version are you using? Unfortunately, I had to stop using multi-writer for now to circumvent this problem. I am not sure whether the issue still exists in 0.14 - I might upgrade and test it later.
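For context, a minimal sketch of what falling back to a single writer looks like for a Spark datasource write. The dataset, table name, and path are placeholders; `single_writer` is Hudi's default concurrency mode, so this mainly amounts to removing the OCC options:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;

public class SingleWriterWrite {
  // Write without OCC: one writer at a time, with Hudi's default eager rollback
  // of failed writes. This avoids the multi-writer code path entirely.
  static void write(Dataset<Row> df, String tablePath) {
    df.write()
      .format("hudi")
      .option("hoodie.table.name", "my_table")                  // placeholder table name
      .option("hoodie.write.concurrency.mode", "single_writer") // disable OCC (the default)
      .option("hoodie.cleaner.policy.failed.writes", "EAGER")   // default policy for single writer
      .mode(SaveMode.Append)
      .save(tablePath);
  }
}
```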
cbomgit commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1973206032

Any update on a root cause/fix? We are suddenly facing a similar issue. We use multi-writer with OCC; each writer writes to distinct partitions and uses insert_overwrite.
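A hedged sketch of the write options behind the setup described above (multi-writer OCC with insert_overwrite). The lock provider shown is the filesystem-based one used elsewhere in this thread and may differ from the reporter's actual choice; the table name and path are placeholders:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;

public class OccInsertOverwriteWrite {
  // Each concurrent job overwrites its own distinct partitions under OCC.
  static void write(Dataset<Row> df, String tablePath) {
    df.write()
      .format("hudi")
      .option("hoodie.table.name", "my_table")                                   // placeholder
      .option("hoodie.datasource.write.operation", "insert_overwrite")           // replace matching partitions
      .option("hoodie.write.concurrency.mode", "optimistic_concurrency_control") // enable OCC
      .option("hoodie.cleaner.policy.failed.writes", "LAZY")                     // required with OCC
      .option("hoodie.write.lock.provider",
          "org.apache.hudi.client.transaction.lock.FileSystemBasedLockProvider") // assumed provider
      .mode(SaveMode.Append)
      .save(tablePath);
  }
}
```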
KnightChess commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1892033786

The internal fix we use is quite aggressive: we directly catch and handle the exception during the reading process. Our version is now 0.13.1.
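The patch itself is not shared in the comment; below is a sketch of what such a catch-and-skip workaround could look like, assuming the `ClusteringUtils.getClusteringPlan(metaClient, instant)` call visible in the stack trace further down. The wrapper name and the decision to treat an unreadable plan file as "no plan" are assumptions, not the actual internal fix, and note that swallowing the exception can mask real timeline corruption:

```java
import org.apache.hudi.avro.model.HoodieClusteringPlan;
import org.apache.hudi.common.table.HoodieTableMetaClient;
import org.apache.hudi.common.table.timeline.HoodieInstant;
import org.apache.hudi.common.util.ClusteringUtils;
import org.apache.hudi.common.util.Option;
import org.apache.hudi.common.util.collection.Pair;
import org.apache.hudi.exception.HoodieIOException;

public class ClusteringPlanWorkaround {
  // Hypothetical wrapper: return the pending clustering plan, or empty if the
  // .replacecommit.requested file has vanished (e.g. removed by a concurrent
  // rollback) and can no longer be read.
  static Option<Pair<HoodieInstant, HoodieClusteringPlan>> safeGetClusteringPlan(
      HoodieTableMetaClient metaClient, HoodieInstant instant) {
    try {
      return ClusteringUtils.getClusteringPlan(metaClient, instant);
    } catch (HoodieIOException e) {
      // Skip this instant instead of failing the whole timeline scan.
      return Option.empty();
    }
  }
}
```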
subash-metica commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1878374530

Hi, I am facing the issue again. It happens on random instants, with no common pattern that I can see.

Hudi version: 0.13.1

The error stack trace:

```
org.apache.hudi.exception.HoodieIOException: Could not read commit details from s3:///.hoodie/20240103002413315.replacecommit.requested
	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.readDataFromPath(HoodieActiveTimeline.java:824) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.getInstantDetails(HoodieActiveTimeline.java:310) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.common.util.ClusteringUtils.getRequestedReplaceMetadata(ClusteringUtils.java:93) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.common.util.ClusteringUtils.getClusteringPlan(ClusteringUtils.java:109) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.client.BaseHoodieTableServiceClient.lambda$getInflightTimelineExcludeCompactionAndClustering$7(BaseHoodieTableServiceClient.java:595) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:178) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[?:?]
	at org.apache.hudi.common.table.timeline.HoodieDefaultTimeline.<init>(HoodieDefaultTimeline.java:58) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.common.table.timeline.HoodieDefaultTimeline.filter(HoodieDefaultTimeline.java:236) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.client.BaseHoodieTableServiceClient.getInflightTimelineExcludeCompactionAndClustering(BaseHoodieTableServiceClient.java:593) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.client.BaseHoodieTableServiceClient.getInstantsToRollback(BaseHoodieTableServiceClient.java:737) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.client.BaseHoodieTableServiceClient.rollbackFailedWrites(BaseHoodieTableServiceClient.java:706) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.client.BaseHoodieWriteClient.lambda$startCommitWithTime$97cdbdca$1(BaseHoodieWriteClient.java:844) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.common.util.CleanerUtils.rollbackFailedWrites(CleanerUtils.java:156) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.client.BaseHoodieWriteClient.startCommitWithTime(BaseHoodieWriteClient.java:843) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.client.BaseHoodieWriteClient.startCommitWithTime(BaseHoodieWriteClient.java:836) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:371) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:151) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:104) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107) ~[spark-catalyst_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:250) ~[spark-s
```
subash-metica commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1859967411

Hi @KnightChess, @xushiyan, and @jjtjiang - have you found a fix or a workaround for this issue?
jjtjiang commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1846470553

@ad1happy2go I also face this problem.

Version: Hudi 0.12.3

How to reproduce the issue: run the insert overwrite SQL below twice against a large table. In my case the source has 114,800 rows; with fewer rows (e.g. 100) the issue does not reproduce.

DDL:

```sql
create table temp_db.ods_cis_corp_history_profile_hudi_t1_20231208 (
  `_hoodie_is_deleted` BOOLEAN,
  `t_pre_combine_field` long,
  order_type int,
  order_no int,
  profile_no int,
  profile_type string,
  profile_cat string,
  u_version string,
  order_line_no int,
  profile_c string,
  profile_i int,
  profile_f decimal(20,8),
  profile_d timestamp,
  active string,
  entry_datetime timestamp,
  entry_id int,
  h_version int
) USING hudi
TBLPROPERTIES (
  'hoodie.write.concurrency.mode' = 'optimistic_concurrency_control',
  'hoodie.cleaner.policy.failed.writes' = 'LAZY',
  'hoodie.write.lock.provider' = 'org.apache.hudi.client.transaction.lock.FileSystemBasedLockProvider',
  'hoodie.write.lock.filesystem.expire' = 5,
  'primaryKey' = 'order_no,profile_type,profile_no,order_type,profile_cat',
  'type' = 'cow',
  'preCombineField' = 't_pre_combine_field'
)
CLUSTERED BY (order_no, profile_type, profile_no, order_type, profile_cat) INTO 2 BUCKETS;
```

SQL (run twice):

```sql
insert overwrite table temp_db.ods_cis_corp_history_profile_hudi_t1_20231208
select false, 1, order_type, order_no, profile_no, profile_type, profile_cat,
       u_version, order_line_no, profile_c, profile_i, profile_f, profile_d,
       active, entry_datetime, entry_id, h_version
from temp_db.ods_cis_dbo_history_profile_tmp;
```

.hoodie dir file list:

```
.hoodie/.aux
.hoodie/.heartbeat
.hoodie/.schema
.hoodie/.temp
.hoodie/20231207055239027.replacecommit
.hoodie/20231207055239027.replacecommit.inflight
.hoodie/20231207055239027.replacecommit.requested
.hoodie/20231207084620796.replacecommit
.hoodie/20231207084620796.replacecommit.inflight
.hoodie/20231207084620796.replacecommit.requested
.hoodie/20231207100918624.rollback
.hoodie/20231207100918624.rollback.inflight
.hoodie/20231207100918624.rollback.requested
.hoodie/20231207100923823.rollback
.hoodie/20231207100923823.rollback.inflight
.hoodie/20231207100923823.rollback.requested
.hoodie/20231207102003686.replacecommit
.hoodie/20231207102003686.replacecommit.inflight
.hoodie/20231207102003686.replacecommit.requested
.hoodie/archived
.hoodie/hoodie.properties
.hoodie/metadata
```

As we can see, there is no file 20231207071610343.replacecommit.requested, but the program needs to read that file, so it fails. This puzzles me.
hoodie.properties:

```
hoodie.table.precombine.field=t_pre_combine_field
hoodie.datasource.write.drop.partition.columns=false
hoodie.table.type=COPY_ON_WRITE
hoodie.archivelog.folder=archived
hoodie.timeline.layout.version=1
hoodie.table.version=5
hoodie.table.metadata.partitions=files
hoodie.table.recordkey.fields=order_no,profile_type,profile_no,order_type,profile_cat
hoodie.database.name=temp_db
hoodie.datasource.write.partitionpath.urlencode=false
hoodie.table.keygenerator.class=org.apache.hudi.keygen.NonpartitionedKeyGenerator
hoodie.table.name=ods_cis_corp_history_profile_hudi_t1_20231207
hoodie.datasource.write.hive_style_partitioning=true
hoodie.table.checksum=2702244832
hoodie.table.create.schema={"type"\:"record","name"\:"ods_cis_corp_history_profile_hudi_t1_20231207_record","namespace"\:"hoodie.ods_cis_corp_history_profile_hudi_t1_20231207","fields"\:[{"name"\:"_hoodie_commit_time","type"\:["string","null"]},{"name"\:"_hoodie_commit_seqno","type"\:["string","null"]},{"name"\:"_hoodie_record_key","type"\:["string","null"]},{"name"\:"_hoodie_partition_path","type"\:["string","null"]},{"name"\:"_hoodie_file_name","type"\:["string","null"]},{"name"\:"_hoodie_is_deleted","type"\:["boolean","null"]},{"name"\:"t_pre_combine_field","type"\:["long","null"]},{"name"\:"order_type","type"\:["int","null"]},{"name"\:"order_no","type"\:["int","null"]},{"name"\:"profile_n
```
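Given the missing `.replacecommit.requested` file observed above, a small diagnostic sketch (a hypothetical helper, not part of Hudi) that lists the `.hoodie` timeline and flags inflight replacecommits whose plan file is absent:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TimelineAudit {
  // Usage: pass the Hudi table base path as the first argument.
  public static void main(String[] args) throws Exception {
    Path hoodieDir = new Path(args[0], ".hoodie");
    FileSystem fs = hoodieDir.getFileSystem(new Configuration());
    for (FileStatus st : fs.listStatus(hoodieDir)) {
      String name = st.getPath().getName();
      if (name.endsWith(".replacecommit.inflight")) {
        // An inflight replacecommit should have a matching .requested plan file.
        Path requested = new Path(hoodieDir, name.replace(".inflight", ".requested"));
        if (!fs.exists(requested)) {
          System.out.println("Missing plan file: " + requested);
        }
      }
    }
  }
}
```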