Re: [I] [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering [hudi]

2024-03-01 Thread via GitHub


subash-metica commented on issue #7057:
URL: https://github.com/apache/hudi/issues/7057#issuecomment-1973269182

   @cbomgit - What version are you using?
   
   Unfortunately, I had to stop using multi-writer for now to work around this problem. I am not sure whether the issue exists in 0.14 as well; I might upgrade and test it later.



Re: [I] [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering [hudi]

2024-03-01 Thread via GitHub


cbomgit commented on issue #7057:
URL: https://github.com/apache/hudi/issues/7057#issuecomment-1973206032

   Any update on a root cause or fix? We suddenly started facing a similar issue. We run multi-writer with OCC; each writer writes to distinct partitions and uses insert_overwrite.
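
   For context, here is a minimal sketch (not from this thread) of the setup described above: a Spark datasource write using insert_overwrite with OCC enabled. The table name, paths, and key fields are hypothetical; the concurrency and lock settings reuse the keys that appear in the table properties later in this thread.

   ```
   import org.apache.spark.sql.Dataset;
   import org.apache.spark.sql.Row;
   import org.apache.spark.sql.SaveMode;
   import org.apache.spark.sql.SparkSession;

   public class OccInsertOverwriteWriter {
     public static void main(String[] args) {
       SparkSession spark = SparkSession.builder().appName("occ-writer").getOrCreate();
       Dataset<Row> df = spark.table("staging_db.partition_slice"); // hypothetical source

       df.write().format("hudi")
           // identity of the target table and records (hypothetical values)
           .option("hoodie.table.name", "my_hudi_table")
           .option("hoodie.datasource.write.recordkey.field", "id")
           .option("hoodie.datasource.write.precombine.field", "ts")
           .option("hoodie.datasource.write.partitionpath.field", "dt")
           // each writer overwrites only its own partitions
           .option("hoodie.datasource.write.operation", "insert_overwrite")
           // multi-writer OCC, matching the settings discussed in this issue
           .option("hoodie.write.concurrency.mode", "optimistic_concurrency_control")
           .option("hoodie.cleaner.policy.failed.writes", "LAZY")
           .option("hoodie.write.lock.provider",
               "org.apache.hudi.client.transaction.lock.FileSystemBasedLockProvider")
           .mode(SaveMode.Append)
           .save("s3://bucket/path/my_hudi_table"); // hypothetical base path
     }
   }
   ```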



Re: [I] [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering [hudi]

2024-01-15 Thread via GitHub


KnightChess commented on issue #7057:
URL: https://github.com/apache/hudi/issues/7057#issuecomment-1892033786

   The internal fix we use is quite aggressive: we directly catch and handle the exceptions during the reading process. We are on version 0.13.1 now.
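
   For reference, a hedged sketch of what such a catch-during-read workaround might look like; this is not KnightChess's actual patch. It wraps the ClusteringUtils.getClusteringPlan call that appears in the stack trace further down this thread and treats an unreadable pending replacecommit as absent (Hudi types and signatures as of 0.13.x; the wrapper class itself is illustrative).

   ```
   import org.apache.hudi.avro.model.HoodieClusteringPlan;
   import org.apache.hudi.common.table.HoodieTableMetaClient;
   import org.apache.hudi.common.table.timeline.HoodieInstant;
   import org.apache.hudi.common.util.ClusteringUtils;
   import org.apache.hudi.common.util.Option;
   import org.apache.hudi.common.util.collection.Pair;
   import org.apache.hudi.exception.HoodieException;

   public class LenientClusteringReader {
     // Returns the clustering plan if the instant files are still readable, or
     // Option.empty() if a concurrent rollback removed them mid-scan.
     public static Option<Pair<HoodieInstant, HoodieClusteringPlan>> tryGetPlan(
         HoodieTableMetaClient metaClient, HoodieInstant instant) {
       try {
         return ClusteringUtils.getClusteringPlan(metaClient, instant);
       } catch (HoodieException e) {
         // Covers HoodieIOException ("Could not read commit details ...") thrown
         // when the .replacecommit.requested file vanished between listing and
         // reading; skip the instant instead of failing the whole timeline scan.
         return Option.empty();
       }
     }
   }
   ```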



Re: [I] [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering [hudi]

2024-01-05 Thread via GitHub


subash-metica commented on issue #7057:
URL: https://github.com/apache/hudi/issues/7057#issuecomment-1878374530

   Hi,
   I am facing the issue again. It happens intermittently, with no common pattern that I could see.
   
   Hudi version: 0.13.1
   
   The error stack trace:
   
   ```
   org.apache.hudi.exception.HoodieIOException: Could not read commit details from s3:///.hoodie/20240103002413315.replacecommit.requested
       at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.readDataFromPath(HoodieActiveTimeline.java:824) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.getInstantDetails(HoodieActiveTimeline.java:310) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.common.util.ClusteringUtils.getRequestedReplaceMetadata(ClusteringUtils.java:93) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.common.util.ClusteringUtils.getClusteringPlan(ClusteringUtils.java:109) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.client.BaseHoodieTableServiceClient.lambda$getInflightTimelineExcludeCompactionAndClustering$7(BaseHoodieTableServiceClient.java:595) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:178) ~[?:?]
       at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625) ~[?:?]
       at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
       at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
       at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[?:?]
       at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
       at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[?:?]
       at org.apache.hudi.common.table.timeline.HoodieDefaultTimeline.<init>(HoodieDefaultTimeline.java:58) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.common.table.timeline.HoodieDefaultTimeline.filter(HoodieDefaultTimeline.java:236) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.client.BaseHoodieTableServiceClient.getInflightTimelineExcludeCompactionAndClustering(BaseHoodieTableServiceClient.java:593) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.client.BaseHoodieTableServiceClient.getInstantsToRollback(BaseHoodieTableServiceClient.java:737) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.client.BaseHoodieTableServiceClient.rollbackFailedWrites(BaseHoodieTableServiceClient.java:706) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.client.BaseHoodieWriteClient.lambda$startCommitWithTime$97cdbdca$1(BaseHoodieWriteClient.java:844) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.common.util.CleanerUtils.rollbackFailedWrites(CleanerUtils.java:156) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.client.BaseHoodieWriteClient.startCommitWithTime(BaseHoodieWriteClient.java:843) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.client.BaseHoodieWriteClient.startCommitWithTime(BaseHoodieWriteClient.java:836) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:371) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:151) ~[hudi-spark3-bundle_2.12-0.13.1-amzn-1.jar:0.13.1-amzn-1]
       at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
       at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
       at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
       at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
       at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:104) ~[spark-sql_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
       at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107) ~[spark-catalyst_2.12-3.4.1-amzn-0.jar:3.4.1-amzn-0]
       at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:250)
   ```
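
   Since the failure happens while startCommitWithTime rolls back failed writes, one application-side mitigation is to treat it as a transient race and retry the write. Below is a minimal sketch under stated assumptions: the HoodieIOException reaches the caller unwrapped, a bounded retry is acceptable, and the helper itself is hypothetical, not a Hudi API.

   ```
   import org.apache.hudi.exception.HoodieIOException;

   public final class RetryingWriter {
     // Re-runs the given Hudi write a bounded number of times when the
     // timeline read races with a concurrent writer's rollback.
     public static void writeWithRetry(Runnable hudiWrite, int maxAttempts)
         throws InterruptedException {
       for (int attempt = 1; ; attempt++) {
         try {
           hudiWrite.run();
           return;
         } catch (HoodieIOException e) {
           if (attempt >= maxAttempts) {
             throw e;
           }
           Thread.sleep(1000L * attempt); // simple backoff before the next attempt
         }
       }
     }
   }
   ```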
 

Re: [I] [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering [hudi]

2023-12-18 Thread via GitHub


subash-metica commented on issue #7057:
URL: https://github.com/apache/hudi/issues/7057#issuecomment-1859967411

   Hi @KnightChess, @xushiyan and @jjtjiang - have you found a fix or a workaround for this issue?



Re: [I] [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering [hudi]

2023-12-07 Thread via GitHub


jjtjiang commented on issue #7057:
URL: https://github.com/apache/hudi/issues/7057#issuecomment-1846470553

   @ad1happy2go
   I also face this problem.
   Version: Hudi 0.12.3
   How to reproduce the issue: just run the insert overwrite SQL below while inserting a big table.
   Here is my case:
   rows: 114800 (if the row count is smaller, e.g. 100, the issue cannot be reproduced)
   DDL:
   ```
   create table temp_db.ods_cis_corp_history_profile_hudi_t1_20231208 (
     `_hoodie_is_deleted` BOOLEAN,
     `t_pre_combine_field` long,
     order_type int,
     order_no int,
     profile_no int,
     profile_type string,
     profile_cat string,
     u_version string,
     order_line_no int,
     profile_c string,
     profile_i int,
     profile_f decimal(20,8),
     profile_d timestamp,
     active string,
     entry_datetime timestamp,
     entry_id int,
     h_version int)
   USING hudi
   TBLPROPERTIES (
     'hoodie.write.concurrency.mode' = 'optimistic_concurrency_control',
     'hoodie.cleaner.policy.failed.writes' = 'LAZY',
     'hoodie.write.lock.provider' = 'org.apache.hudi.client.transaction.lock.FileSystemBasedLockProvider',
     'hoodie.write.lock.filesystem.expire' = 5,
     'primaryKey' = 'order_no,profile_type,profile_no,order_type,profile_cat',
     'type' = 'cow',
     'preCombineField' = 't_pre_combine_field')
   CLUSTERED BY (order_no, profile_type, profile_no, order_type, profile_cat)
   INTO 2 BUCKETS;
   ```
   
   SQL (the same insert overwrite statement is run twice):
   ```
   insert overwrite table temp_db.ods_cis_corp_history_profile_hudi_t1_20231208
   select
     false,
     1,
     order_type,
     order_no,
     profile_no,
     profile_type,
     profile_cat,
     u_version,
     order_line_no,
     profile_c,
     profile_i,
     profile_f,
     profile_d,
     active,
     entry_datetime,
     entry_id,
     h_version
   from temp_db.ods_cis_dbo_history_profile_tmp;
   ```
   The .hoodie dir file list:
   ```
   .hoodie/.aux
   .hoodie/.heartbeat
   .hoodie/.schema
   .hoodie/.temp
   .hoodie/20231207055239027.replacecommit
   .hoodie/20231207055239027.replacecommit.inflight
   .hoodie/20231207055239027.replacecommit.requested
   .hoodie/20231207084620796.replacecommit
   .hoodie/20231207084620796.replacecommit.inflight
   .hoodie/20231207084620796.replacecommit.requested
   .hoodie/20231207100918624.rollback
   .hoodie/20231207100918624.rollback.inflight
   .hoodie/20231207100918624.rollback.requested
   .hoodie/20231207100923823.rollback
   .hoodie/20231207100923823.rollback.inflight
   .hoodie/20231207100923823.rollback.requested
   .hoodie/20231207102003686.replacecommit
   .hoodie/20231207102003686.replacecommit.inflight
   .hoodie/20231207102003686.replacecommit.requested
   .hoodie/archived
   .hoodie/hoodie.properties
   .hoodie/metadata
   ```
   
   
   As you can see, there is no 20231207071610343.replacecommit.requested file, but the program tries to read it, so it fails. This puzzles me.
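
   A plausible explanation is a list-then-read race: the timeline is listed first, each instant's file is read afterwards, and a concurrent writer's rollback can delete a .replacecommit.requested file in between. Below is a hedged illustration of that race using plain java.nio rather than Hudi's actual internals; the path is hypothetical.

   ```
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.List;
   import java.util.stream.Collectors;
   import java.util.stream.Stream;

   public class TimelineRaceDemo {
     public static void main(String[] args) throws IOException {
       Path hoodieDir = Path.of("/tmp/table/.hoodie"); // hypothetical table path

       // Step 1: writer B lists the pending replacecommit instant files.
       List<Path> requested;
       try (Stream<Path> files = Files.list(hoodieDir)) {
         requested = files
             .filter(p -> p.toString().endsWith(".replacecommit.requested"))
             .collect(Collectors.toList());
       }

       // Step 2 (the race): writer A rolls back its failed write here and
       // deletes one of the files writer B just listed.

       // Step 3: writer B reads each listed file and fails on the deleted one,
       // the equivalent of "Could not read commit details".
       for (Path p : requested) {
         byte[] bytes = Files.readAllBytes(p); // throws NoSuchFileException if gone
         System.out.println(p + ": " + bytes.length + " bytes");
       }
     }
   }
   ```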
   
   hoodie.properties:
   ```
   hoodie.table.precombine.field=t_pre_combine_field
   hoodie.datasource.write.drop.partition.columns=false
   hoodie.table.type=COPY_ON_WRITE
   hoodie.archivelog.folder=archived
   hoodie.timeline.layout.version=1
   hoodie.table.version=5
   hoodie.table.metadata.partitions=files
   hoodie.table.recordkey.fields=order_no,profile_type,profile_no,order_type,profile_cat
   hoodie.database.name=temp_db
   hoodie.datasource.write.partitionpath.urlencode=false
   hoodie.table.keygenerator.class=org.apache.hudi.keygen.NonpartitionedKeyGenerator
   hoodie.table.name=ods_cis_corp_history_profile_hudi_t1_20231207
   hoodie.datasource.write.hive_style_partitioning=true
   hoodie.table.checksum=2702244832
   ```