[ 
https://issues.apache.org/jira/browse/HUDI-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5464:
-----------------------------
    Sprint: 0.13.0 Final Sprint, 0.13.0 Final Sprint 2, 0.13.0 Final Sprint 3, 
Sprint 2023-01-31, Sprint 2023-02-14  (was: 0.13.0 Final Sprint, 0.13.0 Final 
Sprint 2, 0.13.0 Final Sprint 3, Sprint 2023-01-31)

> Fix instantiation of a new partition in MDT re-using the same instant time as 
> a regular commit
> ----------------------------------------------------------------------------------------------
>
>                 Key: HUDI-5464
>                 URL: https://issues.apache.org/jira/browse/HUDI-5464
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>            Reporter: sivabalan narayanan
>            Assignee: Raymond Xu
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.14.0
>
>
> We re-use the instant time of the commit being applied to the MDT when 
> instantiating a new partition in the MDT. This needs to be fixed. 
>  
> For example, let's say we have 10 commits with FILES already enabled. 
> For C11, we are enabling col-stats. 
> After the data table work is done, when we enter metadata writer 
> instantiation, we deduce that col-stats has to be instantiated and then 
> instantiate it using DC11. 
> In the MDT timeline, we see dc11.req, dc11.inflight and dc11.complete. We 
> then go ahead and apply the actual C11 from the DT to the MDT (dc11.inflight 
> and dc11.complete are updated). Here, we overwrite the same DC11 with records 
> pertaining to C11, which is buggy. We definitely need to fix this. 
> We can add a suffix to C11 (say C11_003 or C11_001), as we do for compaction 
> and clean in the MDT, so that any additional operation in the MDT has a 
> different commit time format. Everything else should match the DT one-to-one. 
>  
>  
> Impact:
> We are overriding the same DC for two purposes, which is bad. If there is a 
> crash after initializing col-stats and before applying the actual C11 (in the 
> above context), we might mistakenly roll back the col-stats initialization, 
> while the table config could still say that col-stats is fully ready to be 
> served. But while reading the MDT, we may not read DC11 since it's a failed 
> commit. 
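
The suffix idea in the description can be illustrated with a small, 
self-contained Python sketch. It is not Hudi code: MdtTimeline, 
mdt_instant_for_partition_init, the suffix value "010" and the sample instant 
time are all hypothetical names and values chosen for illustration. It only 
shows that giving the partition initialization its own suffixed instant keeps 
it from being overwritten when the regular commit is applied.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical suffix for partition initialization in the MDT.
# The real value (if any) is not stated in this issue.
PARTITION_INIT_SUFFIX = "010"


def mdt_instant_for_partition_init(dt_instant: str) -> str:
    """Derive a distinct MDT instant for initializing a new partition."""
    return dt_instant + PARTITION_INIT_SUFFIX


@dataclass
class MdtTimeline:
    """Toy stand-in for the MDT timeline: instant time -> what was written."""
    completed: Dict[str, str] = field(default_factory=dict)

    def commit(self, instant: str, payload: str) -> bool:
        """Record a delta commit; return True if it overwrote an existing one."""
        overwrote = instant in self.completed
        self.completed[instant] = payload
        return overwrote


dt_instant = "20230131101500000"  # made-up instant time standing in for C11

# Current behaviour: initialization and the regular commit share one instant.
buggy = MdtTimeline()
buggy.commit(dt_instant, "col-stats initialization")
buggy_overwrote = buggy.commit(dt_instant, "records for C11")  # True

# Proposed behaviour: initialization gets a suffixed instant of its own.
fixed = MdtTimeline()
fixed.commit(mdt_instant_for_partition_init(dt_instant), "col-stats initialization")
fixed_overwrote = fixed.commit(dt_instant, "records for C11")  # False
```

With separate instants, rolling back a failed C11 in this toy model would 
touch only the entry keyed by the plain instant time and leave the 
initialization entry in place.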



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
