[ 
https://issues.apache.org/jira/browse/HUDI-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yue Zhang updated HUDI-5588:
----------------------------
    Fix Version/s: 0.14.0
                       (was: 0.13.1)

> Fix Metadata table validator to deduce valid partitions when first commit 
> where partition was added is failed
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-5588
>                 URL: https://issues.apache.org/jira/browse/HUDI-5588
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: tests-ci
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Critical
>             Fix For: 0.14.0
>
>
> Metadata validation sometimes fails because of a bug in the test code. 
> FS-based listing shows 0 partitions, while MDT listing shows all 100 
> partitions. It's an issue with the validator code.
>  
> actual timeline:
> ls -ltr tbl1/hoodie_table/.hoodie/
> total 720
> drwxr-xr-x 2 nsb staff 64 Jan 17 18:45 archived
> drwxr-xr-x 4 nsb staff 128 Jan 17 18:45 metadata
> -rw-r--r-- 1 nsb staff 808 Jan 17 18:45 hoodie.properties
> -rw-r--r-- 1 nsb staff 1230 Jan 17 18:45 20230117214546000.rollback.requested
> -rw-r--r-- 1 nsb staff 0 Jan 17 18:45 20230117214546000.rollback.inflight
> -rw-r--r-- 1 nsb staff 1414 Jan 17 18:46 20230117214546000.rollback
> -rw-r--r-- 1 nsb staff 1230 Jan 17 18:47 20230117214701512.rollback.requested
> -rw-r--r-- 1 nsb staff 0 Jan 17 18:47 20230117214701512.rollback.inflight
> -rw-r--r-- 1 nsb staff 1414 Jan 17 18:47 20230117214701512.rollback
> -rw-r--r-- 1 nsb staff 15492 Jan 17 18:48 20230117214831503.rollback.requested
> -rw-r--r-- 1 nsb staff 0 Jan 17 18:48 20230117214831503.rollback.inflight
> -rw-r--r-- 1 nsb staff 0 Jan 17 18:48 20230117214848714.deltacommit.requested
> -rw-r--r-- 1 nsb staff 16359 Jan 17 18:48 20230117214831503.rollback
> -rw-r--r-- 1 nsb staff 69698 Jan 17 18:49 20230117214848714.deltacommit.inflight
> -rw-r--r-- 1 nsb staff 0 Jan 17 18:50 20230117215006714.deltacommit.requested
> -rw-r--r-- 1 nsb staff 94423 Jan 17 18:50 20230117214848714.deltacommit
> -rw-r--r-- 1 nsb staff 142198 Jan 17 18:50 20230117215006714.deltacommit.inflight
>  
>  
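> At least one commit succeeded: 20230117214848714.deltacommit.
>  
> However, the validator reads the commit time at which each partition was 
> created and treats the partition as valid only if that particular commit 
> succeeded. If the first commit that added the partition failed, the 
> partition is dropped from the FS listing even when a later commit wrote to 
> it successfully.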
> {code:java}
> List<String> allPartitionPathsFromFS =
>     FSUtils.getAllPartitionPaths(engineContext, basePath, false, cfg.assumeDatePartitioning);
> HoodieTimeline completedTimeline = metaClient.getActiveTimeline().filterCompletedInstants();
> // ignore partitions created by uncommitted ingestion.
> allPartitionPathsFromFS = allPartitionPathsFromFS.stream().parallel().filter(part -> {
>   HoodiePartitionMetadata hoodiePartitionMetadata =
>       new HoodiePartitionMetadata(metaClient.getFs(), FSUtils.getPartitionPath(basePath, part));
>   Option<String> instantOption = hoodiePartitionMetadata.readPartitionCreatedCommitTime();
>   if (instantOption.isPresent()) {
>     String instantTime = instantOption.get();
>     return completedTimeline.containsOrBeforeTimelineStarts(instantTime);
>   } else {
>     return false;
>   }
> }).collect(Collectors.toList());
> {code}
>  
> We need to fix this.
>  
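
To make the failure mode concrete, here is a minimal, self-contained sketch that models the check in the snippet above without any Hudi dependencies. Everything in it is an assumption for illustration: `PartitionInfo`, `writingInstants`, `validByAnyCompletedWrite`, the partition name `p1`, and the failed instant `20230117214500000` are hypothetical and do not come from the Hudi code base or from the timeline above. The completed timeline is reduced to a plain set of instant times, so the "before timeline starts" part of `containsOrBeforeTimelineStarts` is not modelled. The second predicate is one possible direction for a fix, not necessarily the one that was adopted for this issue.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

class PartitionValidatorSketch {

    /** Hypothetical stand-in for what the validator can learn about one partition. */
    static final class PartitionInfo {
        final String path;
        final Optional<String> createdInstant; // as read from the partition metadata
        final List<String> writingInstants;    // instants that wrote files here (assumed available)

        PartitionInfo(String path, Optional<String> createdInstant, List<String> writingInstants) {
            this.path = path;
            this.createdInstant = createdInstant;
            this.writingInstants = writingInstants;
        }
    }

    /** Mirrors the reported behaviour: only the creation instant is checked. */
    static boolean validByCreationOnly(PartitionInfo p, Set<String> completed) {
        return p.createdInstant.isPresent() && completed.contains(p.createdInstant.get());
    }

    /** Assumed alternative: also accept the partition if any completed instant wrote to it. */
    static boolean validByAnyCompletedWrite(PartitionInfo p, Set<String> completed) {
        return validByCreationOnly(p, completed)
            || p.writingInstants.stream().anyMatch(completed::contains);
    }

    static List<String> validPartitions(List<PartitionInfo> parts, Set<String> completed, boolean lenient) {
        return parts.stream()
            .filter(p -> lenient ? validByAnyCompletedWrite(p, completed) : validByCreationOnly(p, completed))
            .map(p -> p.path)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The one completed deltacommit from the timeline in the report.
        Set<String> completed = new HashSet<>(Collections.singletonList("20230117214848714"));
        // Hypothetical failed commit that first created the partition.
        String failedCreate = "20230117214500000";
        PartitionInfo p1 = new PartitionInfo(
            "p1", Optional.of(failedCreate), Arrays.asList(failedCreate, "20230117214848714"));
        List<PartitionInfo> parts = Collections.singletonList(p1);

        System.out.println(validPartitions(parts, completed, false)); // prints []
        System.out.println(validPartitions(parts, completed, true));  // prints [p1]
    }
}
```

With the creation-only check the partition is filtered out, which matches the "0 partitions from FS" symptom; with the lenient check it is kept because a completed instant wrote to it.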



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
