[ 
https://issues.apache.org/jira/browse/HUDI-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398447#comment-17398447
 ] 

Prashant Wason commented on HUDI-2303:
--------------------------------------

The root cause is that HoodieMetadataFileSystemView does not override sync(), 
which is where it should refresh the metadata reader to reflect the new state 
of the dataset. As a result, the metadata reader opened by the TimelineServer 
is not refreshed, so it returns a stale listing that does not contain the 
latest log file, and the compaction misses the latest log block.
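
To illustrate the failure mode, here is a minimal, self-contained sketch. All 
class names (Dataset, MetadataReader, FileSystemView, StaleMetadataView, 
RefreshingMetadataView) are hypothetical stand-ins made up for this example, 
not Hudi's actual classes or the code from the fix. It only shows how a view 
that caches a reader and does not override sync() keeps serving an old listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the dataset: the authoritative list of files.
class Dataset {
    private final List<String> files = new ArrayList<>();

    void addFile(String name) { files.add(name); }

    // Returns a copy so callers hold a point-in-time snapshot.
    List<String> snapshot() { return new ArrayList<>(files); }
}

// Hypothetical reader that captures the listing once, when it is opened.
class MetadataReader {
    private final List<String> listing;

    MetadataReader(Dataset dataset) { this.listing = dataset.snapshot(); }

    List<String> listFiles() { return listing; }
}

// Hypothetical base view: sync() is the hook subclasses use to refresh state.
class FileSystemView {
    protected final Dataset dataset;

    FileSystemView(Dataset dataset) { this.dataset = dataset; }

    void sync() { /* the base view keeps no cached state in this sketch */ }

    List<String> listFiles() { return dataset.snapshot(); }
}

// Mirrors the bug: listings come from a cached reader, but sync() is not
// overridden, so the reader is never reopened.
class StaleMetadataView extends FileSystemView {
    protected MetadataReader reader;

    StaleMetadataView(Dataset dataset) {
        super(dataset);
        this.reader = new MetadataReader(dataset);
    }

    @Override
    List<String> listFiles() { return reader.listFiles(); }
}

// Mirrors the idea of the fix: sync() reopens the reader.
class RefreshingMetadataView extends StaleMetadataView {
    RefreshingMetadataView(Dataset dataset) { super(dataset); }

    @Override
    void sync() {
        super.sync();
        reader = new MetadataReader(dataset);
    }
}

class StaleViewDemo {
    public static void main(String[] args) {
        Dataset dataset = new Dataset();
        dataset.addFile("base.parquet");
        dataset.addFile("log.1");

        StaleMetadataView stale = new StaleMetadataView(dataset);
        RefreshingMetadataView refreshing = new RefreshingMetadataView(dataset);

        // A new log file is written after both views were opened.
        dataset.addFile("log.2");
        stale.sync();
        refreshing.sync();

        System.out.println(stale.listFiles());      // [base.parquet, log.1]
        System.out.println(refreshing.listFiles()); // [base.parquet, log.1, log.2]
    }
}
```

In this sketch, a compaction planned from the stale view would never see 
log.2, which is the same shape as the missing log block described above.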

This fix is already covered in [https://github.com/apache/hudi/pull/3210], 
which is about to be committed, so I won't be raising a new PR for it. 

Let's verify this once 3210 is merged.

> TestMereIntoLogOnlyTable with metadata enabled surfaces likely bug
> ------------------------------------------------------------------
>
>                 Key: HUDI-2303
>                 URL: https://issues.apache.org/jira/browse/HUDI-2303
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core
>            Reporter: Udit Mehrotra
>            Assignee: Prashant Wason
>            Priority: Major
>
> While enabling Metadata as part of 
> [https://github.com/apache/hudi/pull/3411/], one of the tests that fails is 
> *TestMereIntoLogOnlyTable*.
> Looking into it, I found that after the final *Merge* command an inline 
> compaction is triggered. The parquet file produced by the compaction is 
> missing the data from the latest log file, written right before the 
> compaction.
> I think this might be because metadata returns an incorrect file list for 
> compaction, one that omits the latest log file.
> cc [~pwason] [~vinoth]
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
