[ 
https://issues.apache.org/jira/browse/HUDI-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-3301:
----------------------------
    Labels: HUDI-bug  (was: )

> MergedLogRecordReader inline reading should be stateless and thread safe
> ------------------------------------------------------------------------
>
>                 Key: HUDI-3301
>                 URL: https://issues.apache.org/jira/browse/HUDI-3301
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: metadata
>            Reporter: Manoj Govindassamy
>            Assignee: Ethan Guo
>            Priority: Blocker
>              Labels: HUDI-bug
>             Fix For: 0.11.0
>
>
> Metadata table inline reading (enable.full.scan.log.files = false) currently 
> alters instance member fields and is not thread safe.
>  
> When inline reading is enabled, HoodieMetadataMergedLogRecordReader 
> does not do a full read of the log and base files and does not fill in the 
> ExternalSpillableMap records cache. Each getRecordsByKeys() call therefore 
> re-reads the log and base files, by design. The issue is that this read 
> alters the instance members, even though the records it fills in are 
> relevant only to that request. Any concurrent getRecordsByKeys() call also 
> modifies these members, which leads to an NPE.
>  
> To avoid this, a temporary fix that makes getRecordsByKeys() a synchronized 
> method has been pushed to master. But this fix does not cover all use cases. 
> We need to make the whole class stateless and thread safe for inline reading.
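
The stateless approach asked for in the description can be illustrated with a
minimal sketch. This is not Hudi code: StatelessKeyLookup, its logRecords field,
and the String key/value types are hypothetical stand-ins chosen for
illustration, and the real reader works on log and base files, not an
in-memory map. The only point shown is that per-request results are kept in a
local variable instead of an instance field, so concurrent callers cannot
overwrite each other's state.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: these names are not Hudi classes or APIs.
final class StatelessKeyLookup {
    // Read-only after construction, so it is safe to share between threads.
    private final Map<String, String> logRecords;

    StatelessKeyLookup(Map<String, String> logRecords) {
        // Defensive copy: later changes to the caller's map are not visible here.
        this.logRecords = Collections.unmodifiableMap(new HashMap<>(logRecords));
    }

    // All per-request state lives in the local map "found". No instance field
    // is written during a lookup, so no synchronization is needed.
    Map<String, String> getRecordsByKeys(List<String> keys) {
        Map<String, String> found = new HashMap<>();
        for (String key : keys) {
            String value = logRecords.get(key);
            if (value != null) {
                found.put(key, value);
            }
        }
        return found;
    }
}
```

Compared with a synchronized method, a design along these lines would let
lookups run in parallel, at the cost of allocating a fresh result map per call.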



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
