[ 
https://issues.apache.org/jira/browse/HUDI-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18005225#comment-18005225
 ] 

sivabalan narayanan edited comment on HUDI-9590 at 7/14/25 7:09 PM:
--------------------------------------------------------------------

So, the core is intact. We just need to check the writer, the reader, and how
the configs are enabled.

 

We have two configs for this:

1. "hoodie.optimized.log.blocks.scan.enable" for the data table. Default is
false.

2. "hoodie.metadata.optimized.log.blocks.scan.enable" for the metadata table.
Default is false.
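
For reference, a minimal sketch of setting both flags explicitly on the writer
side. Only the two config keys come from this comment; the class and method
names are made up for illustration, and java.util.Properties is just a
stand-in for however the writer options are actually passed.

```java
import java.util.Properties;

// Illustrative sketch only: this class is hypothetical and not part of Hudi.
final class OptimizedScanProps {
    // Config keys exactly as quoted in the comment above.
    static final String DATA_TABLE_KEY =
        "hoodie.optimized.log.blocks.scan.enable";
    static final String METADATA_TABLE_KEY =
        "hoodie.metadata.optimized.log.blocks.scan.enable";

    // Builds a Properties object with both flags set explicitly,
    // instead of relying on the default (false) for either table.
    static Properties withOptimizedScan(boolean dataTable, boolean metadataTable) {
        Properties props = new Properties();
        props.setProperty(DATA_TABLE_KEY, Boolean.toString(dataTable));
        props.setProperty(METADATA_TABLE_KEY, Boolean.toString(metadataTable));
        return props;
    }

    public static void main(String[] args) {
        // Enable for the data table only; keep the metadata table at false.
        Properties props = withOptimizedScan(true, false);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```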

 

Let's go through the old way of reading the logs
(AbstractHoodieLogRecordScanner):

1 -> Only the compactor has this config wired in. For readers, I only see
RealtimeCompactedRecordReader fetching the config from jobConf, if set.
So, for readers, we already have a gap.

2 -> Enabled only on the writer side. I don't think we ever wired this in for
readers properly, and I don't see any table-level config. So, we can
auto-detect this for readers as well.
Here also, readers have a gap in enabling it.
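
To make the reader-side gap concrete, here is a rough sketch of the kind of
lookup a reader would need: read the flag from the job configuration if it is
set, otherwise fall back to the default of false. A plain Map stands in for
jobConf, and the class and method names are hypothetical; this is not the
actual RealtimeCompactedRecordReader code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a plain Map stands in for the job configuration,
// and this helper class is hypothetical (not part of Hudi).
final class ReaderScanFlagLookup {
    // Data table key as quoted in the comment above.
    static final String DATA_TABLE_KEY =
        "hoodie.optimized.log.blocks.scan.enable";
    // Default stated in the comment above.
    static final boolean DEFAULT_ENABLED = false;

    // Returns the configured value if present, otherwise the default.
    static boolean isOptimizedScanEnabled(Map<String, String> conf) {
        String raw = conf.get(DATA_TABLE_KEY);
        return raw == null ? DEFAULT_ENABLED : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Nothing set: the reader falls back to the default.
        System.out.println(isOptimizedScanEnabled(conf));
        // Flag set explicitly in the job configuration.
        conf.put(DATA_TABLE_KEY, "true");
        System.out.println(isOptimizedScanEnabled(conf));
    }
}
```

With a lookup like this, a reader that never receives the key behaves as if
the feature were off, which is the gap described above.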

 

Now, let's go through the FG reader way of reading log files.

Writers will be the same as above. From a reader standpoint, I don't see us
wiring "enableOptimizedLogBlocksScan" anywhere
(HoodieMergedLogRecordReader.withOptimizedLogBlocksScan()).
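
As a sketch of what the missing wiring might look like: the read path would
resolve the flag from its configuration and hand it to the reader's builder.
SketchLogReader below is a made-up stand-in, not HoodieMergedLogRecordReader,
and the boolean parameter on withOptimizedLogBlocksScan is an assumption for
illustration only.

```java
import java.util.Map;

// Illustrative sketch only: SketchLogReader is a hypothetical stand-in and
// does not mirror the real HoodieMergedLogRecordReader API.
final class SketchLogReader {
    private final boolean optimizedLogBlocksScan;

    private SketchLogReader(Builder builder) {
        this.optimizedLogBlocksScan = builder.optimizedLogBlocksScan;
    }

    boolean usesOptimizedLogBlocksScan() {
        return optimizedLogBlocksScan;
    }

    static Builder newBuilder() {
        return new Builder();
    }

    static final class Builder {
        // Starts at false, matching the config default quoted above.
        private boolean optimizedLogBlocksScan = false;

        // Assumed shape of the builder hook; the real signature may differ.
        Builder withOptimizedLogBlocksScan(boolean enabled) {
            this.optimizedLogBlocksScan = enabled;
            return this;
        }

        SketchLogReader build() {
            return new SketchLogReader(this);
        }
    }

    // The step that appears to be missing on the read path: resolve the
    // flag from the configuration and pass it through to the builder.
    static SketchLogReader fromConf(Map<String, String> conf) {
        boolean enabled = Boolean.parseBoolean(
            conf.getOrDefault("hoodie.optimized.log.blocks.scan.enable", "false"));
        return newBuilder().withOptimizedLogBlocksScan(enabled).build();
    }

    public static void main(String[] args) {
        SketchLogReader reader =
            fromConf(Map.of("hoodie.optimized.log.blocks.scan.enable", "true"));
        System.out.println(reader.usesOptimizedLogBlocksScan());
    }
}
```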

 

So, readers in general need to be fixed properly across both the old and new
code paths. I remember Uber is using LogCompaction within their env, so they
will likely expect feature support for 0.x tables on this.


> OptimizedLogBlockScan support w/ FG reader
> ------------------------------------------
>
>                 Key: HUDI-9590
>                 URL: https://issues.apache.org/jira/browse/HUDI-9590
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: reader-core, writer-core
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>
> We have compacted log block support in the older version of the Log Record
> Reader. We need to bring the FG reader to parity as well.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
