[jira] [Created] (HBASE-29295) Optimize in-memory representation of mapreduce's TableSnapshotInputFormat's split objects

Ujjawal Kumar (Jira) Thu, 08 May 2025 06:17:42 -0700

Ujjawal Kumar created HBASE-29295:
-------------------------------------

             Summary: Optimize in-memory representation of mapreduce's 
TableSnapshotInputFormat's split objects
                 Key: HBASE-29295
                 URL: https://issues.apache.org/jira/browse/HBASE-29295
             Project: HBase
          Issue Type: Improvement
          Components: mapreduce, snapshots
            Reporter: Ujjawal Kumar



Similar to HBASE-24859, we have seen client-side memory consumption increase 
significantly when an MR job performs reads via a snapshot.

The cause is the same as the one described in HBASE-24859: the scan is 
embedded within each TableSnapshotInputFormat split object, which can explode 
the client's memory usage for tables with a large number of regions.

The solution would be the same as in the other Jira: don't store the scan 
object in the split; instead, read it from the conf while initializing the 
record reader.
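
To illustrate the idea, here is a rough stdlib-only sketch, not the actual 
HBase code. All names in it (SplitSketch, HeavySplit, LightSplit, planSplits, 
scanFor) are hypothetical, the job conf is modeled as a plain Map, and the 
conf key string is only an assumed placeholder. It contrasts a split that 
carries its own scan copy with a split that keeps only region identity and 
lets the record reader look the scan up from the conf.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types exist in HBase.
final class SplitSketch {
    // Assumed conf key under which the serialized scan is stored once per job.
    static final String SCAN_KEY = "hbase.mapreduce.scan";

    // Heavy variant: every split holds its own copy of the serialized scan.
    static final class HeavySplit {
        final String regionName;
        final String serializedScan;

        HeavySplit(String regionName, String serializedScan) {
            this.regionName = regionName;
            this.serializedScan = serializedScan;
        }
    }

    // Light variant: the split only identifies the region it covers.
    static final class LightSplit {
        final String regionName;

        LightSplit(String regionName) {
            this.regionName = regionName;
        }
    }

    // Plans one light split per region; no scan is attached to any of them.
    static List<LightSplit> planSplits(int regionCount) {
        List<LightSplit> splits = new ArrayList<>(regionCount);
        for (int i = 0; i < regionCount; i++) {
            splits.add(new LightSplit("region-" + i));
        }
        return splits;
    }

    // What a record reader would do on initialization: fetch the shared scan
    // from the conf instead of reading it out of the split.
    static String scanFor(LightSplit split, Map<String, String> conf) {
        String scan = conf.get(SCAN_KEY);
        if (scan == null) {
            throw new IllegalStateException(
                "no scan in conf for split " + split.regionName);
        }
        return scan;
    }

    // Scan characters held client-side when each split owns a copy.
    static long heavyScanChars(int regionCount, String scan) {
        return (long) regionCount * scan.length();
    }

    public static void main(String[] args) {
        String scan = new String(new char[2048]); // stand-in for a serialized scan
        int regionCount = 50_000;

        Map<String, String> conf = new HashMap<>();
        conf.put(SCAN_KEY, scan); // stored once for the whole job

        List<LightSplit> splits = planSplits(regionCount);
        System.out.println("scan chars held with per-split copies: "
            + heavyScanChars(regionCount, scan));
        System.out.println("scan chars held with conf lookup: "
            + scanFor(splits.get(0), conf).length());
    }
}
```

With per-split copies the scan payload grows linearly with the region count, 
while with the conf lookup it stays at a single copy regardless of how many 
splits are planned.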



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
