[ https://issues.apache.org/jira/browse/HBASE-29295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ujjawal Kumar updated HBASE-29295:
----------------------------------
    Attachment: Screenshot 2025-05-08 at 6.26.15 PM.png

> Optimize in-memory representation of mapreduce's TableSnapshotInputFormat's split objects
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-29295
>                 URL: https://issues.apache.org/jira/browse/HBASE-29295
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce, snapshots
>            Reporter: Ujjawal Kumar
>            Priority: Major
>         Attachments: Screenshot 2025-05-08 at 6.26.15 PM.png
>
>
> Similar to HBASE-24859, we have seen that client-side memory consumption
> increases significantly when an MR job reads via a snapshot.
> The cause is the same as in HBASE-24859: the scan is embedded in each
> TableSnapshotInputFormat split object, which can blow up the client's
> memory usage for tables with a large number of regions.
> The solution is the same as in the other Jira: don't store the scan object
> in the split; instead, read it from the conf when initializing the record
> reader.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
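
For illustration only, here is a minimal plain-Java sketch of the idea described in the issue. It does not use the HBase or Hadoop APIs. The class SplitSizeSketch, the key "sketch.scan", and all method names are hypothetical stand-ins, and the scan is modeled as a plain string. It compares the payload held by per-region splits when each one embeds the scan against splits that carry only the region name, with the reader falling back to a shared conf map.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitSizeSketch {
    // Hypothetical conf key; not a real HBase configuration property.
    static final String SCAN_KEY = "sketch.scan";

    // Simplified stand-in for an input split: one per region.
    static final class Split {
        final String regionName;
        final String scan; // null when the scan is kept only in the conf

        Split(String regionName, String scan) {
            this.regionName = regionName;
            this.scan = scan;
        }

        // Bytes this split would carry if written out as UTF-8.
        int payloadBytes() {
            int n = regionName.getBytes(StandardCharsets.UTF_8).length;
            if (scan != null) {
                n += scan.getBytes(StandardCharsets.UTF_8).length;
            }
            return n;
        }
    }

    // Pattern described in the issue: every split embeds the scan.
    static List<Split> splitsWithEmbeddedScan(int regions, String scan) {
        List<Split> out = new ArrayList<>();
        for (int i = 0; i < regions; i++) {
            out.add(new Split("region-" + i, scan));
        }
        return out;
    }

    // Proposed pattern: splits carry only the region name.
    static List<Split> splitsWithoutScan(int regions) {
        List<Split> out = new ArrayList<>();
        for (int i = 0; i < regions; i++) {
            out.add(new Split("region-" + i, null));
        }
        return out;
    }

    static long totalPayloadBytes(List<Split> splits) {
        long total = 0;
        for (Split s : splits) {
            total += s.payloadBytes();
        }
        return total;
    }

    // What a record reader would do at initialization: use the split's scan
    // if present, otherwise read the single shared copy from the conf.
    static String scanForReader(Split split, Map<String, String> conf) {
        return split.scan != null ? split.scan : conf.get(SCAN_KEY);
    }

    public static void main(String[] args) {
        String scan = "x".repeat(2048); // pretend serialized scan
        Map<String, String> conf = new HashMap<>();
        conf.put(SCAN_KEY, scan);

        int regions = 10_000;
        System.out.println("embedded: " + totalPayloadBytes(splitsWithEmbeddedScan(regions, scan)));
        System.out.println("conf-based: " + totalPayloadBytes(splitsWithoutScan(regions)));
    }
}
```

In this model the embedded-scan total grows with regions times scan size, while the conf-based total grows only with the region names, because the scan is stored once in the conf.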