[jira] [Updated] (HBASE-29295) Optimize in-memory representation of mapreduce's TableSnapshotInputFormat's split objects
[ https://issues.apache.org/jira/browse/HBASE-29295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ujjawal Kumar updated HBASE-29295:
----------------------------------
    Description:
Similar to HBASE-24859, we have seen that client-side memory consumption increases sharply when an MR job reads via a snapshot.

The cause is the same as in HBASE-24859: the scan is embedded in TableSnapshotInputFormat's split object, which can blow up the client's memory usage for tables with a large number of regions.

The solution would be the same as in the other Jira: don't store the scan object in the split; instead, read it from the conf while initializing the record reader.

  was:
The same text, except that "regions" was mistyped as "reasons".

> Optimize in-memory representation of mapreduce's TableSnapshotInputFormat's split objects
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-29295
>                 URL: https://issues.apache.org/jira/browse/HBASE-29295
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce, snapshots
>    Affects Versions: 2.5.10
>            Reporter: Ujjawal Kumar
>            Assignee: Ujjawal Kumar
>            Priority: Major
>         Attachments: Screenshot 2025-05-08 at 6.26.15 PM.png

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
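The proposed change can be illustrated with a minimal plain-Java sketch. It is not the HBase patch: SplitSketch, HeavySplit, LightSplit, Reader, and the key "sketch.serialized.scan" are all hypothetical names, a Map stands in for the job conf, and character counts are only a rough proxy for memory. It assumes the split needs nothing from the scan beyond what the conf can supply.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration; none of these types are HBase or Hadoop classes.
final class SplitSketch {
    // Hypothetical configuration key, not an HBase constant.
    static final String SCAN_KEY = "sketch.serialized.scan";

    // Shape described in the issue: every split carries the serialized scan.
    static final class HeavySplit {
        final String regionName;
        final String serializedScan;

        HeavySplit(String regionName, String serializedScan) {
            this.regionName = regionName;
            this.serializedScan = serializedScan;
        }

        long approxChars() {
            return regionName.length() + serializedScan.length();
        }
    }

    // Proposed shape: the split only identifies its region.
    static final class LightSplit {
        final String regionName;

        LightSplit(String regionName) {
            this.regionName = regionName;
        }

        long approxChars() {
            return regionName.length();
        }
    }

    // The reader fetches the scan from the shared conf when it is initialized.
    static final class Reader {
        private String serializedScan;

        void initialize(LightSplit split, Map<String, String> conf) {
            serializedScan = conf.get(SCAN_KEY);
            if (serializedScan == null) {
                throw new IllegalStateException("no scan under " + SCAN_KEY);
            }
        }

        String serializedScan() {
            return serializedScan;
        }
    }

    // Characters held by all splits when each one embeds the scan.
    static long heavyTotal(int regions, String scan) {
        long total = 0;
        for (int i = 0; i < regions; i++) {
            total += new HeavySplit("region-" + i, scan).approxChars();
        }
        return total;
    }

    // Characters held when the scan is kept once, in the conf.
    static long lightTotal(int regions, String scan) {
        long total = scan.length();
        for (int i = 0; i < regions; i++) {
            total += new LightSplit("region-" + i).approxChars();
        }
        return total;
    }

    public static void main(String[] args) {
        String scan = new String(new char[2000]); // stand-in for a serialized scan
        Map<String, String> conf = new HashMap<>();
        conf.put(SCAN_KEY, scan);

        Reader reader = new Reader();
        reader.initialize(new LightSplit("region-0"), conf);

        System.out.println("embedded in splits: " + heavyTotal(10000, scan));
        System.out.println("kept in conf:       " + lightTotal(10000, scan));
    }
}
```

In this sketch the embedded variant grows with regions times scan size, while the conf-based variant pays for the scan once, which is the effect the issue is after.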
[jira] [Updated] (HBASE-29295) Optimize in-memory representation of mapreduce's TableSnapshotInputFormat's split objects
[ https://issues.apache.org/jira/browse/HBASE-29295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ujjawal Kumar updated HBASE-29295:
----------------------------------
    Affects Version/s: 2.5.10

> Optimize in-memory representation of mapreduce's TableSnapshotInputFormat's split objects
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-29295
>                 URL: https://issues.apache.org/jira/browse/HBASE-29295
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce, snapshots
>    Affects Versions: 2.5.10
>            Reporter: Ujjawal Kumar
>            Assignee: Ujjawal Kumar
>            Priority: Major
>         Attachments: Screenshot 2025-05-08 at 6.26.15 PM.png
[jira] [Updated] (HBASE-29295) Optimize in-memory representation of mapreduce's TableSnapshotInputFormat's split objects
[ https://issues.apache.org/jira/browse/HBASE-29295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ujjawal Kumar updated HBASE-29295:
----------------------------------
    Attachment: Screenshot 2025-05-08 at 6.26.15 PM.png

> Optimize in-memory representation of mapreduce's TableSnapshotInputFormat's split objects
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-29295
>                 URL: https://issues.apache.org/jira/browse/HBASE-29295
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce, snapshots
>            Reporter: Ujjawal Kumar
>            Priority: Major
>         Attachments: Screenshot 2025-05-08 at 6.26.15 PM.png
