[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-11646:
-----------------------------------
    Attachment: hbase-11646-v10.patch

I've committed v10 -- it has the same minor tweak v8 had. Thanks Jingcheng, Ram and Anoop.

Handle the MOB in compaction
    Key: HBASE-11646
    URL: https://issues.apache.org/jira/browse/HBASE-11646
    Project: HBase
    Issue Type: Sub-task
    Components: Compaction
    Affects Versions: hbase-11339
    Reporter: Jingcheng Du
    Assignee: Jingcheng Du
    Fix For: hbase-11339
    Attachments: HBASE-11646-V2.diff, HBASE-11646-V3.diff, HBASE-11646-V4.diff, HBASE-11646-V5.diff, HBASE-11646-V6.diff, HBASE-11646-V7.diff, HBASE-11646-V9.diff, HBASE-11646.diff, hbase-11646-v10.patch, hbase-11646-v8.patch

In the updated MOB design, however, admins can set CF-level thresholds that force cell values larger than the threshold to use the MOB write path instead of the traditional path. There are two cases where MOBs need to interact with this threshold:

1) How do we handle the case when the threshold size is changed?
2) Today, you can bulk load hfiles that contain MOBs. These cells will work as normal inside HBase. Unfortunately, the cells with MOBs in them will never benefit from the MOB write path.

The proposal here is to modify compaction in MOB-enabled CFs so that the threshold value is honored during compactions. This handles case #1: values that should be moved out of the normal hfiles get 'compacted' into refs and MOB hfiles, and values that should be pulled back into the CF get dereferenced and written out whole in the compaction. For case #2, we can maintain the same behavior, and compaction would move the data into the MOB write path/lifecycle.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
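
The compaction rule proposed in the issue description can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical and do not come from the HBASE-11646 patch, and treating "over the threshold" as a strict greater-than comparison is an assumption as well.

```java
/** Hypothetical sketch; none of these names come from the HBASE-11646 patch. */
final class MobCompactionSketch {

    /** What compaction does with one cell in a MOB-enabled column family. */
    enum Action { KEEP_INLINE, MOVE_TO_MOB_FILE, KEEP_AS_REF, PULL_BACK_INLINE }

    /**
     * @param isMobRef    true if the cell currently holds a reference to a MOB file
     * @param valueLength length in bytes of the real value (not of the reference)
     * @param threshold   the CF-level MOB threshold in bytes
     */
    static Action decide(boolean isMobRef, int valueLength, long threshold) {
        // Assumption: a value counts as a MOB only when strictly larger than the threshold.
        boolean shouldBeMob = valueLength > threshold;
        if (isMobRef) {
            // The value already lives in a MOB file: keep the ref, or dereference it
            // and write the whole value back into the normal store file.
            return shouldBeMob ? Action.KEEP_AS_REF : Action.PULL_BACK_INLINE;
        }
        // The value lives in a normal hfile (for example after a bulk load):
        // leave it there, or move it to a MOB file and write a ref instead.
        return shouldBeMob ? Action.MOVE_TO_MOB_FILE : Action.KEEP_INLINE;
    }

    public static void main(String[] args) {
        // A 200-byte inline value with a 100-byte threshold.
        System.out.println(decide(false, 200, 100));
        // The threshold was raised to 500, so an existing ref to that value is undone.
        System.out.println(decide(true, 200, 500));
    }
}
```

In this sketch, case #1 (a changed threshold) corresponds to the MOVE_TO_MOB_FILE and PULL_BACK_INLINE outcomes, and case #2 (bulk-loaded hfiles) is simply an inline cell whose value is over the threshold.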
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingcheng Du updated HBASE-11646:
---------------------------------
    Attachment: HBASE-11646-V7.diff

Update the patch according to Anoop's comments in RB.
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-11646:
-----------------------------------
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved (was: Patch Available)
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-11646:
-----------------------------------
    Attachment: hbase-11646-v8.patch

Attached the v8 version I tried to commit.
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingcheng Du updated HBASE-11646:
---------------------------------
    Attachment: HBASE-11646-V9.diff

Update the patch (V9) according to Jon and Ram's comments in RB.
1. !Bytes.equals(emptyBytes, value) -> valueLength != 0
2. Add comments to describe why we have to cast the Store to HMobStore in DefaultMobCompactor.
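
For readers unfamiliar with the first V9 change, the two checks can be compared in a small stand-alone sketch. It uses java.util.Arrays in place of HBase's Bytes utility so that it runs without HBase on the classpath; the class and method names are hypothetical, and the exact place in the patch where the check occurs is not shown in this thread.

```java
import java.util.Arrays;

/** Hypothetical sketch comparing the two "is the value non-empty?" checks. */
final class EmptyValueCheckSketch {

    private static final byte[] EMPTY = new byte[0];

    /** Old style: materialize the value and compare it with an empty array. */
    static boolean hasValueByCompare(byte[] backing, int offset, int length) {
        byte[] value = Arrays.copyOfRange(backing, offset, offset + length);
        return !Arrays.equals(EMPTY, value);
    }

    /** New style: the length alone answers the question, with no copy or compare. */
    static boolean hasValueByLength(int length) {
        return length != 0;
    }

    public static void main(String[] args) {
        byte[] backing = {1, 2, 3, 4};
        // Both checks agree for an empty slice and for a non-empty slice.
        System.out.println(hasValueByCompare(backing, 1, 0) == hasValueByLength(0));
        System.out.println(hasValueByCompare(backing, 1, 3) == hasValueByLength(3));
    }
}
```

The two forms give the same answer; the length check simply avoids building and comparing a byte array for every cell.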
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingcheng Du updated HBASE-11646:
---------------------------------
    Attachment: HBASE-11646-V6.diff

Upload the latest patch (V6). Refine the code and comments, and add methods to parse the value of a mob ref cell.
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-11646:
-----------------------------------
    Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-11646:
-----------------------------------
    Affects Version/s: hbase-11339
    Fix Version/s: hbase-11339
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingcheng Du updated HBASE-11646:
---------------------------------
    Attachment: HBASE-11646-V5.diff

Update the patch after the value size stored in a reference cell of a mob-enabled column was changed from a long to an int.
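
To make the long-to-int change concrete, here is a minimal sketch of how a reference cell's value might carry the size of the real value. The layout used (a 4-byte big-endian int followed by the MOB file name) is an assumption made for illustration, as are the class name, the method names, and the example file name; the thread itself only states that the size field became an int.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a mob ref value: [int real value length][mob file name]. */
final class MobRefValueSketch {

    /** Builds the ref value; the size prefix is an int (4 bytes), not a long (8 bytes). */
    static byte[] encode(int realValueLength, String mobFileName) {
        byte[] name = mobFileName.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Integer.BYTES + name.length)
                .putInt(realValueLength)
                .put(name)
                .array();
    }

    /** Reads back the length of the real value from the first 4 bytes. */
    static int parseValueLength(byte[] refValue) {
        return ByteBuffer.wrap(refValue).getInt();
    }

    /** Reads back the file name that follows the size prefix. */
    static String parseFileName(byte[] refValue) {
        return new String(refValue, Integer.BYTES,
                refValue.length - Integer.BYTES, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ref = encode(1024, "example-mob-file");
        System.out.println(parseValueLength(ref) + " bytes in " + parseFileName(ref));
    }
}
```

With a size prefix like this, a compaction can compare the real value length against the threshold without opening the MOB file.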
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingcheng Du updated HBASE-11646:
---------------------------------
    Attachment: HBASE-11646-V4.diff

Update the patch to refine the class import in the Compactor.
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingcheng Du updated HBASE-11646:
---------------------------------
    Attachment: HBASE-11646-V2.diff

Refine the patch according to Jon, Ram and Anoop's comments and improvements. Thanks a lot. The latest patch is uploaded; please help review and comment. Thanks.
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingcheng Du updated HBASE-11646:
---------------------------------
    Attachment: HBASE-11646-V3.diff

Refactor the code after changing the mob flag and threshold in the hcd from string to byte[].
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingcheng Du updated HBASE-11646:
---------------------------------
    Attachment: HBASE-11646.diff

The threshold might be changed. After that, some cells in the mob files need to be added back to the HBase store files, and some cells in the HBase store files need to be added to the mob files. Both are done during HBase compaction, and this patch implements them.
[jira] [Updated] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-11646:
-----------------------------------
    Description:
In the updated MOB design, however, admins can set CF-level thresholds that force cell values larger than the threshold to use the MOB write path instead of the traditional path. There are two cases where MOBs need to interact with this threshold:

1) How do we handle the case when the threshold size is changed?
2) Today, you can bulk load hfiles that contain MOBs. These cells will work as normal inside HBase. Unfortunately, the cells with MOBs in them will never benefit from the MOB write path.

The proposal here is to modify compaction in MOB-enabled CFs so that the threshold value is honored during compactions. This handles case #1: values that should be moved out of the normal hfiles get 'compacted' into refs and MOB hfiles, and values that should be pulled back into the CF get dereferenced and written out whole in the compaction. For case #2, we can maintain the same behavior, and compaction would move the data into the MOB write path/lifecycle.

    was:
MOB cells loaded by bulk load are saved in HBase. We need to handle them in HBase compaction to write them to the MOB files.