[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Kyle Purtell updated HBASE-22749: Fix Version/s: 2.5.0 > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Fix For: 3.0.0-alpha-1, 2.5.0 > > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, > HBASE-22749_nightly_Unit_Test_Results.csv, > HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-22749: Fix Version/s: 3.0.0 Release Note: MOB compaction is now handled in-line with per-region compaction on region servers - regions with mob data store per-hfile metadata about which mob hfiles are referenced - admin requested major compaction will also rewrite MOB files; periodic RS initiated major compaction will not - periodically a chore in the master will initiate a major compaction that will rewrite MOB values to ensure it happens. controlled by 'hbase.mob.compaction.chore.period'. default is weekly - control how many RS the chore requests major compaction on in parallel with 'hbase.mob.major.compaction.region.batch.size'. default is as parallel as possible. - periodic chore in master will scan backing hfiles from regions to get the set of referenced mob hfiles and archive those that are no longer referenced. control period with 'hbase.master.mob.cleaner.period' - Optionally, RS that are compacting mob files can limit write amplification by not rewriting values from mob hfiles over a certain size limit. opt-in by setting 'hbase.mob.compaction.type' to 'optimized'. control threshold by 'hbase.mob.compactions.max.file.size'. default is 1GiB - Should smoothly integrate with existing MOB users via rolling upgrade. will delay old MOB file cleanup until per-region compaction has managed to compact each region at least once so that used mob hfile metadata can be gathered. This improvement obviates the dataloss in HBASE-22075. Resolution: Fixed Status: Resolved (was: Patch Available) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, > HBASE-22749_nightly_Unit_Test_Results.csv, > HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-22749: Attachment: HBASE-22749_nightly_Unit_Test_Results.csv > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, > HBASE-22749_nightly_Unit_Test_Results.csv, > HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-22749: Attachment: HBASE-22749_nightly_unit_test_analyzer.pdf > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, > HBASE-22749_nightly_Unit_Test_Results.csv, > HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Status: Patch Available (was: Open) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBASE-22749-master-v4.patch > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Status: Open (was: Patch Available) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Status: Patch Available (was: Open) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBASE-22749-master-v3.patch > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBASE-22749-master-v3.patch, HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Status: Open (was: Patch Available) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: (was: HBase-MOB-2.0-v2.1.pdf) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: (was: HBase-MOB-2.0-v2.2.pdf) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: (was: HBase-MOB-2.0-v1.pdf) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: (was: HBase-MOB-2.0-v2.pdf) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: (was: HBase-MOB-2.0-v2.3.pdf) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBase-MOB-2.0-v3.0.pdf > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, > HBase-MOB-2.0-v2.3.pdf, HBase-MOB-2.0-v2.pdf, HBase-MOB-2.0-v3.0.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBase-MOB-2.0-v2.3.pdf > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, > HBase-MOB-2.0-v2.3.pdf, HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBASE-22749-master-v2.patch > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, > HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Status: Patch Available (was: Open) Code cleanup. unit tests fixes. > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, > HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, > HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBASE-22749-master-v1.patch > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, > HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: (was: HBASE-22749-branch-2.2-v3.patch) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v4.patch, > HBASE-22749-master-v1.patch, HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, > HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBASE-22749-branch-2.2-v4.patch > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v3.patch, > HBASE-22749-branch-2.2-v4.patch, HBase-MOB-2.0-v1.pdf, > HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBASE-22749-branch-2.2-v3.patch > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBASE-22749-branch-2.2-v3.patch, HBase-MOB-2.0-v1.pdf, > HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Attachment: HBase-MOB-2.0-v2.2.pdf > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, > HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (HBASE-22749) Distributed MOB compactions
[ https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-22749: -- Summary: Distributed MOB compactions (was: HBase MOB 2.0) > Distributed MOB compactions > > > Key: HBASE-22749 > URL: https://issues.apache.org/jira/browse/HBASE-22749 > Project: HBase > Issue Type: New Feature > Components: mob >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Major > Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, > HBase-MOB-2.0-v2.pdf > > > There are several drawbacks in the original MOB 1.0 (Moderate Object > Storage) implementation, which can limit the adoption of the MOB feature: > # MOB compactions are executed in a Master as a chore, which limits > scalability because all I/O goes through a single HBase Master server. > # Yarn/Mapreduce framework is required to run MOB compactions in a scalable > way, but this won’t work in a stand-alone HBase cluster. > # Two separate compactors for MOB and for regular store files and their > interactions can result in a data loss (see HBASE-22075) > The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible > implementation, which is free of the above drawbacks and can be used as a > drop in replacement in existing MOB deployments. So, these are design goals > of a MOB 2.0: > # Make MOB compactions scalable without relying on Yarn/Mapreduce framework > # Provide unified compactor for both MOB and regular store files > # Make it more robust especially w.r.t. to data losses. > # Simplify and reduce the overall MOB code. > # Provide 100% compatible implementation with MOB 1.0. > # No migration of data should be required between MOB 1.0 and MOB 2.0 - just > software upgrade. -- This message was sent by Atlassian JIRA (v7.6.14#76016)