[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2022-07-12 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell updated HBASE-22749:

Fix Version/s: 2.5.0

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2020-02-19 Thread Sean Busbey (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-22749:

Fix Version/s: 3.0.0
 Release Note: 

MOB compaction is now handled in-line with per-region compaction on region
  servers

- regions with mob data store per-hfile metadata about which mob hfiles are
  referenced
- admin requested major compaction will also rewrite MOB files; periodic RS
  initiated major compaction will not
- periodically a chore in the master will initiate a major compaction that
  will rewrite MOB values to ensure it happens. controlled by
  'hbase.mob.compaction.chore.period'. default is weekly
- control how many RS the chore requests major compaction on in parallel
  with 'hbase.mob.major.compaction.region.batch.size'. default is as
  parallel as possible.
- periodic chore in master will scan backing hfiles from regions to get the
  set of referenced mob hfiles and archive those that are no longer
  referenced. control period with 'hbase.master.mob.cleaner.period'
- Optionally, RS that are compacting mob files can limit write
  amplification by not rewriting values from mob hfiles over a certain size
  limit. opt-in by setting 'hbase.mob.compaction.type' to 'optimized'.
  control threshold by 'hbase.mob.compactions.max.file.size'.
  default is 1GiB
- Should smoothly integrate with existing MOB users via rolling upgrade.
  will delay old MOB file cleanup until per-region compaction has managed
  to compact each region at least once so that used mob hfile metadata can
  be gathered.

This improvement obviates the dataloss in HBASE-22075.
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2020-02-19 Thread Sean Busbey (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-22749:

Attachment: HBASE-22749_nightly_Unit_Test_Results.csv

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2020-02-19 Thread Sean Busbey (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-22749:

Attachment: HBASE-22749_nightly_unit_test_analyzer.pdf

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Patch Available  (was: Open)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-master-v4.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Open  (was: Patch Available)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Patch Available  (was: Open)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-master-v3.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Open  (was: Patch Available)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.1.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.2.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v1.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.3.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v3.0.pdf

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.3.pdf, HBase-MOB-2.0-v2.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-13 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v2.3.pdf

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.3.pdf, HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-10-12 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-master-v2.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-10-12 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Patch Available  (was: Open)

Code cleanup. unit tests fixes. 

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-13 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-master-v1.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-13 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBASE-22749-branch-2.2-v3.patch)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-12 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-branch-2.2-v4.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, 
> HBASE-22749-branch-2.2-v4.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-11 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-branch-2.2-v3.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-04 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v2.2.pdf

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-08-07 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Summary: Distributed MOB compactions   (was: HBase MOB 2.0)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust especially w.r.t. to data losses. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)