[jira] [Commented] (HBASE-23887) BlockCache performance improve by reduce eviction rate

2020-09-22 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200361#comment-17200361
 ] 

Vladimir Rodionov commented on HBASE-23887:
---

{quote}
That's why this feature works when the data really can't fit into the BlockCache: 
the eviction process works really hard, which usually means the blocks being read are evenly 
distributed.
{quote}

For such use cases (if they exist in the wild) the cache is no help and must be 
disabled. HBase can do that per table/CF. I do not see any improvement in this 
"feature". It is just a "use the cache slightly when the data is not cacheable" type of 
improvement.

> BlockCache performance improve by reduce eviction rate
> --
>
> Key: HBASE-23887
> URL: https://issues.apache.org/jira/browse/HBASE-23887
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache, Performance
>Reporter: Danil Lipovoy
>Assignee: Danil Lipovoy
>Priority: Minor
> Attachments: 1582787018434_rs_metrics.jpg, 
> 1582801838065_rs_metrics_new.png, BC_LongRun.png, 
> BlockCacheEvictionProcess.gif, BlockCacheEvictionProcess.gif, cmp.png, 
> evict_BC100_vs_BC23.png, eviction_100p.png, eviction_100p.png, 
> eviction_100p.png, gc_100p.png, graph.png, image-2020-06-07-08-11-11-929.png, 
> image-2020-06-07-08-19-00-922.png, image-2020-06-07-12-07-24-903.png, 
> image-2020-06-07-12-07-30-307.png, image-2020-06-08-17-38-45-159.png, 
> image-2020-06-08-17-38-52-579.png, image-2020-06-08-18-35-48-366.png, 
> image-2020-06-14-20-51-11-905.png, image-2020-06-22-05-57-45-578.png, 
> ratio.png, ratio2.png, read_requests_100pBC_vs_23pBC.png, requests_100p.png, 
> requests_100p.png, requests_new2_100p.png, requests_new_100p.png, scan.png, 
> scan_and_gets.png, scan_and_gets2.png, wave.png
>
>
> Hi!
> This is my first time here; please correct me if something is wrong.
> All the latest information is here:
> [https://docs.google.com/document/d/1X8jVnK_3lp9ibpX6lnISf_He-6xrHZL0jQQ7hoTV0-g/edit?usp=sharing]
> I want to propose a way to improve performance when the data in HFiles is much larger than 
> the BlockCache (a usual story in Big Data). The idea is to cache only part of the DATA 
> blocks. This is good because LruBlockCache starts to work and saves a huge amount 
> of GC.
> Sometimes we have more data than can fit into the BlockCache, and this causes a 
> high rate of evictions. In this case we can skip caching block N and instead 
> cache block N+1. We would evict block N quite soon anyway, and that is why 
> skipping it is good for performance.
> ---
> Some of the information below is out of date
> ---
>  
>  
> Example:
> Imagine we have a small cache that can fit only 1 block, and we are trying to 
> read 3 blocks with offsets:
>  124
>  198
>  223
> Current way: we put block 124, then put 198, evict 124, put 223, evict 
> 198. A lot of work (5 actions).
> With the feature: the last two digits are evenly distributed from 0 to 99. When we 
> take the offsets modulo 100 we get:
>  124 -> 24
>  198 -> 98
>  223 -> 23
> This lets us partition them. Some part, for example those below 50 (if we set 
> *hbase.lru.cache.data.block.percent* = 50), go into the cache, and the others 
> are skipped. It means we will not try to handle block 198 and save CPU for 
> other work. As a result, we put block 124, then put 223, evict 124 (3 
> actions).
> See the picture in the attachment with the test below. Requests per second are higher, 
> GC is lower.
>  
>  The key point of the code:
>  Added the parameter *hbase.lru.cache.data.block.percent*, which defaults to 
> 100
>   
>  But if we set it to 1-99, then the following logic applies:
>   
>   
> {code:java}
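> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   // Skip a share of DATA blocks when the configured percentage is below 100.
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
>     // The last two decimal digits of the offset are treated as evenly spread over 0..99.
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
>       return;
>     }
>   }
>   ...
>   // the same code as usual
> }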
> {code}
>  
> Other parameters help to control when this logic is enabled, so that it 
> works only while heavy reading is going on.
> hbase.lru.cache.heavy.eviction.count.limit - sets how many times the 
> eviction process has to run before we start to avoid putting data into the BlockCache
>  hbase.lru.cache.heavy.eviction.bytes.size.limit - sets how many bytes have to 
> be evicted each time before we start to avoid putting data into the BlockCache
> By default: if more than 10 MB was evicted each time for 10 times (100 seconds), 
> then we start to skip 50% of data blocks.
>  When the heavy eviction process ends, the new logic is switched off and all blocks are put into 
> the BlockCache again.
>   
> Descriptions of the test:
> 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 4 RegionServers
> 4 tables by 64 regions by 1.88 Gb data in each = 600 Gb total (only FAST_DIFF)
> Total BlockCache Size = 48 Gb (8 % of data in HFiles)
> Random read in 20 threads
>  
> I am going to ma
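
The switching behaviour described in the quoted issue (start skipping data blocks only after repeated heavy eviction runs, then go back to caching everything) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the class name, the method names and the "reset on the first light run" rule are hypothetical, and only the three default values (10 runs, 10 MB, 50%) come from the description.

```java
// A minimal sketch, assuming the eviction thread reports how many bytes it freed per run.
// All names here are hypothetical; they do not come from LruBlockCache.
class HeavyEvictionSketch {
    private final int countLimit;      // stands in for hbase.lru.cache.heavy.eviction.count.limit
    private final long bytesLimit;     // stands in for hbase.lru.cache.heavy.eviction.bytes.size.limit
    private final int reducedPercent;  // share of data blocks still cached under heavy eviction
    private int heavyRuns = 0;         // consecutive runs that evicted more than bytesLimit

    HeavyEvictionSketch(int countLimit, long bytesLimit, int reducedPercent) {
        this.countLimit = countLimit;
        this.bytesLimit = bytesLimit;
        this.reducedPercent = reducedPercent;
    }

    /** Records one eviction run and returns the percentage of data blocks to cache next. */
    int onEvictionRun(long bytesEvicted) {
        if (bytesEvicted > bytesLimit) {
            heavyRuns++;
        } else {
            heavyRuns = 0; // assumption: a single light run switches the feature off again
        }
        return currentPercent();
    }

    int currentPercent() {
        return heavyRuns >= countLimit ? reducedPercent : 100;
    }

    public static void main(String[] args) {
        HeavyEvictionSketch detector = new HeavyEvictionSketch(10, 10L * 1024 * 1024, 50);
        for (int run = 1; run <= 12; run++) {
            int percent = detector.onEvictionRun(11L * 1024 * 1024);
            System.out.println("run " + run + ": cache " + percent + "% of data blocks");
        }
    }
}
```

Keeping the counter separate from the cache itself means the caching path only has to read one integer per block, which fits the stated goal of saving CPU during heavy reads.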

[jira] [Commented] (HBASE-14847) Add FIFO compaction section to HBase book

2020-08-18 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17179985#comment-17179985
 ] 

Vladimir Rodionov commented on HBASE-14847:
---

Sure, go ahead.

> Add FIFO compaction section to HBase book
> -
>
> Key: HBASE-14847
> URL: https://issues.apache.org/jira/browse/HBASE-14847
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Priority: Major
>
> HBASE-14468 introduced new compaction policy. Book needs to be updated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-24101) Correct snapshot handling

2020-04-18 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-24101.
---
Resolution: Not A Problem

> Correct snapshot handling
> -
>
> Key: HBASE-24101
> URL: https://issues.apache.org/jira/browse/HBASE-24101
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob, snapshots
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>
> Reopening this umbrella to address correct snapshot handling. Particularly, 
> the following scenario must be verified:
> # load data to a table
> # take snapshot
> # major compact table
> # run mob file cleaner chore
> # load data to table
> # restore table from snapshot into another table
> # verify data integrity
> # restore table from snapshot into original table
> # verify data integrity





[jira] [Resolved] (HBASE-22749) Distributed MOB compactions

2020-04-18 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-22749.
---
Resolution: Fixed

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust, especially w.r.t. data loss. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.





[jira] [Commented] (HBASE-24101) Correct snapshot handling

2020-04-18 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17086728#comment-17086728
 ] 

Vladimir Rodionov commented on HBASE-24101:
---

Verified, not the issue.

> Correct snapshot handling
> -
>
> Key: HBASE-24101
> URL: https://issues.apache.org/jira/browse/HBASE-24101
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob, snapshots
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>
> Reopening this umbrella to address correct snapshot handling. Particularly, 
> the following scenario must be verified:
> # load data to a table
> # take snapshot
> # major compact table
> # run mob file cleaner chore
> # load data to table
> # restore table from snapshot into another table
> # verify data integrity
> # restore table from snapshot into original table
> # verify data integrity





[jira] [Created] (HBASE-24101) Correct snapshot handling

2020-04-01 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-24101:
-

 Summary: Correct snapshot handling
 Key: HBASE-24101
 URL: https://issues.apache.org/jira/browse/HBASE-24101
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov








[jira] [Commented] (HBASE-23723) Add tests for MOB compaction on a table created from snapshot

2020-04-01 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073132#comment-17073132
 ] 

Vladimir Rodionov commented on HBASE-23723:
---

Reopening this umbrella to address correct snapshot handling. Particularly, the 
following scenario must be verified:
# load data to a table
# take snapshot
# major compact table
# run mob file cleaner chore
# load data to table
# restore table from snapshot into another table
# verify data integrity
# restore table from snapshot into original table
# verify data integrity

> Add tests for MOB compaction on a table created from snapshot
> -
>
> Key: HBASE-23723
> URL: https://issues.apache.org/jira/browse/HBASE-23723
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction, mob
>Reporter: Vladimir Rodionov
>Assignee: Sean Busbey
>Priority: Blocker
>
> How does the code handle the snapshot naming convention for MOB files?





[jira] [Comment Edited] (HBASE-23723) Add tests for MOB compaction on a table created from snapshot

2020-04-01 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073132#comment-17073132
 ] 

Vladimir Rodionov edited comment on HBASE-23723 at 4/1/20, 7:38 PM:


Reopening this umbrella to address correct snapshot handling. Particularly, the 
following scenario must be verified:
# load data to a table
# take snapshot
# major compact table
# run mob file cleaner chore
# load data to table
# restore table from snapshot into another table
# verify data integrity
# restore table from snapshot into original table
# verify data integrity


was (Author: vrodionov):
Reopening this umbrella to address correct snapshot handling. Particularly, the 
following scenario must be verified:
#load data to a table
#take snapshot
#major compact table
#run mob file cleaner chore
#load data to table
#restore table from snapshot into another table
#verify data integrity
#restore table from snapshot into original table
#verify data integrity

> Add tests for MOB compaction on a table created from snapshot
> -
>
> Key: HBASE-23723
> URL: https://issues.apache.org/jira/browse/HBASE-23723
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction, mob
>Reporter: Vladimir Rodionov
>Assignee: Sean Busbey
>Priority: Blocker
>
> How does the code handle the snapshot naming convention for MOB files?





[jira] [Comment Edited] (HBASE-22749) Distributed MOB compactions

2020-04-01 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073122#comment-17073122
 ] 

Vladimir Rodionov edited comment on HBASE-22749 at 4/1/20, 7:37 PM:


Reopening this umbrella to address correct snapshot handling. Particularly, the 
following scenario must be verified:

# load data to a table
# take snapshot
# major compact table
# run mob file cleaner chore
# load data to table
# restore table from snapshot into another table
# verify data integrity 
# restore table from snapshot into original table
# verify data integrity


was (Author: vrodionov):
Reopening this umbrella to address correct snapshot handling 

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust, especially w.r.t. data loss. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.





[jira] [Reopened] (HBASE-22749) Distributed MOB compactions

2020-04-01 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reopened HBASE-22749:
---

Reopening this umbrella to address correct snapshot handling 

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBASE-22749_nightly_Unit_Test_Results.csv, 
> HBASE-22749_nightly_unit_test_analyzer.pdf, HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust, especially w.r.t. data loss. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.





[jira] [Resolved] (HBASE-23363) MobCompactionChore takes a long time to complete once job

2020-03-05 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23363.
---
Resolution: Won't Fix

HBASE-22749 has introduced distributed MOB compaction, which significantly 
improves performance. Distributed MOB compaction will be back-ported to 2.x 
branches soon. 

> MobCompactionChore takes a long time to complete once job
> -
>
> Key: HBASE-23363
> URL: https://issues.apache.org/jira/browse/HBASE-23363
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.1, 2.2.2
>Reporter: Bo Cui
>Priority: Major
> Attachments: image-2019-12-04-11-01-20-352.png
>
>
> MOB table compaction is done in the master.
>  The poolSize of the HBase ChoreService is 1.
>  If HBase has 1000 MOB tables, MobCompactionChore takes a long time to complete 
> one job, and other chores need to wait.
> !image-2019-12-04-11-01-20-352.png!
> {code:java}
> MobCompactionChore#chore() {
>   ...
>   for (TableDescriptor htd : map.values()) {
>     ...
>     for (ColumnFamilyDescriptor hcd : htd.getColumnFamilies()) {
>       if (hcd.isMobEnabled()) {
>         // Runs inline on the chore thread, one MOB column family after another.
>         MobUtils.doMobCompaction(...);
>       }
>     }
>     ...
>   }
>   ...
> }
> {code}
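
The effect the reporter describes (one long chore holding up every other chore when the pool has a single thread) can be reproduced with a plain JDK executor. The sketch below is only an illustration: it uses `Executors.newFixedThreadPool(1)` as a stand-in for the chore service, and the class name, method name and event strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// A minimal sketch, assuming a pool of size 1 stands in for the chore service.
class SingleThreadChoreSketch {
    /** Queues a long "MOB compaction" chore and an unrelated chore; returns events in execution order. */
    static List<String> run(int mobTableCount) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        ExecutorService chorePool = Executors.newFixedThreadPool(1);

        // The long chore: walks every MOB table sequentially on the only pool thread.
        chorePool.execute(() -> {
            for (int i = 0; i < mobTableCount; i++) {
                events.add("compacted-mob-table-" + i);
            }
        });
        // The unrelated chore: cannot start until the loop above has finished.
        chorePool.execute(() -> events.add("other-chore-ran"));

        chorePool.shutdown();
        try {
            chorePool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        // The unrelated chore always appears last in the output.
        run(5).forEach(System.out::println);
    }
}
```

With a single worker thread the second task always runs after the whole loop, so its delay grows linearly with the number of MOB tables.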





[jira] [Updated] (HBASE-22075) Potential data loss when MOB compaction fails

2020-03-05 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22075:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

This problem has been addressed in HBASE-22749.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.1.10, 2.2.5, 2.0.7
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 





[jira] [Commented] (HBASE-23887) BlockCache performance improve

2020-02-24 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043810#comment-17043810
 ] 

Vladimir Rodionov commented on HBASE-23887:
---

So, basically, you propose to cache only some % of data blocks randomly? Hmm. 
Do you do a lot of large scans? Large scans thrash the block cache unless they are set to 
bypass it. Disabling block caching for large scan operations can help.

> BlockCache performance improve
> --
>
> Key: HBASE-23887
> URL: https://issues.apache.org/jira/browse/HBASE-23887
> Project: HBase
>  Issue Type: New Feature
>Reporter: Danil Lipovoy
>Priority: Minor
> Attachments: cmp.png
>
>
> Hi!
> This is my first time here; please correct me if something is wrong.
> I want to propose a way to improve performance when the data in HFiles is much larger than 
> the BlockCache (a usual story in Big Data). The idea is to cache only part of the DATA 
> blocks. This is good because LruBlockCache starts to work and saves a huge amount 
> of GC. See the picture in the attachment with the test below. Requests per second are 
> higher, GC is lower.
> The key point of the code:
> Added the parameter: *hbase.lru.cache.data.block.percent* which by default = 
> 100
>  
> But if we set it 0-99, then will work the next logic:
>  
>  
> {code:java}
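> public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
>   // Skip a share of DATA blocks when the configured percentage is below 100.
>   if (cacheDataBlockPercent != 100 && buf.getBlockType().isData()) {
>     // The last two decimal digits of the offset are treated as evenly spread over 0..99.
>     if (cacheKey.getOffset() % 100 >= cacheDataBlockPercent) {
>       return;
>     }
>   }
>   ...
>   // the same code as usual
> }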
> {code}
>  
>  
> Descriptions of the test:
> 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 4 RegionServers
> 4 tables by 64 regions by 1.88 Gb data in each = 600 Gb total (only FAST_DIFF)
> Total BlockCache Size = 48 Gb (8 % of data in HFiles)
> Random read in 20 threads
>  
> I am going to make a pull request; I hope this is the right way to 
> contribute to this cool product.  
>  
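
To make the effect of the offset check in the quoted code concrete, here is a small self-contained simulation. It is a sketch under simplifying assumptions, not the real LruBlockCache: the cache is modelled as a FIFO queue holding a fixed number of blocks, every put and every evict counts as one action, and the class name, method names and the three block offsets (124, 198, 223) are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A minimal sketch of the "cache only part of the data blocks" rule; names are hypothetical.
class PartialCachingSketch {
    /** Mirrors the quoted check: a data block is cached only if offset % 100 < percent. */
    static boolean shouldCache(long offset, int percent) {
        return percent == 100 || offset % 100 < percent;
    }

    /** Counts put and evict operations for a FIFO cache that holds `capacity` blocks. */
    static int countActions(long[] offsets, int capacity, int percent) {
        Deque<Long> cache = new ArrayDeque<>();
        int actions = 0;
        for (long offset : offsets) {
            if (!shouldCache(offset, percent)) {
                continue; // skipped blocks cost neither a put nor a later eviction
            }
            cache.addLast(offset);
            actions++; // put
            if (cache.size() > capacity) {
                cache.removeFirst();
                actions++; // evict the oldest block
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        long[] offsets = {124, 198, 223};
        System.out.println("percent=100: " + countActions(offsets, 1, 100) + " actions"); // 5
        System.out.println("percent=50:  " + countActions(offsets, 1, 50) + " actions");  // 3
    }
}
```

With the percentage set to 50, the block at offset 198 is never inserted (198 % 100 = 98), so the cache does three operations instead of five for the same three reads.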





[jira] [Assigned] (HBASE-23854) Documentation update of external_apis.adoc#example-scala-code

2020-02-18 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reassigned HBASE-23854:
-

Assignee: (was: Vladimir Rodionov)

> Documentation update of external_apis.adoc#example-scala-code
> -
>
> Key: HBASE-23854
> URL: https://issues.apache.org/jira/browse/HBASE-23854
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Michael Heil
>Priority: Trivial
>  Labels: beginner
> Attachments: HBASE-23854.patch
>
>
> Update the Example Scala Code in the Reference Guide as it contains 
> deprecated content such as 
>  * new HBaseConfiguration()
>  * new HTable(conf, "mytable")
>  * add(Bytes.toBytes("ids"),Bytes.toBytes("id1"),Bytes.toBytes("one"))
> Replace it with:
>  * HBaseConfiguration.create()
>  * TableName.valueOf("mytable")
>  * addColumn(Bytes.toBytes("ids"),Bytes.toBytes("id1"),Bytes.toBytes("one"))





[jira] [Resolved] (HBASE-23840) Revert optimized IO back to general compaction during upgrade/migration process

2020-02-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23840.
---
Resolution: Fixed

> Revert optimized IO back to general compaction during upgrade/migration 
> process 
> 
>
> Key: HBASE-23840
> URL: https://issues.apache.org/jira/browse/HBASE-23840
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>
> Compaction in optimized IO mode may leave an old MOB file whose size is above the 
> threshold as is, without compacting it.





[jira] [Updated] (HBASE-23840) Revert optimized IO back to general compaction during upgrade/migration process

2020-02-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-23840:
--
Summary: Revert optimized IO back to general compaction during 
upgrade/migration process   (was: Revert optimized IO backt to general 
compaction during upgrade/migration process )

> Revert optimized IO back to general compaction during upgrade/migration 
> process 
> 
>
> Key: HBASE-23840
> URL: https://issues.apache.org/jira/browse/HBASE-23840
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>
> Compaction in optimized IO mode may leave an old MOB file whose size is above the 
> threshold as is, without compacting it.





[jira] [Created] (HBASE-23840) Revert optimized IO backt to general compaction during upgrade/migration process

2020-02-14 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23840:
-

 Summary: Revert optimized IO backt to general compaction during 
upgrade/migration process 
 Key: HBASE-23840
 URL: https://issues.apache.org/jira/browse/HBASE-23840
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Compaction in optimized IO mode may leave an old MOB file whose size is above the 
threshold as is, without compacting it.





[jira] [Created] (HBASE-23724) Change code in StoreFileInfo to use regex matcher for mob files.

2020-01-22 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23724:
-

 Summary: Change code in StoreFileInfo to use regex matcher for mob 
files.
 Key: HBASE-23724
 URL: https://issues.apache.org/jira/browse/HBASE-23724
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Currently it sits on top of another regex with additional logic added. The code 
should be simplified.





[jira] [Created] (HBASE-23723) Add tests for MOB compaction on a table created from snapshot

2020-01-22 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23723:
-

 Summary: Add tests for MOB compaction on a table created from 
snapshot
 Key: HBASE-23723
 URL: https://issues.apache.org/jira/browse/HBASE-23723
 Project: HBase
  Issue Type: Sub-task
 Environment: How does code  handle snapshot naming convention for MOB 
files.
Reporter: Vladimir Rodionov








[jira] [Created] (HBASE-23571) Handle CompactType.MOB correctly

2019-12-12 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23571:
-

 Summary: Handle CompactType.MOB correctly
 Key: HBASE-23571
 URL: https://issues.apache.org/jira/browse/HBASE-23571
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


This is a client-facing feature; it should be supported or at least properly handled.





[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-12-09 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991906#comment-16991906
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Created new PR, old one was closed as obsolete.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in a data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide 100% MOB 1.0 - compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop in replacement in existing MOB deployments. So, these are design goals 
> of a MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide unified compactor for both MOB and regular store files
> # Make it more robust, especially w.r.t. data loss. 
> # Simplify and reduce the overall MOB code.
> # Provide 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> software upgrade.





[jira] [Commented] (HBASE-23363) MobCompactionChore takes a long time to complete once job

2019-12-03 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987548#comment-16987548
 ] 

Vladimir Rodionov commented on HBASE-23363:
---

Please refer to HBASE-22749 for the new distributed MOB compaction 
implementation. It is going to be the next MOB implementation soon. I do not think anybody 
will be working on optimizing the old MOB compaction.

> MobCompactionChore takes a long time to complete once job
> -
>
> Key: HBASE-23363
> URL: https://issues.apache.org/jira/browse/HBASE-23363
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.1, 2.2.2
>Reporter: Bo Cui
>Priority: Major
> Attachments: image-2019-12-04-11-01-20-352.png
>
>
> MOB table compaction is done in the master.
>  The poolSize of the HBase ChoreService is 1.
>  If HBase has 1000 MOB tables, MobCompactionChore takes a long time to complete 
> one job, and other chores need to wait.
> !image-2019-12-04-11-01-20-352.png!
> {code:java}
> MobCompactionChore#chore() {
>   ...
>   for (TableDescriptor htd : map.values()) {
>     ...
>     for (ColumnFamilyDescriptor hcd : htd.getColumnFamilies()) {
>       if (hcd.isMobEnabled()) {
>         // Runs inline on the chore thread, one MOB column family after another.
>         MobUtils.doMobCompaction(...);
>       }
>     }
>     ...
>   }
>   ...
> }
> {code}





[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Patch Available  (was: Open)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several drawbacks in the original MOB 1.0 (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:
> # MOB compactions are executed in the Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server.
> # The YARN/MapReduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files, and their 
> interactions, can result in data loss (see HBASE-22075).
> The design goals for MOB 2.0 were to provide a 100% MOB 1.0-compatible 
> implementation that is free of the above drawbacks and can be used as a 
> drop-in replacement in existing MOB deployments. So, these are the design goals 
> of MOB 2.0:
> # Make MOB compactions scalable without relying on the YARN/MapReduce framework.
> # Provide a unified compactor for both MOB and regular store files.
> # Make it more robust, especially w.r.t. data loss.
> # Simplify and reduce the overall MOB code.
> # Provide an implementation that is 100% compatible with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0, just a 
> software upgrade.





[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-master-v4.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Open  (was: Patch Available)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Patch Available  (was: Open)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-master-v3.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-27 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Open  (was: Patch Available)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Commented] (HBASE-23189) Finalize I/O optimized MOB compaction

2019-11-22 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980442#comment-16980442
 ] 

Vladimir Rodionov commented on HBASE-23189:
---

Closing; it passes stress tests up to 6M (above 6M, HBase fails with 
NotServingRegionExceptions, which are not related to the feature but to a master 
branch stability issue). I will mark this feature as *experimental* in the release 
notes. 

> Finalize I/O optimized MOB compaction
> -
>
> Key: HBASE-23189
> URL: https://issues.apache.org/jira/browse/HBASE-23189
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>
> +corresponding test cases
> The current code for I/O optimized compaction has not been tested and 
> verified yet. 





[jira] [Resolved] (HBASE-23189) Finalize I/O optimized MOB compaction

2019-11-22 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23189.
---
Resolution: Fixed

> Finalize I/O optimized MOB compaction
> -
>
> Key: HBASE-23189
> URL: https://issues.apache.org/jira/browse/HBASE-23189
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>


[jira] [Commented] (HBASE-23189) Finalize I/O optimized MOB compaction

2019-11-20 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978874#comment-16978874
 ] 

Vladimir Rodionov commented on HBASE-23189:
---

Pushed the first implementation to the parent's PR branch.

> Finalize I/O optimized MOB compaction
> -
>
> Key: HBASE-23189
> URL: https://issues.apache.org/jira/browse/HBASE-23189
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>


[jira] [Updated] (HBASE-23189) Finalize I/O optimized MOB compaction

2019-11-20 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-23189:
--
Description: 
+corresponding test cases

The current code for I/O optimized compaction has not been tested and verified 
yet. 

  was:
+corresponding test cases

The current code for generational compaction has not been tested and verified 
yet. 


> Finalize I/O optimized MOB compaction
> -
>
> Key: HBASE-23189
> URL: https://issues.apache.org/jira/browse/HBASE-23189
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>


[jira] [Updated] (HBASE-23189) Finalize I/O optimized MOB compaction

2019-11-20 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-23189:
--
Summary: Finalize I/O optimized MOB compaction  (was: Finalize generational 
compaction)

> Finalize I/O optimized MOB compaction
> -
>
> Key: HBASE-23189
> URL: https://issues.apache.org/jira/browse/HBASE-23189
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>
> +corresponding test cases
> The current code for generational compaction has not been tested and verified 
> yet. 





[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974684#comment-16974684
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Updated design document, bumped version to 3.0.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.1.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.2.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v1.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.3.pdf)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-14 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v3.0.pdf

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.3.pdf, HBase-MOB-2.0-v2.pdf, HBase-MOB-2.0-v3.0.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-11-13 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v2.3.pdf

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.3.pdf, HBase-MOB-2.0-v2.pdf
>
>
> There are several drawbacks in the original MOB 1.0 (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:
> # MOB compactions are executed in the Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server.
> # The YARN/MapReduce framework is required to run MOB compactions in a 
> scalable way, but this won't work in a stand-alone HBase cluster.
> # Two separate compactors, one for MOB and one for regular store files, and 
> their interactions can result in data loss (see HBASE-22075).
> The design goal for MOB 2.0 was to provide a 100% MOB 1.0-compatible 
> implementation that is free of the above drawbacks and can be used as a 
> drop-in replacement in existing MOB deployments. The design goals of MOB 2.0 
> are:
> # Make MOB compactions scalable without relying on the YARN/MapReduce framework.
> # Provide a unified compactor for both MOB and regular store files.
> # Make it more robust, especially w.r.t. data loss.
> # Simplify and reduce the overall MOB code.
> # Provide an implementation that is 100% compatible with MOB 1.0.
> # Require no data migration between MOB 1.0 and MOB 2.0, just a 
> software upgrade.





[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-11-13 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973871#comment-16973871
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Updated the design doc with a description of the new I/O-optimized compaction algorithm.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.3.pdf, HBase-MOB-2.0-v2.pdf
>
>





[jira] [Resolved] (HBASE-23267) Test case for MOB compaction in a regular mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23267.
---
Resolution: Fixed

Resolved. Pushed to the parent's PR branch.

> Test case for MOB compaction in a regular mode.
> ---
>
> Key: HBASE-23267
> URL: https://issues.apache.org/jira/browse/HBASE-23267
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>
> We need this test case too.
> Test case description (similar to HBASE-23266):
> {code}
> /**
>   * Mob file compaction chore in default regular mode test.
>   * 1. Enables non-batch mode (default) for regular MOB compaction,
>   *    sets batch size to 7 regions.
>   * 2. Disables periodic MOB compactions, sets minimum age to archive to 10 sec.
>   * 3. Creates MOB table with 20 regions.
>   * 4. Loads MOB data (randomized keys, 1000 rows), flushes data.
>   * 5. Repeats 4. two more times.
>   * 6. Verifies that we have 20 * 3 = 60 MOB files (number of regions x 3).
>   * 7. Runs major MOB compaction.
>   * 8. Verifies that the number of MOB files in the MOB directory is 20 x 4 = 80.
>   * 9. Waits for a period of time larger than the minimum age to archive.
>   * 10. Runs MOB cleaner chore.
>   * 11. Verifies that the number of MOB files in the MOB directory is 20.
>   * 12. Runs scanner and checks all 3 * 1000 rows.
>   */
> {code}
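
The file counts checked in steps 6, 8, and 11 follow from simple bookkeeping. The sketch below is a plain-Java model of that arithmetic, not HBase test code. The class and method names are hypothetical, and it assumes that each flush writes one MOB file per region, that a major MOB compaction writes one new file per region while the old files stay in place, and that the cleaner chore later removes every superseded file.

```java
class MobFileCountModel {
    // Assumption: every flush produces exactly one MOB file per region.
    static int filesAfterFlushes(int regions, int flushes) {
        return regions * flushes;
    }

    // Assumption: major MOB compaction adds one compacted file per region
    // and leaves the old files in place until the cleaner runs.
    static int filesAfterMajorCompaction(int regions, int flushes) {
        return filesAfterFlushes(regions, flushes) + regions;
    }

    // Assumption: the cleaner removes all superseded files,
    // leaving one compacted file per region.
    static int filesAfterCleaner(int regions) {
        return regions;
    }

    public static void main(String[] args) {
        int regions = 20;
        int flushes = 3;
        System.out.println(filesAfterFlushes(regions, flushes));          // 60
        System.out.println(filesAfterMajorCompaction(regions, flushes));  // 80
        System.out.println(filesAfterCleaner(regions));                   // 20
    }
}
```

Under these assumptions the "20 x 4" in step 8 is the 3 flushed files plus 1 compacted file per region.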





[jira] [Updated] (HBASE-23267) Test case for MOB compaction in a regular mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-23267:
--
Description: 
We need this test case too. 

Test case description (similar to HBASE-23266):
{code}
/**
  * Mob file compaction chore in batch mode test.
  * 1. Enables non-batch mode (default) for regular MOB compaction,
  *    sets batch size to 7 regions.
  * 2. Disables periodic MOB compactions, sets minimum age to archive to 10 sec.
  * 3. Creates MOB table with 20 regions.
  * 4. Loads MOB data (randomized keys, 1000 rows), flushes data.
  * 5. Repeats 4. two more times.
  * 6. Verifies that we have 20 * 3 = 60 MOB files (number of regions x 3).
  * 7. Runs major MOB compaction.
  * 8. Verifies that the number of MOB files in the MOB directory is 20 x 4 = 80.
  * 9. Waits for a period of time larger than the minimum age to archive.
  * 10. Runs MOB cleaner chore.
  * 11. Verifies that the number of MOB files in the MOB directory is 20.
  * 12. Runs scanner and checks all 3 * 1000 rows.
  */
{code}


  was:We need this test case too. 


> Test case for MOB compaction in a regular mode.
> ---
>
> Key: HBASE-23267
> URL: https://issues.apache.org/jira/browse/HBASE-23267
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>





[jira] [Updated] (HBASE-23267) Test case for MOB compaction in a regular mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-23267:
--
Description: 
We need this test case too. 

Test case description (similar to HBASE-23266):
{code}
/**
  * Mob file compaction chore in default regular mode test.
  * 1. Enables non-batch mode (default) for regular MOB compaction,
  *    sets batch size to 7 regions.
  * 2. Disables periodic MOB compactions, sets minimum age to archive to 10 sec.
  * 3. Creates MOB table with 20 regions.
  * 4. Loads MOB data (randomized keys, 1000 rows), flushes data.
  * 5. Repeats 4. two more times.
  * 6. Verifies that we have 20 * 3 = 60 MOB files (number of regions x 3).
  * 7. Runs major MOB compaction.
  * 8. Verifies that the number of MOB files in the MOB directory is 20 x 4 = 80.
  * 9. Waits for a period of time larger than the minimum age to archive.
  * 10. Runs MOB cleaner chore.
  * 11. Verifies that the number of MOB files in the MOB directory is 20.
  * 12. Runs scanner and checks all 3 * 1000 rows.
  */
{code}


  was:
We need this test case too. 

Test case description (similar to HBASE-23266):
{code}
/**
  * Mob file compaction chore in batch mode test.
  * 1. Enables non-batch mode (default) for regular MOB compaction, 
  *Sets batch size to 7 regions.
  * 2. Disables periodic MOB compactions, sets minimum age to archive to 10 sec 
  
  * 3. Creates MOB table with 20 regions
  * 4. Loads MOB data (randomized keys, 1000 rows), flushes data.
  * 5. Repeats 4. two more times
  * 6. Verifies that we have 20 *3 = 60 mob files (equals to number of regions 
x 3)
  * 7. Runs major MOB compaction.
  * 8. Verifies that number of MOB files in a mob directory is 20 x4 = 80
  * 9. Waits for a period of time larger than minimum age to archive 
  * 10. Runs Mob cleaner chore
  * 11 Verifies that number of MOB files in a mob directory is 20.
  * 12 Runs scanner and checks all 3 * 1000 rows.
{code}



> Test case for MOB compaction in a regular mode.
> ---
>
> Key: HBASE-23267
> URL: https://issues.apache.org/jira/browse/HBASE-23267
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>





[jira] [Work started] (HBASE-23267) Test case for MOB compaction in a regular mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-23267 started by Vladimir Rodionov.
-
> Test case for MOB compaction in a regular mode.
> ---
>
> Key: HBASE-23267
> URL: https://issues.apache.org/jira/browse/HBASE-23267
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>





[jira] [Updated] (HBASE-23266) Test case for MOB compaction in a region's batch mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-23266:
--
Description: 
Major MOB compaction in general (non-generational) mode can be run in a 
batched mode (disabled by default). In this mode, only a subset of regions is 
compacted at a time to mitigate possible compaction storms. We need a test case 
for this mode.

Test case description:
{code}
/**
  * Mob file compaction chore in batch mode test.
  * 1. Enables batch mode for regular MOB compaction,
  *    sets batch size to 7 regions.
  * 2. Disables periodic MOB compactions, sets minimum age to archive to 10 sec.
  * 3. Creates MOB table with 20 regions.
  * 4. Loads MOB data (randomized keys, 1000 rows), flushes data.
  * 5. Repeats 4. two more times.
  * 6. Verifies that we have 20 * 3 = 60 MOB files (number of regions x 3).
  * 7. Runs major MOB compaction.
  * 8. Verifies that the number of MOB files in the MOB directory is 20 x 4 = 80.
  * 9. Waits for a period of time larger than the minimum age to archive.
  * 10. Runs MOB cleaner chore.
  * 11. Verifies that the number of MOB files in the MOB directory is 20.
  * 12. Runs scanner and checks all 3 * 1000 rows.
  */
{code}
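
To illustrate what "only a subset of regions at a time" means with the numbers used in this test, here is a minimal plain-Java sketch that splits 20 regions into batches of at most 7. It is not the HBase implementation; the class name, the method name, and the use of strings as region identifiers are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RegionBatcher {
    // Splits the region list into consecutive batches of at most batchSize.
    static <T> List<List<T>> partition(List<T> regions, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < regions.size(); start += batchSize) {
            int end = Math.min(start + batchSize, regions.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(regions.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> regions = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            regions.add("region-" + i);   // hypothetical region names
        }
        // Each batch would be compacted before the next one starts.
        for (List<String> batch : partition(regions, 7)) {
            System.out.println(batch.size() + " regions: " + batch);
        }
    }
}
```

With 20 regions and a batch size of 7, this yields three batches of 7, 7, and 6 regions, so no more than 7 regions would be compacting at once.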

  was:Major MOB compaction in a general (non-generational) mode can be run in a 
batched mode (disabled by default). In this mode, only subset of regions at a 
time are compacted to mitigate possible compaction storms. We need test case 
for this mode.


> Test case for MOB compaction in a region's batch mode.
> --
>
> Key: HBASE-23266
> URL: https://issues.apache.org/jira/browse/HBASE-23266
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>





[jira] [Created] (HBASE-23267) Test case for MOB compaction in a regular mode.

2019-11-06 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23267:
-

 Summary: Test case for MOB compaction in a regular mode.
 Key: HBASE-23267
 URL: https://issues.apache.org/jira/browse/HBASE-23267
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


We need this test case too. 





[jira] [Work started] (HBASE-23189) Finalize generational compaction

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-23189 started by Vladimir Rodionov.
-
> Finalize generational compaction
> 
>
> Key: HBASE-23189
> URL: https://issues.apache.org/jira/browse/HBASE-23189
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>
> +corresponding test cases
> The current code for generational compaction has not been tested and verified 
> yet. 





[jira] [Resolved] (HBASE-23266) Test case for MOB compaction in a region's batch mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23266.
---
Resolution: Fixed

Resolved. Pushed change to parent's PR branch.

> Test case for MOB compaction in a region's batch mode.
> --
>
> Key: HBASE-23266
> URL: https://issues.apache.org/jira/browse/HBASE-23266
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>





[jira] [Work started] (HBASE-23266) Test case for MOB compaction in a region's batch mode.

2019-11-06 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-23266 started by Vladimir Rodionov.
-
> Test case for MOB compaction in a region's batch mode.
> --
>
> Key: HBASE-23266
> URL: https://issues.apache.org/jira/browse/HBASE-23266
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>





[jira] [Created] (HBASE-23266) Test case for MOB compaction in a region's batch mode.

2019-11-06 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23266:
-

 Summary: Test case for MOB compaction in a region's batch mode.
 Key: HBASE-23266
 URL: https://issues.apache.org/jira/browse/HBASE-23266
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Major MOB compaction in general (non-generational) mode can be run in a 
batched mode (disabled by default). In this mode, only a subset of regions is 
compacted at a time to mitigate possible compaction storms. We need a test case 
for this mode.





[jira] [Resolved] (HBASE-23188) MobFileCleanerChore test case

2019-11-05 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23188.
---
Resolution: Fixed

Resolved. Pushed to parent PR branch.

> MobFileCleanerChore test case
> -
>
> Key: HBASE-23188
> URL: https://issues.apache.org/jira/browse/HBASE-23188
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Priority: Major
>
> The test should do the following:
> a) properly remove obsolete files as expected
> b) do not remove MOB files that predate the reference accounting added in 
> this change.





[jira] [Resolved] (HBASE-23190) Convert MobCompactionTest into integration test

2019-11-05 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23190.
---
Resolution: Fixed

Resolved in the last parent PR commit (11/5).

> Convert MobCompactionTest into integration test
> ---
>
> Key: HBASE-23190
> URL: https://issues.apache.org/jira/browse/HBASE-23190
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Priority: Major
>






[jira] [Commented] (HBASE-23209) Simplify logic in DefaultMobStoreCompactor

2019-10-31 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964331#comment-16964331
 ] 

Vladimir Rodionov commented on HBASE-23209:
---

The change was pushed to the parent's PR branch.

> Simplify logic in DefaultMobStoreCompactor
> --
>
> Key: HBASE-23209
> URL: https://issues.apache.org/jira/browse/HBASE-23209
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>
> The major compaction loop is quite large and has many branches, especially in 
> non-MOB mode. Consider moving MOB data only in MOB compaction mode and 
> simplifying the non-MOB case.





[jira] [Resolved] (HBASE-23209) Simplify logic in DefaultMobStoreCompactor

2019-10-31 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-23209.
---
Resolution: Fixed

> Simplify logic in DefaultMobStoreCompactor
> --
>
> Key: HBASE-23209
> URL: https://issues.apache.org/jira/browse/HBASE-23209
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>





[jira] [Commented] (HBASE-23209) Simplify logic in DefaultMobStoreCompactor

2019-10-31 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964330#comment-16964330
 ] 

Vladimir Rodionov commented on HBASE-23209:
---

Reduced the code by leaving the handling of a changed mobStoreThreshold to 
MOB compaction only. Regular compactions no longer check whether the MOB 
threshold has changed and do not handle that case.

> Simplify logic in DefaultMobStoreCompactor
> --
>
> Key: HBASE-23209
> URL: https://issues.apache.org/jira/browse/HBASE-23209
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
>





[jira] [Created] (HBASE-23209) Simplify logic in DefaultMobStoreCompactor

2019-10-23 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23209:
-

 Summary: Simplify logic in DefaultMobStoreCompactor
 Key: HBASE-23209
 URL: https://issues.apache.org/jira/browse/HBASE-23209
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


The major compaction loop is quite large and has many branches, especially in 
non-MOB mode. Consider moving MOB data only in MOB compaction mode and 
simplifying the non-MOB case.





[jira] [Created] (HBASE-23198) Documentation and release notes

2019-10-21 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23198:
-

 Summary: Documentation and release notes
 Key: HBASE-23198
 URL: https://issues.apache.org/jira/browse/HBASE-23198
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


Document all the changes: algorithms, new configuration options, obsolete 
configurations, upgrade procedure and possibility of downgrade.





[jira] [Assigned] (HBASE-23188) MobFileCleanerChore test case

2019-10-18 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov reassigned HBASE-23188:
-

Assignee: (was: Vladimir Rodionov)

> MobFileCleanerChore test case
> -
>
> Key: HBASE-23188
> URL: https://issues.apache.org/jira/browse/HBASE-23188
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Vladimir Rodionov
>Priority: Major
>





[jira] [Created] (HBASE-23190) Convert MobCompactionTest into integration test

2019-10-18 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23190:
-

 Summary: Convert MobCompactionTest into integration test
 Key: HBASE-23190
 URL: https://issues.apache.org/jira/browse/HBASE-23190
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov








[jira] [Created] (HBASE-23189) Finalize generational compaction

2019-10-18 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23189:
-

 Summary: Finalize generational compaction
 Key: HBASE-23189
 URL: https://issues.apache.org/jira/browse/HBASE-23189
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


+corresponding test cases

The current code for generational compaction has not been tested and verified 
yet. 





[jira] [Created] (HBASE-23188) MobFileCleanerChore test case

2019-10-18 Thread Vladimir Rodionov (Jira)
Vladimir Rodionov created HBASE-23188:
-

 Summary: MobFileCleanerChore test case
 Key: HBASE-23188
 URL: https://issues.apache.org/jira/browse/HBASE-23188
 Project: HBase
  Issue Type: Sub-task
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


The test should do the following:
a) properly remove obsolete files as expected
b) do not remove MOB files that predate the reference accounting added in 
this change.
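
One possible reading of requirements (a) and (b) together is sketched below in plain Java. This is a hypothetical model, not the MobFileCleanerChore code: it assumes each store file either lists the MOB files it references or, if it was written before this change, carries no such list (modelled here as `null`). It also leaves out the minimum-age check. All names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MobCleanerModel {
    // mobFiles: names of all files currently in the MOB directory.
    // storeFileRefs: for each store file, the MOB file names it references;
    // null stands for a store file that predates reference accounting (assumption).
    static Set<String> filesToDelete(Set<String> mobFiles, List<Set<String>> storeFileRefs) {
        Set<String> referenced = new HashSet<>();
        for (Set<String> refs : storeFileRefs) {
            if (refs == null) {
                // (b) References are unknown, so nothing can be proven obsolete.
                return new HashSet<>();
            }
            referenced.addAll(refs);
        }
        // (a) Any MOB file that no store file references is obsolete.
        Set<String> obsolete = new HashSet<>(mobFiles);
        obsolete.removeAll(referenced);
        return obsolete;
    }

    public static void main(String[] args) {
        Set<String> mobFiles = new HashSet<>(Arrays.asList("m1", "m2", "m3"));

        List<Set<String>> refs = new ArrayList<>();
        refs.add(new HashSet<>(Arrays.asList("m3")));
        System.out.println(filesToDelete(mobFiles, refs));   // m1 and m2 are obsolete

        refs.add(null);                                       // a pre-change store file
        System.out.println(filesToDelete(mobFiles, refs));   // nothing is deleted
    }
}
```

A test for the chore could then assert both outcomes: unreferenced files disappear once all store files carry reference lists, and nothing is removed while any older store file remains.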





[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-10-12 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-master-v2.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.pdf
>
>





[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-10-12 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Status: Patch Available  (was: Open)

Code cleanup. Unit test fixes.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, 
> HBase-MOB-2.0-v2.pdf
>
>





[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-09-13 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929511#comment-16929511
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

PR has been created.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-13 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-master-v1.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-13 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBASE-22749-branch-2.2-v3.patch)

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-12 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-branch-2.2-v4.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, 
> HBASE-22749-branch-2.2-v4.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-09-12 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928879#comment-16928879
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

v4 should build on 2.2.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, 
> HBASE-22749-branch-2.2-v4.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-09-12 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928832#comment-16928832
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Nevertheless, the build failed again:
{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.1.1:shade (aggregate-into-a-jar-with-relocated-third-parties) on project hbase-shaded-client: Error creating shaded jar: duplicate entry: META-INF/services/org.apache.hadoop.hbase.shaded.com.fasterxml.jackson.core.ObjectCodec -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hbase-shaded-client
{code}

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-09-12 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928796#comment-16928796
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

It seems that the tip of branch-2.2 is broken. This is not patch related: I 
tried to build 2.2 without the patch and it failed with multiple errors.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-09-12 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928789#comment-16928789
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Oops, will fix it.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-09-11 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928139#comment-16928139
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Uploaded patch for 2.2 branch. Master version will follow shortly. 

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-11 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBASE-22749-branch-2.2-v3.patch

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22749-branch-2.2-v3.patch, HBase-MOB-2.0-v1.pdf, 
> HBase-MOB-2.0-v2.1.pdf, HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-09-04 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922943#comment-16922943
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Updated the design document to v2.2. Added an entirely new MOB compaction 
algorithm section; the new algorithm can now reliably limit the overall 
read/write I/O amplification (the major concern so far). The initial patch is 
almost done; I just need to fix the algorithm and run tests.

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-09-04 Thread Vladimir Rodionov (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v2.2.pdf

> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.2.pdf, HBase-MOB-2.0-v2.pdf
>
>


[jira] [Comment Edited] (HBASE-22749) Distributed MOB compactions

2019-08-21 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912908#comment-16912908
 ] 

Vladimir Rodionov edited comment on HBASE-22749 at 8/22/19 3:33 AM:


It is a big list, [~busbey]. Below are some answers:
{quote}
region sizing - splitting, normalizers, etc
Need to expressly state whether or not this change to per-region accounting 
plans to alter the current assumptions that use of the feature means that the 
MOB data isn’t counted when determining region size for decisions to normalize 
or split.
{quote}

This part has not been touched, meaning that MOB 2.0 does exactly the same 
as MOB 1.0. If MOB data is not counted for normalize/split decisions now in 
MOB 1.0, it won't be in 2.0. Should it be? Probably, yes. But that is not part 
of scalable compactions.

{quote}
write amplification
{quote}

Good question. Default (non-partial) major compaction has the same or similar 
write amplification (WA) as regular HBase tiered compaction. I would not call 
this unbounded, but it is probably worse than in MOB 1.0. Partial MOB 
compaction will definitely have a bounded WA comparable to what we have in 
MOB 1.0 (where compaction is done by partitions and partitions are date-based).

The idea of partial major MOB compaction is to keep the total number of MOB 
files in a system under control (say, around 1M) by not compacting MOB files 
that have reached some size threshold (say, 1GB). If you exclude all MOB files 
above T bytes from compaction, your WA will be bounded by log_K(T/S), where 
log_K is the logarithm base K (K is the average number of files in a 
compaction selection), T is the maximum MOB file size (threshold), and S is 
the average size of a Memstore flush. This is an approximation, of course. How 
does it compare to MOB 1.0 partitioned compaction? By varying T we can get any 
WA we want. Say, if we set the limit on the number of MOB files to 10M, we can 
decrease T to 100MB, which gives us a total capacity for MOB data of 1PB. With 
a 100MB threshold, WA can be very low (low single digits). I will update the 
document and add more info on partial major MOB compactions, including the 
file selection policy.
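
As a rough illustration of the arithmetic in this comment, the Python sketch 
below evaluates the approximate bound log_K(T/S) and the total MOB capacity 
for the two thresholds mentioned. The function names, the decimal units, the 
10MB flush size and K = 10 are assumptions chosen for illustration; none of 
them come from the design document or from the HBase code.

```python
import math

MB = 10**6   # decimal units, assumed for simplicity
GB = 10**9
PB = 10**15


def wa_bound(threshold_bytes, flush_bytes, fan_in):
    """Approximate write-amplification bound log_K(T/S).

    threshold_bytes: T, files at or above this size are excluded from compaction
    flush_bytes:     S, average size of a Memstore flush
    fan_in:          K, average number of files in a compaction selection
    """
    if threshold_bytes <= flush_bytes:
        # Flushed files already reach the threshold, so they are never rewritten.
        return 0.0
    return math.log(threshold_bytes / flush_bytes, fan_in)


def total_capacity(max_files, threshold_bytes):
    """Total MOB data held by max_files files of at most threshold_bytes each."""
    return max_files * threshold_bytes


# Hypothetical workload: S = 10MB flushes, K = 10 files per selection.
S, K = 10 * MB, 10

for max_files, T in ((1_000_000, 1 * GB), (10_000_000, 100 * MB)):
    print(
        f"T={T // MB}MB files={max_files}: "
        f"WA bound ~{wa_bound(T, S, K):.2f}, "
        f"capacity={total_capacity(max_files, T) / PB:.1f}PB"
    )
```

Under these assumed values, lowering T from 1GB to 100MB reduces the bound 
from about 2 to about 1, while the capacity stays at 1PB because the file 
count limit rises from 1M to 10M.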



was (Author: vrodionov):
It is a big list, [~busbey]. Below are some answers:
{quote}
region sizing - splitting, normalizers, etc
Need to expressly state whether or not this change to per-region accounting 
plans to alter the current assumptions that use of the feature means that the 
MOB data isn’t counted when determining region size for decisions to normalize 
or split.
{quote}

This part has not been touched, meaning that MOB 2.0 does exactly the same 
as MOB 1.0. If MOB data is not counted for normalize/split decisions now in 
MOB 1.0, it won't be in 2.0. Should it be? Probably, yes. But that is not part 
of scalable compactions.

{quote}
write amplification
{quote}

Good question. Default (non-partial) major compaction has the same or similar 
write amplification (WA) as regular HBase tiered compaction. I would not call 
this unbounded, but it is probably worse than in MOB 1.0. Partial MOB 
compaction will definitely have a bounded WA comparable to what we have in 
MOB 1.0 (where compaction is done by partitions and partitions are date-based).

The idea of partial major MOB compaction is to keep the total number of MOB 
files in a system under control (say, around 1M) by not compacting MOB files 
that have reached some size threshold (say, 1GB). If you exclude all MOB files 
above 1GB from compaction, your WA will be bounded by log_K(T/S), where log_K 
is the logarithm base K (K is the average number of files in a compaction 
selection), T is the maximum MOB file size (threshold), and S is the average 
size of a Memstore flush. This is an approximation, of course. How does it 
compare to MOB 1.0 partitioned compaction? By varying T we can get any WA we 
want. Say, if we set the limit on the number of MOB files to 10M, we can 
decrease T to 100MB, which gives us a total capacity for MOB data of 1PB. With 
a 100MB threshold, WA can be very low (low single digits). I will update the 
document and add more info on partial major MOB compactions, including the 
file selection policy.


> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.pdf
>
>

[jira] [Comment Edited] (HBASE-22749) Distributed MOB compactions

2019-08-21 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912908#comment-16912908
 ] 

Vladimir Rodionov edited comment on HBASE-22749 at 8/22/19 3:33 AM:


It is a big list, [~busbey]. Below are some answers:
{quote}
region sizing - splitting, normalizers, etc
Need to expressly state whether or not this change to per-region accounting 
plans to alter the current assumptions that use of the feature means that the 
MOB data isn’t counted when determining region size for decisions to normalize 
or split.
{quote}

This part has not been touched, meaning that MOB 2.0 does exactly the same 
as MOB 1.0. If MOB data is not counted for normalize/split decisions now in 
MOB 1.0, it won't be in 2.0. Should it be? Probably, yes. But that is not part 
of scalable compactions.

{quote}
write amplification
{quote}

Good question. Default (non-partial) major compaction has the same or similar 
write amplification (WA) as regular HBase tiered compaction. I would not call 
this unbounded, but it is probably worse than in MOB 1.0. Partial MOB 
compaction will definitely have a bounded WA comparable to what we have in 
MOB 1.0 (where compaction is done by partitions and partitions are date-based).

The idea of partial major MOB compaction is to keep the total number of MOB 
files in a system under control (say, around 1M) by not compacting MOB files 
that have reached some size threshold (say, 1GB). If you exclude all MOB files 
above 1GB from compaction, your WA will be bounded by log_K(T/S), where log_K 
is the logarithm base K (K is the average number of files in a compaction 
selection), T is the maximum MOB file size (threshold), and S is the average 
size of a Memstore flush. This is an approximation, of course. How does it 
compare to MOB 1.0 partitioned compaction? By varying T we can get any WA we 
want. Say, if we set the limit on the number of MOB files to 10M, we can 
decrease T to 100MB, which gives us a total capacity for MOB data of 1PB. With 
a 100MB threshold, WA can be very low (low single digits). I will update the 
document and add more info on partial major MOB compactions, including the 
file selection policy.



was (Author: vrodionov):
It is a big list, [~busbey]. Below are some answers:
{quote}
region sizing - splitting, normalizers, etc
Need to expressly state whether or not this change to per-region accounting 
plans to alter the current assumptions that use of the feature means that the 
MOB data isn’t counted when determining region size for decisions to normalize 
or split.
{quote}

This part has not been touched, meaning that MOB 2.0 does exactly the same 
as MOB 1.0. If MOB data is not counted for normalize/split decisions now in 
MOB 1.0, it won't be in 2.0. Should it be? Probably, yes. But that is not part 
of scalable compactions.

{quote}
write amplification
{quote}

Good question. Default (non-partial) major compaction has the same or similar 
write amplification (WA) as regular HBase tiered compaction. I would not call 
this unbounded, but it is probably worse than in MOB 1.0. Partial MOB 
compaction will definitely have a bounded WA comparable to what we have in 
MOB 1.0 (where compaction is done by partitions and partitions are date-based).

The idea of partial major MOB compaction is either to keep the total number of 
MOB files in a system under control (say, around 1M), or to not compact MOB 
files that have reached some size threshold (say, 1GB). The latter case is 
easier to explain. If you exclude all MOB files above 1GB from compaction, 
your WA will be bounded by log2(T/S), where log2 is the logarithm base 2, T is 
the maximum MOB file size (threshold), and S is the average size of a Memstore 
flush. This is an approximation, of course. How does it compare to MOB 1.0 
partitioned compaction? By varying T we can get any WA we want. Say, if we set 
the limit on the number of MOB files to 10M, we can decrease T to 100MB, which 
gives us a total capacity for MOB data of 1PB. With a 100MB threshold, WA can 
be very low (low single digits). I will update the document and add more info 
on partial major MOB compactions, including the file selection policy.


> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.pdf
>
>
> There are several  drawbacks in the original MOB 1.0  (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:  
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB com

[jira] [Commented] (HBASE-22749) Distributed MOB compactions

2019-08-21 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912908#comment-16912908
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

It is a big list, [~busbey]. Below are some answers:
{quote}
region sizing - splitting, normalizers, etc
Need to expressly state whether or not this change to per-region accounting 
plans to alter the current assumptions that use of the feature means that the 
MOB data isn’t counted when determining region size for decisions to normalize 
or split.
{quote}

This part has not been touched, meaning that MOB 2.0 does exactly what MOB 1.0 
does. If MOB data is not counted for normalize/split decisions in MOB now, it 
won't be in 2.0. Should it be? Probably, yes. But it is not part of scalable 
compactions.

{quote}
write amplification
{quote}

Good question. Default (non-partial) major compaction has the same or similar 
WA as regular HBase tiered compaction. I would not call this unbounded, but it 
is probably worse than in MOB 1.0. Partial MOB compaction will definitely have 
a bounded WA comparable to what we have in MOB 1.0 (where compaction is done 
by partitions and partitions are date-based).
The idea of partial major MOB compaction is either to keep the total number of 
MOB files in a system under control (say, around 1M), or to not compact MOB 
files which have reached some size threshold (say 1GB). The latter case is 
easier to explain. If you exclude all MOB files above 1GB from compaction, 
your WA will be bounded by log2(T/S), where log2 is the logarithm base 2, T is 
the maximum MOB file size (threshold) and S is the average size of a Memstore 
flush. This is an approximation, of course. How does it compare to MOB 1.0 
partitioned compaction? By varying T we can get any WA we want. Say, if we set 
the limit on the number of MOB files to 10M, we can decrease T to 100MB and it 
will give us a total capacity for MOB data of 1PB. With a 100MB threshold, WA 
can be very low (low single digits). I will update the document and add more 
info on partial major MOB compactions, including the file selection policy.


> Distributed MOB compactions 
> 
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.pdf
>
>
> There are several drawbacks in the original MOB 1.0 (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server.
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in data loss (see HBASE-22075)
> The design goals for MOB 2.0 were to provide a 100% MOB 1.0-compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop-in replacement in existing MOB deployments. So, these are the design 
> goals of MOB 2.0:
> # Make MOB compactions scalable without relying on Yarn/Mapreduce framework
> # Provide a unified compactor for both MOB and regular store files
> # Make it more robust, especially w.r.t. data loss.
> # Simplify and reduce the overall MOB code.
> # Provide an implementation 100% compatible with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0 - just 
> a software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (HBASE-22705) IllegalArgumentException exception occured during MobFileCache eviction

2019-08-21 Thread Vladimir Rodionov (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912881#comment-16912881
 ] 

Vladimir Rodionov commented on HBASE-22705:
---

Apologies for the delay, [~pankaj2461]. An additional global lock on the MOB 
file cache is a last-resort approach; do not use it until you have explored 
other (lockless) options.

> IllegalArgumentException exception occured during MobFileCache eviction
> ---
>
> Key: HBASE-22705
> URL: https://issues.apache.org/jira/browse/HBASE-22705
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.0.5
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 2.3.0
>
> Attachments: HBASE-22705.branch-2.patch
>
>
> IllegalArgumentException occurred during scan operation,
> {noformat}
> 2019-07-08 01:46:57,764 | ERROR | RpcServer.FifoWFPBQ.default.handler=129,queue=9,port=21302 | Unexpected throwable object  | org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2502)
> java.lang.IllegalArgumentException: Comparison method violates its general contract!
>   at java.util.ComparableTimSort.mergeHi(ComparableTimSort.java:866)
>   at java.util.ComparableTimSort.mergeAt(ComparableTimSort.java:483)
>   at java.util.ComparableTimSort.mergeForceCollapse(ComparableTimSort.java:422)
>   at java.util.ComparableTimSort.sort(ComparableTimSort.java:222)
>   at java.util.Arrays.sort(Arrays.java:1312)
>   at java.util.Arrays.sort(Arrays.java:1506)
>   at java.util.ArrayList.sort(ArrayList.java:1462)
>   at java.util.Collections.sort(Collections.java:141)
>   at org.apache.hadoop.hbase.mob.MobFileCache.evict(MobFileCache.java:144)
>   at org.apache.hadoop.hbase.mob.MobFileCache.openFile(MobFileCache.java:214)
>   at org.apache.hadoop.hbase.regionserver.HMobStore.readCell(HMobStore.java:397)
>   at org.apache.hadoop.hbase.regionserver.HMobStore.resolve(HMobStore.java:358)
>   at org.apache.hadoop.hbase.regionserver.MobStoreScanner.next(MobStoreScanner.java:74)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:150)
> {noformat}
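
The trace above shows the sort inside the eviction path rejecting an inconsistent ordering. One common cause of this error is a sort key that changes while the sort is running. Assuming that is what happens here (the issue text does not confirm it), one lockless option is to copy each entry's key once and sort the copies. The sketch below uses hypothetical class and field names; it is not the MobFileCache code and not the patch attached to this issue.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: sort on a one-time snapshot of a concurrently updated key. */
final class SnapshotSortSketch {

    /** Hypothetical cache entry whose access count may change during eviction. */
    static final class Entry {
        final String name;
        final AtomicLong accessCount = new AtomicLong();

        Entry(String name, long initialCount) {
            this.name = name;
            this.accessCount.set(initialCount);
        }
    }

    /** Immutable copy of the sort key, read exactly once per entry. */
    static final class Snapshot {
        final Entry entry;
        final long count;

        Snapshot(Entry entry) {
            this.entry = entry;
            this.count = entry.accessCount.get();
        }
    }

    /** Returns entries from least to most accessed, ordered by the snapshot values. */
    static List<Entry> evictionOrder(List<Entry> entries) {
        List<Snapshot> snapshots = new ArrayList<>(entries.size());
        for (Entry e : entries) {
            snapshots.add(new Snapshot(e));
        }
        // The comparator only reads the frozen copies, so it stays consistent
        // even if other threads keep updating accessCount during the sort.
        snapshots.sort(Comparator.comparingLong((Snapshot s) -> s.count));
        List<Entry> ordered = new ArrayList<>(snapshots.size());
        for (Snapshot s : snapshots) {
            ordered.add(s.entry);
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("a", 5));
        entries.add(new Entry("b", 1));
        entries.add(new Entry("c", 3));
        for (Entry e : evictionOrder(entries)) {
            System.out.println(e.name);
        }
    }
}
```

The resulting order may be slightly stale relative to live counts, which is usually acceptable for choosing eviction candidates.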





[jira] [Updated] (HBASE-22826) Wrong FS: recovered.edits goes to wrong file system

2019-08-09 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22826:
--
Description: 
When WAL is attached to a separate file system, recovered.edits go to the 
HBase root directory.
PROBLEM

* Customer environment
HBase root directory : On WASB
hbase.wal.dir : On HDFS

Customer is creating an HBase table and running VIEW DDL on top of the HBase 
table. The recovered.edits go to the HBase root directory in WASB, and region 
assignments fail.
Customer is on HBase 2.0.4. 


The stack trace below is from a local environment reproduction:


{code:java}
2019-08-05 22:07:31,940 ERROR [RS_OPEN_META-regionserver/c47-node3:16020-0] handler.OpenRegionHandler: Failed open of region=hbase:meta,,1.1588230740
java.lang.IllegalArgumentException: Wrong FS: hdfs://c47-node2.squadron-labs.com:8020/hbasewal/hbase/meta/1588230740/recovered.edits, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:730)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:460)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678)
at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at org.apache.hadoop.hbase.wal.WALSplitter.getSequenceIdFiles(WALSplitter.java:647)
at org.apache.hadoop.hbase.wal.WALSplitter.writeRegionSequenceIdFile(WALSplitter.java:680)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:984)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:881)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7149)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7108)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7080)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7038)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6989)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}



  was:
When WAL is attached to a separate file system, recovered.edits go to the 
HBase root directory.
PROBLEM

* Customer environment
HBase root directory : On WASB
hbase.wal.dir : On HDFS

Customer is creating an HBase table and running VIEW DDL on top of the HBase 
table. The recovered.edits go to the HBase root directory in WASB, and region 
assignments fail.
Customer is on HBase 2.0.4. 


{code:java}
if (RegionReplicaUtil.isDefaultReplica(getRegionInfo())) {
  LOG.debug("writing seq id for {}", this.getRegionInfo().getEncodedName());
  WALSplitter.writeRegionSequenceIdFile(fs.getFileSystem(), getWALRegionDir(), nextSeqId);
  //WALSplitter.writeRegionSequenceIdFile(getWalFileSystem(), getWALRegionDir(), nextSeqId - 1);
{code}


{code:java}2019-08-05 22:07:31,940 ERROR 
[RS_OPEN_META-regionserver/c47-node3:16020-0] handler.OpenRegionHandler: Failed 
open of region=hbase:meta,,1.1588230740
java.lang.IllegalArgumentException: Wrong FS: 
hdfs://c47-node2.squadron-labs.com:8020/hbasewal/hbase/meta/1588230740/recovered.edits,
 expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:730)
at 
org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
at 
org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:460)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at 
org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678)
at 
org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at 
org.apache.hadoop.hbase.wal.WALSplitter.getSequenceIdFiles(WALSplitter.java:647)
at 
org.apache.hadoop.hbase.wal.WALSplitter.writeRegionSequenceIdFile(WALSplitter.java:68

[jira] [Resolved] (HBASE-22826) Wrong FS: recovered.edits goes to wrong file system

2019-08-09 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov resolved HBASE-22826.
---
Resolution: Won't Fix



[jira] [Commented] (HBASE-22826) Wrong FS: recovered.edits goes to wrong file system

2019-08-09 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904122#comment-16904122
 ] 

Vladimir Rodionov commented on HBASE-22826:
---

Basically, 2.0.x does not fully support WAL on a different file system. For 
those who want this feature, it is time to upgrade to 2.1. Won't fix, because 
2.0 is EOL.



[jira] [Created] (HBASE-22826) Wrong FS: recovered.edits goes to wrong file system

2019-08-09 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-22826:
-

 Summary: Wrong FS: recovered.edits goes to wrong file system
 Key: HBASE-22826
 URL: https://issues.apache.org/jira/browse/HBASE-22826
 Project: HBase
  Issue Type: New Feature
Affects Versions: 2.0.5
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


When WAL is attached to a separate file system, recovered.edits go to the 
HBase root directory.
PROBLEM

* Customer environment
HBase root directory : On WASB
hbase.wal.dir : On HDFS

Customer is creating an HBase table and running VIEW DDL on top of the HBase 
table. The recovered.edits go to the HBase root directory in WASB, and region 
assignments fail.
Customer is on HBase 2.0.4. 


{code:java}
if (RegionReplicaUtil.isDefaultReplica(getRegionInfo())) {
  LOG.debug("writing seq id for {}", this.getRegionInfo().getEncodedName());
  WALSplitter.writeRegionSequenceIdFile(fs.getFileSystem(), getWALRegionDir(), nextSeqId);
  //WALSplitter.writeRegionSequenceIdFile(getWalFileSystem(), getWALRegionDir(), nextSeqId - 1);
{code}


{code:java}
2019-08-05 22:07:31,940 ERROR [RS_OPEN_META-regionserver/c47-node3:16020-0] handler.OpenRegionHandler: Failed open of region=hbase:meta,,1.1588230740
java.lang.IllegalArgumentException: Wrong FS: hdfs://c47-node2.squadron-labs.com:8020/hbasewal/hbase/meta/1588230740/recovered.edits, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:730)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:460)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:678)
at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at org.apache.hadoop.hbase.wal.WALSplitter.getSequenceIdFiles(WALSplitter.java:647)
at org.apache.hadoop.hbase.wal.WALSplitter.writeRegionSequenceIdFile(WALSplitter.java:680)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:984)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:881)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7149)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7108)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7080)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7038)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6989)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}
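
To make the "Wrong FS" failure above easier to follow, here is a minimal, self-contained Java sketch of the kind of check that rejects a path belonging to a different file system. It is a simplified stand-in written with java.net.URI, not Hadoop's implementation; the class name, method name, and the host name wal-host are hypothetical.

```java
import java.net.URI;
import java.util.Objects;

/** Illustrative only: a simplified file-system/path consistency check. */
final class WrongFsSketch {

    /**
     * Throws if the path carries a scheme or authority different from the
     * file system's own URI. Paths without a scheme are accepted as relative
     * to this file system.
     */
    static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return;
        }
        boolean sameScheme = scheme.equalsIgnoreCase(fsUri.getScheme());
        boolean sameAuthority = Objects.equals(path.getAuthority(), fsUri.getAuthority());
        if (!sameScheme || !sameAuthority) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI rootFs = URI.create("file:///");            // file system of the root directory
        URI walFs = URI.create("hdfs://wal-host:8020/"); // file system of the WAL directory
        URI walPath = URI.create("hdfs://wal-host:8020/hbasewal/recovered.edits");

        checkPath(walFs, walPath); // matching file system: passes
        try {
            checkPath(rootFs, walPath); // WAL path handed to the root file system
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the reported trace the WAL region directory on HDFS is listed through a local (file:///) file system, which corresponds to the second call in main.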







[jira] [Updated] (HBASE-22749) Distributed MOB compactions

2019-08-07 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Summary: Distributed MOB compactions   (was: HBase MOB 2.0)



[jira] [Commented] (HBASE-22749) HBase MOB 2.0

2019-08-07 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902451#comment-16902451
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Np, I will change the title :)



[jira] [Updated] (HBASE-22749) HBase MOB 2.0

2019-07-30 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.1.pdf)



[jira] [Updated] (HBASE-22749) HBase MOB 2.0

2019-07-30 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v2.1.pdf



[jira] [Updated] (HBASE-22749) HBase MOB 2.0

2019-07-30 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v2.1.pdf

> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.pdf





[jira] [Updated] (HBASE-22749) HBase MOB 2.0

2019-07-30 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: (was: HBase-MOB-2.0-v2.1.pdf)

> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.pdf





[jira] [Commented] (HBASE-22749) HBase MOB 2.0

2019-07-29 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895629#comment-16895629
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

Design doc v2.1 adds clarification on *CompactType.MOB* support.

> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.pdf





[jira] [Updated] (HBASE-22749) HBase MOB 2.0

2019-07-29 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v2.1.pdf

> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.1.pdf, 
> HBase-MOB-2.0-v2.pdf





[jira] [Comment Edited] (HBASE-22749) HBase MOB 2.0

2019-07-29 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895593#comment-16895593
 ] 

Vladimir Rodionov edited comment on HBASE-22749 at 7/29/19 9:28 PM:


{quote}
Why is it called MOB 2.0? Seems to be just a change in compaction.
{quote}

Compaction changes are just the first step. Yes, we are collecting feedback 
from the community on features to add or change in MOB, such as the already 
mentioned streaming access to MOB data.

{quote}
There is no more special compactor for MOB files,  but the class that is doing 
the compaction is named DefaultMobStoreCompactor; i.e. a compactor that is 
'default' but for 'MOB'?
{quote}

DefaultMobStoreCompactor does not do MOB compactions in the original MOB; 
PartitionedMobCompactor does, and it is gone now, along with the whole 
mob.compactions sub-package.
{quote}
On #3, to compact MOB, need to submit a major_compaction request. Does that 
mean we major compact all in the target table – MOB and other files? Can I do 
one or the other (MOB or HFiles). 
{quote}

We are still considering support for *CompactType.MOB*. If there is a request 
to support it, we will add it. In that case, to start a MOB compaction, the 
user must submit a *major_compact* request with type=CompactType.MOB.

Upd.

Having thought about this: no, it is not possible to major compact only MOB 
files. A CompactType.MOB request will have to compact both MOB and regular 
store files, and CompactType.NORMAL will compact only regular store files. 
This change can be added.
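
The rule proposed in the update can be sketched as a small lookup. This is a 
self-contained model for illustration only, not HBase code: the enum 
CompactType reuses the NORMAL and MOB names from the comment, while FileKind 
and filesToCompact are hypothetical names introduced here.

```java
import java.util.EnumSet;
import java.util.Set;

class CompactTypeSketch {
    /** Request types named in the comment; modeled locally, not the HBase enum. */
    enum CompactType { NORMAL, MOB }

    /** Hypothetical labels for the two kinds of files in a MOB-enabled store. */
    enum FileKind { STORE_FILE, MOB_FILE }

    /** File kinds a major compaction request would rewrite under the proposed rule. */
    static Set<FileKind> filesToCompact(CompactType type) {
        switch (type) {
            case MOB:
                // A MOB request cannot skip regular store files, so it covers both.
                return EnumSet.of(FileKind.STORE_FILE, FileKind.MOB_FILE);
            case NORMAL:
            default:
                // A NORMAL request touches regular store files only.
                return EnumSet.of(FileKind.STORE_FILE);
        }
    }

    public static void main(String[] args) {
        for (CompactType t : CompactType.values()) {
            System.out.println(t + " -> " + filesToCompact(t));
        }
    }
}
```

Under this model there is no request type that selects MOB files alone, which 
matches the "not possible" conclusion above.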

{quote}
After finishing 'Unified Compactor' section, how does this differ from what was 
there before? Why superior?
{quote}

Code reduction and unification are an advantage as well. But the overall 
"superiority" comes from the MOB 2.0 feature as a whole, not from the unified 
compactor alone. We describe the advantages in the design document.



> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.pdf

[jira] [Commented] (HBASE-22749) HBase MOB 2.0

2019-07-29 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895593#comment-16895593
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

{quote}
Why is it called MOB 2.0? Seems to be just a change in compaction.
{quote}

Compaction changes are just the first step. Yes, we are collecting feedback 
from the community on features to add or change in MOB, such as the already 
mentioned streaming access to MOB data.

{quote}
There is no more special compactor for MOB files,  but the class that is doing 
the compaction is named DefaultMobStoreCompactor; i.e. a compactor that is 
'default' but for 'MOB'?
{quote}

DefaultMobStoreCompactor does not do MOB compactions in the original MOB; 
PartitionedMobCompactor does, and it is gone now, along with the whole 
mob.compactions sub-package.
{quote}
On #3, to compact MOB, need to submit a major_compaction request. Does that 
mean we major compact all in the target table – MOB and other files? Can I do 
one or the other (MOB or HFiles). 
{quote}

We are still considering support for *CompactType.MOB*. If there is a request 
to support it, we will add it. In that case, to start a MOB compaction, the 
user must submit a *major_compact* request with type=CompactType.MOB.

{quote}
After finishing 'Unified Compactor' section, how does this differ from what was 
there before? Why superior?
{quote}

Code reduction and unification are an advantage as well. But the overall 
"superiority" comes from the MOB 2.0 feature as a whole, not from the unified 
compactor alone. We describe the advantages in the design document.

> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.pdf





[jira] [Updated] (HBASE-22749) HBase MOB 2.0

2019-07-29 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v2.pdf

> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf, HBase-MOB-2.0-v2.pdf





[jira] [Commented] (HBASE-22749) HBase MOB 2.0

2019-07-26 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894234#comment-16894234
 ] 

Vladimir Rodionov commented on HBASE-22749:
---

The patch for the master will follow around mid-August (when I come back 
from vacation).

> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf





[jira] [Updated] (HBASE-22749) HBase MOB 2.0

2019-07-26 Thread Vladimir Rodionov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-22749:
--
Attachment: HBase-MOB-2.0-v1.pdf

> HBase MOB 2.0
> -
>
> Key: HBASE-22749
> URL: https://issues.apache.org/jira/browse/HBASE-22749
> Project: HBase
>  Issue Type: New Feature
>  Components: mob
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBase-MOB-2.0-v1.pdf





[jira] [Created] (HBASE-22749) HBase MOB 2.0

2019-07-26 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-22749:
-

 Summary: HBase MOB 2.0
 Key: HBASE-22749
 URL: https://issues.apache.org/jira/browse/HBASE-22749
 Project: HBase
  Issue Type: New Feature
  Components: mob
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov


There are several drawbacks in the original MOB 1.0 (Moderate Object Storage) 
implementation, which can limit the adoption of the MOB feature:

# MOB compactions are executed in the Master as a chore, which limits 
scalability because all I/O goes through a single HBase Master server.
# The YARN/MapReduce framework is required to run MOB compactions in a 
scalable way, but this won’t work in a stand-alone HBase cluster.
# Two separate compactors, one for MOB and one for regular store files, and 
their interactions can result in data loss (see HBASE-22075).

The design goal for MOB 2.0 was to provide a 100% MOB 1.0-compatible 
implementation that is free of the above drawbacks and can be used as a 
drop-in replacement in existing MOB deployments. The design goals of MOB 2.0 
are:

# Make MOB compactions scalable without relying on the YARN/MapReduce 
framework.
# Provide a unified compactor for both MOB and regular store files.
# Make it more robust, especially with respect to data loss.
# Simplify and reduce the overall MOB code.
# Provide an implementation 100% compatible with MOB 1.0.
# Require no migration of data between MOB 1.0 and MOB 2.0, just a software 
upgrade.






[jira] [Commented] (HBASE-22705) IllegalArgumentException exception occured during MobFileCache eviction

2019-07-24 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892018#comment-16892018
 ] 

Vladimir Rodionov commented on HBASE-22705:
---

On my list today.

> IllegalArgumentException exception occured during MobFileCache eviction
> ---
>
> Key: HBASE-22705
> URL: https://issues.apache.org/jira/browse/HBASE-22705
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.0.5
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 2.3.0
>
> Attachments: HBASE-22705.branch-2.patch
>
>
> An IllegalArgumentException occurred during a scan operation:
> {noformat}
> 2019-07-08 01:46:57,764 | ERROR | 
> RpcServer.FifoWFPBQ.default.handler=129,queue=9,port=21302 | Unexpected 
> throwable object  | 
> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2502)
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>   at java.util.ComparableTimSort.mergeHi(ComparableTimSort.java:866)
>   at java.util.ComparableTimSort.mergeAt(ComparableTimSort.java:483)
>   at 
> java.util.ComparableTimSort.mergeForceCollapse(ComparableTimSort.java:422)
>   at java.util.ComparableTimSort.sort(ComparableTimSort.java:222)
>   at java.util.Arrays.sort(Arrays.java:1312)
>   at java.util.Arrays.sort(Arrays.java:1506)
>   at java.util.ArrayList.sort(ArrayList.java:1462)
>   at java.util.Collections.sort(Collections.java:141)
>   at org.apache.hadoop.hbase.mob.MobFileCache.evict(MobFileCache.java:144)
>   at 
> org.apache.hadoop.hbase.mob.MobFileCache.openFile(MobFileCache.java:214)
>   at 
> org.apache.hadoop.hbase.regionserver.HMobStore.readCell(HMobStore.java:397)
>   at 
> org.apache.hadoop.hbase.regionserver.HMobStore.resolve(HMobStore.java:358)
>   at 
> org.apache.hadoop.hbase.regionserver.MobStoreScanner.next(MobStoreScanner.java:74)
>   at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:150)
> {noformat}
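
The trace above shows Collections.sort failing inside MobFileCache.evict with 
"Comparison method violates its general contract!". TimSort raises this error 
when the ordering it observes is inconsistent, which can happen when the 
values being compared change while the sort is running. Whether that is the 
cause in this issue is an assumption; the thread does not say. Under that 
assumption, one way to keep such a sort consistent is to copy each counter 
once and sort the copies. A minimal sketch follows; CachedFile, Snapshot, and 
evictionOrder are hypothetical names, not HBase classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

class SnapshotSortSketch {
    /** Hypothetical cached file whose access count may be bumped by other threads. */
    static final class CachedFile {
        final String name;
        final AtomicLong accessCount = new AtomicLong();

        CachedFile(String name, long initialCount) {
            this.name = name;
            this.accessCount.set(initialCount);
        }
    }

    /** Immutable (file, count) pair: the count is read exactly once. */
    static final class Snapshot {
        final CachedFile file;
        final long count;

        Snapshot(CachedFile file) {
            this.file = file;
            this.count = file.accessCount.get();
        }
    }

    /** Orders files from least to most accessed, using counts frozen before sorting. */
    static List<CachedFile> evictionOrder(List<CachedFile> files) {
        List<Snapshot> snapshots = new ArrayList<>();
        for (CachedFile f : files) {
            snapshots.add(new Snapshot(f));
        }
        // The comparator only sees frozen counts, so its answers cannot change mid-sort.
        snapshots.sort(Comparator.comparingLong((Snapshot s) -> s.count));
        List<CachedFile> ordered = new ArrayList<>();
        for (Snapshot s : snapshots) {
            ordered.add(s.file);
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<CachedFile> files = new ArrayList<>();
        files.add(new CachedFile("a", 5));
        files.add(new CachedFile("b", 1));
        files.add(new CachedFile("c", 3));
        for (CachedFile f : evictionOrder(files)) {
            System.out.println(f.name);
        }
    }
}
```

The trade-off is that the resulting order reflects the counts at snapshot 
time, which may be slightly stale by the time eviction acts on it.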




