[ 
https://issues.apache.org/jira/browse/HBASE-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231298#comment-15231298
 ] 

Hudson commented on HBASE-15400:
--------------------------------

SUCCESS: Integrated in HBase-1.3-IT #600 (See 
[https://builds.apache.org/job/HBase-1.3-IT/600/])
HBASE-15400 Use DateTieredCompactor for Date Tiered Compaction (Clara (tedyu: 
rev b17350210b924f6adb4d002d41691fb51369a46c)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/StripeCompactionPolicy.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/SortedCompactionPolicy.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DateTieredStoreEngine.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/ExploringCompactionPolicy.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockStoreFile.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DateTieredCompactionRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/DateTieredCompactionPolicy.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/FIFOCompactionPolicy.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/EverythingPolicy.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionPolicy.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDateTieredCompaction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/RatioBasedCompactionPolicy.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactionConfiguration.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestDateTieredCompactionPolicy.java


> Use DateTieredCompactor for Date Tiered Compaction
> --------------------------------------------------
>
>                 Key: HBASE-15400
>                 URL: https://issues.apache.org/jira/browse/HBASE-15400
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>             Fix For: 2.0.0, 1.3.0, 1.4.0
>
>         Attachments: HBASE-15400-15389-v12.patch, HBASE-15400-branch-1.patch, 
> HBASE-15400-v1.pa, HBASE-15400-v3-v3.patch, HBASE-15400-v3-v4.patch, 
> HBASE-15400-v3-v5.patch, HBASE-15400-v3.patch, HBASE-15400-v6.patch, 
> HBASE-15400-v7.patch, HBASE-15400.patch
>
>
> When we compact, we can output multiple files along the current window 
> boundaries. There are two use cases:
> 1. Major compaction: We want to output date tiered store files with data 
> older than max age archived in trunks of the window size on the higher tier. 
> Once a window is old enough, we don't combine the windows to promote to the 
> next tier any further. So files in these windows retain the same timespan as 
> they were minor-compacted last time, which is the window size of the highest 
> tier. Major compaction will touch these files and we want to maintain the 
> same layout. This way, TTL and archiving will be simpler and more efficient.
> 2. Bulk load files and the old file generated by major compaction before 
> upgrading to DTCP.
> Pros: 
> 1. Restore locality, process versioning, updates and deletes while 
> maintaining the tiered layout.
> 2. The best way to fix a skewed layout.
>  
> This work is based on a prototype of DateTieredCompactor from HBASE-15389 and 
> focused on the part to meet needs for these two use cases while supporting 
> others. I have to call out a few design decisions:
> 1. We only want to output the files along all windows for major compaction. 
> And we want to output multiple files older than max age in the trunks of the 
> maximum tier window size determined by base window size, windows per tier and 
> max age.
> 2. For minor compaction, we don't want to output too many files, which will 
> remain around because of current restriction of contiguous compaction by seq 
> id. I will only output two files if all the files in the windows are being 
> combined, one for the data within window and the other for the out-of-window 
> tail. If there is any file in the window excluded from compaction, only one 
> file will be output from compaction. When the windows are promoted, the 
> situation of out of order data will gradually improve. For the incoming 
> window, we need to accommodate the case with user-specified future data.
> 3. We have to pass the boundaries with the list of store file as a complete 
> time snapshot instead of two separate calls because window layout is 
> determined by the time the computation is called. So we will need new type of 
> compaction request. 
> 4. Since we will assign the same seq id for all output files, we need to sort 
> by maxTimestamp subsequently. Right now all compaction policy gets the files 
> sorted for StoreFileManager which sorts by seq id and other criteria. I will 
> use this order for DTCP only, to avoid impacting other compaction policies. 
> 5. We need some cleanup of current design of StoreEngine and CompactionPolicy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to