[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-21 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201-addendum_1.patch

Found a typo while debugging HBASE-12405.
The fix is a simple put -> putIfAbsent change, so I am adding it here as an addendum patch.

It needs to be committed to master and branch-1. Thanks~
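
For readers unfamiliar with the distinction, here is a minimal sketch of why a put -> putIfAbsent change can matter. The addendum patch itself is not shown in this thread, so the map, the "cf1" key and the method names below are hypothetical; the sketch only assumes a map that is meant to remember the first (oldest) value recorded for a key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PutIfAbsentSketch {
    // Hypothetical bookkeeping: column family name -> oldest sequence id seen.
    // Both methods assume at least one sequence id is passed in.

    /** Records every id with put(): each call overwrites the stored value. */
    static long oldestWithPut(long... seqIds) {
        ConcurrentMap<String, Long> oldest = new ConcurrentHashMap<>();
        for (long seqId : seqIds) {
            oldest.put("cf1", seqId);
        }
        return oldest.get("cf1");
    }

    /** Records every id with putIfAbsent(): only the first value is kept. */
    static long oldestWithPutIfAbsent(long... seqIds) {
        ConcurrentMap<String, Long> oldest = new ConcurrentHashMap<>();
        for (long seqId : seqIds) {
            oldest.putIfAbsent("cf1", seqId);
        }
        return oldest.get("cf1");
    }

    public static void main(String[] args) {
        // put() ends up holding the newest id, putIfAbsent() the oldest one.
        System.out.println("put:         " + oldestWithPut(5, 6, 7));
        System.out.println("putIfAbsent: " + oldestWithPutIfAbsent(5, 6, 7));
    }
}
```

With ids 5, 6, 7 the put() variant ends at 7 and the putIfAbsent() variant stays at 5, which is the difference a one-word typo of this kind would introduce.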

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10201-addendum.txt, 3149-trunk-v1.txt, 
> HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, 
> HBASE-10201-0.99.patch, HBASE-10201-addendum_1.patch, HBASE-10201.patch, 
> HBASE-10201_1.patch, HBASE-10201_10.patch, HBASE-10201_11.patch, 
> HBASE-10201_12.patch, HBASE-10201_13.patch, HBASE-10201_13.patch, 
> HBASE-10201_14.patch, HBASE-10201_15.patch, HBASE-10201_16.patch, 
> HBASE-10201_17.patch, HBASE-10201_18.patch, HBASE-10201_19.patch, 
> HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
> HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
> HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
> memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-18 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Fix Version/s: 1.1.0

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 2.0.0, 1.1.0
>
> Attachments: 10201-addendum.txt, 3149-trunk-v1.txt, 
> HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, 
> HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, 
> HBASE-10201_10.patch, HBASE-10201_11.patch, HBASE-10201_12.patch, 
> HBASE-10201_13.patch, HBASE-10201_13.patch, HBASE-10201_14.patch, 
> HBASE-10201_15.patch, HBASE-10201_16.patch, HBASE-10201_17.patch, 
> HBASE-10201_18.patch, HBASE-10201_19.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-16 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10201:
---
Attachment: 10201-addendum.txt

From https://builds.apache.org/job/HBase-TRUNK/5931/testReport/org.apache.hadoop.hbase.regionserver/TestPerColumnFamilyFlush/testLogReplayWithDistributedReplay/ :
{code}
java.lang.IllegalStateException: A mini-cluster is already running
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:865)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:799)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:770)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:757)
    at org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush.testLogReplay(TestPerColumnFamilyFlush.java:349)
    at org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush.testLogReplayWithDistributedReplay(TestPerColumnFamilyFlush.java:430)
{code}
The attached addendum changes TestPerColumnFamilyFlush to a large test so that 
the collision shown above does not occur in the Jenkins build.
TestPerColumnFamilyFlush passes with the addendum.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 2.0.0
>
> Attachments: 10201-addendum.txt, 3149-trunk-v1.txt, 
> HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, 
> HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch, 
> HBASE-10201_10.patch, HBASE-10201_11.patch, HBASE-10201_12.patch, 
> HBASE-10201_13.patch, HBASE-10201_13.patch, HBASE-10201_14.patch, 
> HBASE-10201_15.patch, HBASE-10201_16.patch, HBASE-10201_17.patch, 
> HBASE-10201_18.patch, HBASE-10201_19.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-16 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
   Resolution: Fixed
Fix Version/s: (was: 1.0.0)
 Release Note: Adds a new flush policy mechanism. The default policy, 
org.apache.hadoop.hbase.regionserver.FlushLargeStoresPolicy, tries to avoid 
flushing the small column families in a region, that is, those whose memstores 
are smaller than hbase.hregion.percolumnfamilyflush.size.lower.bound. To restore 
the old behavior, in which a flush writes out all column families, set 
hbase.regionserver.flush.policy to 
org.apache.hadoop.hbase.regionserver.FlushAllStoresPolicy, either in 
hbase-default.xml or on a per-table basis by setting the policy with 
HTableDescriptor.setFlushPolicyClassName().
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

Pushed to master branch
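
As a rough illustration of the two policies named in the release note, the sketch below selects which column families to flush from a map of memstore sizes. It is not the HBase implementation: the class and method names are hypothetical, the sizes are made up, and the fallback to flushing everything when no family reaches the lower bound is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlushPolicySketch {
    /** Old behavior: every column family in the region is flushed together. */
    static List<String> selectAllStores(Map<String, Long> memstoreSizes) {
        return new ArrayList<>(memstoreSizes.keySet());
    }

    /**
     * Per-family behavior: skip families whose memstore is below lowerBound.
     * Assumption: if no family qualifies, fall back to flushing all of them
     * so that the flush still frees memory.
     */
    static List<String> selectLargeStores(Map<String, Long> memstoreSizes, long lowerBound) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> e : memstoreSizes.entrySet()) {
            if (e.getValue() >= lowerBound) {
                selected.add(e.getKey());
            }
        }
        return selected.isEmpty() ? selectAllStores(memstoreSizes) : selected;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("big", 48L << 20);    // 48 MB memstore
        sizes.put("small", 1L << 20);   // 1 MB memstore
        long lowerBound = 16L << 20;    // hypothetical 16 MB lower bound

        System.out.println("flush all:   " + selectAllStores(sizes));
        System.out.println("flush large: " + selectLargeStores(sizes, lowerBound));
    }
}
```

In this made-up case the per-family selection flushes only "big", which is how the small family avoids being written out as many tiny files.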

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
> HBASE-10201_19.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
> HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
> HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch, 
> compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-12 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_19.patch

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 1.0.0, 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
> HBASE-10201_19.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
> HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
> HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch, 
> compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-12 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: (was: HBASE-10201_19.patch)

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 1.0.0, 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
> HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
> HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
> HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
> memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-12 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_19.patch

Addressed the issue on rb

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 1.0.0, 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
> HBASE-10201_19.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
> HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
> HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch, 
> compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10201:
---
Status: Patch Available  (was: Open)

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 1.0.0, 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
> HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
> HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
> HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
> memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-10 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_18.patch

Renamed lastFlushSeqId to maxFlushedSeqId in HRegion. A flushSeqId is now always 
generated by incrementing sequenceId, so maxFlushedSeqId is no longer the same 
as flushSeqId.
Renamed FlushLargeStorePolicy to FlushLargeAndOldStorePolicy, which also flushes 
any store that makes HRegion.shouldFlush return true, and changed the 
forceFlushAllStores flag to false in PeriodicMemstoreFlusher.
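
One way to see why the two ids can differ once only some stores are flushed is sketched below. This illustrates the reasoning only and is not the code in the patch: the class, the method and the per-family map of oldest unflushed sequence ids are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class FlushSeqIdSketch {
    /**
     * @param oldestUnflushedSeqIds hypothetical map: family -> sequence id of
     *                              its oldest edit that is still only in memory
     * @param flushedFamilies       families written out by this flush
     * @param flushSeqId            sequence id assigned to this flush
     * @return the highest sequence id below which every edit is on disk
     */
    static long maxFlushedSeqId(Map<String, Long> oldestUnflushedSeqIds,
                                Set<String> flushedFamilies,
                                long flushSeqId) {
        long result = flushSeqId;
        for (Map.Entry<String, Long> e : oldestUnflushedSeqIds.entrySet()) {
            if (!flushedFamilies.contains(e.getKey())) {
                // This family was skipped, so its oldest edit is still in
                // memory and nothing at or above that id counts as flushed.
                result = Math.min(result, e.getValue() - 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> oldest = new HashMap<>();
        oldest.put("cf1", 10L);
        oldest.put("cf2", 25L);
        Set<String> onlyCf1 = new HashSet<>(Arrays.asList("cf1"));

        // Flushing only cf1 with flushSeqId 40 leaves cf2's edit 25 in memory.
        System.out.println(maxFlushedSeqId(oldest, onlyCf1, 40L));
        // Flushing every family lets the two ids coincide again.
        System.out.println(maxFlushedSeqId(oldest, oldest.keySet(), 40L));
    }
}
```

Under these assumptions a selective flush reports 24 rather than 40, while a full flush reports 40.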

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 1.0.0, 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
> HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
> HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
> HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
> memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-09 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Priority: Major  (was: Critical)

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 1.0.0, 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-09 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Fix Version/s: (was: 0.98.10)

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 1.0.0, 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-09 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Priority: Critical  (was: Major)

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-09 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10201:
---
Priority: Major  (was: Critical)

I also don't think this is a Critical priority issue, since the behavior 
discussed is long standing and what the changes should look like is not settled. 
Changing priority, but feel free to change back if you disagree.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
> Fix For: 1.0.0, 2.0.0, 0.98.10
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-09 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10201:
---
Fix Version/s: (was: 0.98.9)
   0.98.10
   Status: Open  (was: Patch Available)

Canceling stale patch. Moving out of 0.98.9.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.10
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-07 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_17.patch

line length

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-07 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_16.patch

trivial change to remove javadoc warnings

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_16.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
> HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
> HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch, 
> compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-07 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_15.patch

fix a javadoc issue.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
> HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
> HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
> HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
> memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-07 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_14.patch

It seems the previous Hadoop QA run exited abnormally.

Rebase and retry.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-07 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: (was: HBASE-10201_14.patch)

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
> HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
> HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch, 
> compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-05 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_14.patch

Addressed Ted Yu's suggestion on rb.
Fixed a typo and added FlushPolicyFactory.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-05 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Attachment: compactions.png
io.png
count.png
memstore.png

Ran some loadings on a small cluster with one regionserver hosting one region. 
I used the test packaged in this patch, modified so that it could run ten 
clients in parallel rather than a single client.  The included test has a 
table schema of three column families and fills them unevenly, so it is 
'ideal' for demonstrating the benefit.  I ran twice with the patch turned off 
and then twice with it turned on.  Flush size was set to 64M.

I see fewer compactions and fewer hfiles (so less i/o), and memstores carrying 
more (it's hard to see, but you should be able to make out that memstore sizes 
do not go to zero or near zero when the patch is enabled).

Looks good.  Let me review again to recheck sequenceid accounting and run some 
MTTR tests.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
> HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
> HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch, 
> compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-04 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Attachment: HBASE-10201_13.patch

Retry. Yes, hadoopqa was broken today. I think it is back now. Let's see by 
retrying.  Still working on getting a few numbers to show with and without the 
patch. Sorry it is taking so long.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_13.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
> HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
> HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-04 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_13.patch

Use a FlushPolicy instead of the 
hbase.hregion.memstore.percolumnfamilyflush.enabled config.

I'm not good at naming things, and I may have broken some rules when adding the 
policy. So please point out if you have a better name for the policy, or 
anything that you think is wrong in this patch. I will fix it as soon as 
possible.

Thanks.
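
For readers without the patch at hand, the idea of replacing a boolean config 
with a pluggable policy can be sketched roughly as below. This is only an 
illustration under stated assumptions: the names StoreFlushPolicy, 
FlushAllPolicy and FlushLargePolicy, the map of per-family memstore sizes, and 
the "flush everything if no family is large enough" fallback are all 
hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the policy abstraction; not the interface from the patch.
interface StoreFlushPolicy {
    /** Returns the names of the column families whose stores should be flushed. */
    List<String> selectStoresToFlush(Map<String, Long> memstoreBytesByFamily);
}

// Baseline behaviour: flush every store of the region.
final class FlushAllPolicy implements StoreFlushPolicy {
    @Override
    public List<String> selectStoresToFlush(Map<String, Long> memstoreBytesByFamily) {
        return new ArrayList<>(memstoreBytesByFamily.keySet());
    }
}

// Per-family behaviour: flush only the families at or above a threshold.
final class FlushLargePolicy implements StoreFlushPolicy {
    private final long perFamilyThresholdBytes;

    FlushLargePolicy(long perFamilyThresholdBytes) {
        this.perFamilyThresholdBytes = perFamilyThresholdBytes;
    }

    @Override
    public List<String> selectStoresToFlush(Map<String, Long> memstoreBytesByFamily) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> entry : memstoreBytesByFamily.entrySet()) {
            if (entry.getValue() >= perFamilyThresholdBytes) {
                selected.add(entry.getKey());
            }
        }
        // Assumed fallback: if no family is large enough, flush everything.
        return selected.isEmpty()
            ? new ArrayList<>(memstoreBytesByFamily.keySet())
            : selected;
    }
}

final class FlushPolicyDemo {
    public static void main(String[] args) {
        // Illustrative sizes only: 1M, 4M and 64M of memstore data.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("cf1", 1L << 20);
        sizes.put("cf2", 4L << 20);
        sizes.put("cf3", 64L << 20);
        StoreFlushPolicy policy = new FlushLargePolicy(16L << 20);
        System.out.println(policy.selectStoresToFlush(sizes));
    }
}
```

With a policy object, switching between "flush all stores" and "flush only the 
large ones" becomes a matter of choosing an implementation rather than 
checking a flag at each call site.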

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
> HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
> HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
> HBASE-10201_8.patch, HBASE-10201_9.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-02 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-10201:
--
Fix Version/s: (was: 0.99.2)
   1.0.0

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.9
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-01 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_12.patch

Addressed [~tedyu]'s comments on ReviewBoard.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-11-27 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_11.patch

Rebased, with some simple changes.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_11.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
> HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
> HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-11-21 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_10.patch

Addressed [~busbey]'s comment on ReviewBoard.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
> HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
> HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
> HBASE-10201_8.patch, HBASE-10201_9.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-11-20 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_9.patch

Rebased for the WAL refactoring.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
> HBASE-10201_9.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-11-17 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_8.patch

The pressure on log truncation is already handled by LogRoller, and we already 
have a test case for it in TestPerColumnFamilyFlush, sorry...

Also, I see that the shouldFlush method in HRegion only checks flushSeqId and 
lastFlushTime. When it returns true, some data has remained in the memstore 
for a long time, so I changed PeriodicMemstoreFlusher to always flush all 
stores instead of doing a selective flush.

Added a comment to testCompareStoreFileCount.

ReviewBoard: https://reviews.apache.org/r/28151/diff/#

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-11-17 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_7.patch

It seems HBASE-12405 has to be evaluated together with HBASE-10201, so I 
combined the changes and posted them here. Maybe HBASE-12405 can be marked as 
a duplicate of HBASE-10201?

Some comments:
I did not change the way flushedSequenceIdByRegion is stored in ServerManager. 
To do that we would need to change the protobuf definition (change 
CompleteSequenceId in RegionLoad from a long to a kv list), which may break 
wire compatibility; at the least, a rolling upgrade or downgrade would be 
difficult.

Also, I did not change the RecoveringRegionLastFlushedSequenceId stored on zk 
when doing distributed log replay, for the same reason.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch, HBASE-10201_7.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-11-11 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_6.patch

I plan to start working on HBASE-12405, so I rebased this patch onto the 
latest master code.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
> HBASE-10201_6.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10201:
---
Fix Version/s: 0.98.9

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-30 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_5.patch

Rebased since master's HEAD has moved.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-30 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_4.patch

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch, HBASE-10201_4.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-28 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201-0.99.patch

Backport to branch-1.

[~enis]

I tried but failed to make dev-support/test-patch.sh work properly...

So I only ran the unit tests, sorry...

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
> HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
> HBASE-10201_3.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-28 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_3.patch

Wrap lines.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch, 
> HBASE-10201_1.patch, HBASE-10201_2.patch, HBASE-10201_3.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-28 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_2.patch

Fixed some typos.

Renamed some methods and fields.

Used AtomicUtils.updateMin.

Changed DEFAULT_HREGION_MEMSTORE_PER_COLUMN_FAMILY_FLUSH to true so that the 
option is enabled by default.
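
The body of AtomicUtils.updateMin is not shown in this thread. As a rough 
illustration of what an atomic "update the minimum" helper usually looks like, 
here is a sketch built on a compare-and-set loop over 
java.util.concurrent.atomic.AtomicLong. The class name MinTracker is 
hypothetical, and this is not claimed to be the HBase implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, written for illustration only.
final class MinTracker {
    /** Lowers {@code min} to {@code candidate} if the candidate is smaller. */
    static void updateMin(AtomicLong min, long candidate) {
        while (true) {
            long current = min.get();
            if (candidate >= current) {
                return; // the stored value is already at least as small
            }
            if (min.compareAndSet(current, candidate)) {
                return; // we won the race and installed the new minimum
            }
            // Another thread changed the value in between; re-read and retry.
        }
    }

    public static void main(String[] args) {
        AtomicLong oldestSeqId = new AtomicLong(Long.MAX_VALUE);
        updateMin(oldestSeqId, 42L);
        updateMin(oldestSeqId, 100L);
        updateMin(oldestSeqId, 7L);
        System.out.println(oldestSeqId.get());
    }
}
```

The retry loop lets concurrent writers record the smallest sequence id they 
have seen without taking a lock.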



> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch, 
> HBASE-10201_1.patch, HBASE-10201_2.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Assignee: zhangduo

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Assignee: zhangduo
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch, 
> HBASE-10201_1.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Priority: Critical  (was: Major)

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch, 
> HBASE-10201_1.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Component/s: wal

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch, 
> HBASE-10201_1.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-27 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10201:
--
Fix Version/s: 0.99.2
   2.0.0

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Ted Yu
>Priority: Critical
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch, 
> HBASE-10201_1.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-27 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201_1.patch

Modified according to [~anoop.hbase]'s suggestion.

Added a lowestPossibleSeqId check to avoid waiting for the sequence id to be 
set every time.

Moved oldestSeqId to HRegion, without modifying Store and MemStore. 
oldestSeqId is only recorded when perColumnFamilyFlushEnabled is true.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch, 
> HBASE-10201_1.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-27 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10201:
---
Status: Patch Available  (was: Open)

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-27 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201.patch

Sorry, I was wrong: getSequenceId in HLogKey blocks until logSeqNum is set, so 
I can get the sequence id before syncing the WAL. But I do not want to change 
the order of operations in a mutation, so I added a method to record 
oldestSequenceId, which is called after the WAL appendNoSync and before 
releasing updateLock.

The patch passed all unit tests on my machine.
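
The "blocks until the sequence number is set" behaviour described above can be 
pictured with a latch. The following is a minimal sketch, assuming a 
one-shot assignment guarded by java.util.concurrent.CountDownLatch; the class 
PendingSequenceId and its methods are hypothetical and are not copied from 
HLogKey.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration of a value that readers must wait for.
final class PendingSequenceId {
    private final CountDownLatch assigned = new CountDownLatch(1);
    private volatile long sequenceId = -1L;

    /** Called once by the writer when the sequence id becomes known. */
    void assign(long id) {
        sequenceId = id;
        assigned.countDown(); // releases every thread waiting in get()
    }

    /** Blocks the caller until assign() has been called. */
    long get() {
        try {
            assigned.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sequence id", e);
        }
        return sequenceId;
    }

    public static void main(String[] args) {
        PendingSequenceId key = new PendingSequenceId();
        // Simulates the WAL side assigning the id on another thread.
        new Thread(() -> key.assign(42L)).start();
        System.out.println("sequence id = " + key.get());
    }
}
```

Because get() only returns after assign(), a caller can ask for the id right 
after appending, without first waiting for the sync to complete.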

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-14 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201-0.98_2.patch

Benchmark run with TestPerColumnFamilyFlush.

Setup: 3 CFs with 16B values for CF1, 256B values for CF2 and 4K values for 
CF3; 1M rows; 128M memstore flush size; 16M CF flush size.

Result without per CF flush:
NumStoreFiles: 7, StoreFileSize: 4336644762, NumCompactionsCompleted: 46, 
NumFilesCompacted: 146, NumBytesCompacted: 11132103132
Write amplification: 2.57

Result with per CF flush:
NumStoreFiles: 10, StoreFileSize: 4482510274, NumCompactionsCompleted: 27, 
NumFilesCompacted: 89, NumBytesCompacted: 10353603767
Write amplification: 2.31
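
The write amplification figures above are not defined in the comment, but both 
are consistent with NumBytesCompacted divided by StoreFileSize. A small sketch 
under that assumption (the class and method names are made up for 
illustration):

```java
import java.util.Locale;

/** Illustration only: assumes write amplification = bytes compacted / store file bytes. */
final class WriteAmplification {
  static double ratio(long numBytesCompacted, long storeFileSize) {
    // Cast first so the division is done in floating point, not integer math.
    return (double) numBytesCompacted / storeFileSize;
  }

  public static void main(String[] args) {
    // Figures copied from the two benchmark results reported above.
    System.out.println(String.format(Locale.ROOT,
        "without per CF flush: %.2f", ratio(11132103132L, 4336644762L)));
    System.out.println(String.format(Locale.ROOT,
        "with per CF flush:    %.2f", ratio(10353603767L, 4482510274L)));
  }
}
```

Under this reading, the two ratios round to the reported 2.57 and 2.31.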

Next I will run this benchmark on a real cluster instead of a minicluster.



> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-13 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201-0.98_1.patch

Following Ted Yu's suggestion, I added a test case called 
testCompareStoreFileCount to TestPerColumnFamilyFlush to confirm that this 
patch really reduces the number of store files.

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
> HBASE-10201-0.98_1.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-10-12 Thread zhangduo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangduo updated HBASE-10201:
-
Attachment: HBASE-10201-0.98.patch

I ported the 3149-trunk-v1.txt patch to branch 0.98 (a "just make it work" 
version, not the final version). 

Porting to master is more difficult because of the rewrite of HLog. 
Flushing per CF means we need to record the oldest sequence id per store 
instead of per region, so the patch adds a seqNum parameter when adding a kv 
to a store, which means we need to know the seqNum before we add the kv to the 
store.
This is easy on branch 0.98: we just need to change the order of the WAL 
appendNoSync and the write back to the memstore (am I right?). But on master, 
HLog seems to use an event-driven framework, and I am not sure when the seqNum 
is determined.

The second problem is the flushSeqId. On 0.98 it is just a simple incAndGet, 
but on master it uses a method in HLog. So on 0.98, if we only flush some of 
the stores, we can set the flushSeqId to the oldest seqNum held by the stores 
that are not being flushed, and not increment sequenceId. But on master, I do 
not know the side effects of the method. Is it OK to remove the method call, 
or do we still need to log something?
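
To make the second point concrete, here is a rough sketch of picking the 
flushSeqId for a partial flush as described above: the oldest seqNum among the 
stores that are not being flushed. The names are hypothetical, this is not 
code from either branch, and the exact value used (including any off-by-one 
adjustment) is an assumption.

```java
import java.util.Map;
import java.util.Set;

/** Illustration only; not taken from the 0.98 or master code. */
final class PartialFlushSeqId {
  /** Returned when no store with recorded edits is left unflushed. */
  static final long NO_UNFLUSHED_EDITS = -1L;

  /**
   * @param oldestSeqIdByStore oldest unflushed sequence id per store
   * @param storesToFlush names of the stores selected for this flush
   * @return the oldest sequence id among stores that are NOT being flushed,
   *         or NO_UNFLUSHED_EDITS if every store with edits is being flushed
   */
  static long oldestSeqIdOfUnflushedStores(
      Map<String, Long> oldestSeqIdByStore, Set<String> storesToFlush) {
    long oldest = Long.MAX_VALUE;
    for (Map.Entry<String, Long> e : oldestSeqIdByStore.entrySet()) {
      if (!storesToFlush.contains(e.getKey())) {
        oldest = Math.min(oldest, e.getValue());
      }
    }
    return oldest == Long.MAX_VALUE ? NO_UNFLUSHED_EDITS : oldest;
  }
}
```

When every store is flushed, the caller would fall back to the normal path 
(incrementing the region sequence id on 0.98), which the sketch signals with 
NO_UNFLUSHED_EDITS.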

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.





[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2013-12-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10201:
---

Attachment: 3149-trunk-v1.txt

Work in progress

> Port 'Make flush decisions per column family' to trunk
> --
>
> Key: HBASE-10201
> URL: https://issues.apache.org/jira/browse/HBASE-10201
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
> Attachments: 3149-trunk-v1.txt
>
>
> Currently the flush decision is made using the aggregate size of all column 
> families. When large and small column families co-exist, this causes many 
> small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)