[jira] [Commented] (HBASE-10378) Divide HLog interface into User and Implementor specific interfaces

2014-10-24 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184001#comment-14184001
 ] 

Sean Busbey commented on HBASE-10378:
-

Cancelled the patch while I address stack's last round of comments and make 
the HLogKey change binary compatible.

> Divide HLog interface into User and Implementor specific interfaces
> ---
>
> Key: HBASE-10378
> URL: https://issues.apache.org/jira/browse/HBASE-10378
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Himanshu Vashishtha
>Assignee: Sean Busbey
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 10378-1.patch, 10378-2.patch
>
>
> HBASE-5937 introduces the HLog interface as a first step to support multiple 
> WAL implementations. This interface is a good start, but has some 
> limitations/drawbacks in its current state, such as:
> 1) There is no clear distinction between User and Implementor APIs: it 
> provides APIs both for WAL users (append, sync, etc.) and for WAL 
> implementors (Reader/Writer interfaces, etc.). Some APIs are very much 
> implementation specific (getFileNum, etc.), and a user such as a 
> RegionServer shouldn't need to know about them.
> 2) There are about 14 methods in FSHLog which are not present in HLog 
> interface but are used at several places in the unit test code. These tests 
> typecast HLog to FSHLog, which makes it very difficult to test multiple WAL 
> implementations without doing some ugly checks.
> I'd like to propose some changes in HLog interface that would ease the multi 
> WAL story:
> 1) Have two interfaces WAL and WALService. WAL provides APIs for 
> implementors. WALService provides APIs for users (such as RegionServer).
> 2) A skeleton implementation of the above two interfaces as the base class for 
> other WAL implementations (AbstractWAL). It provides the fields required by all 
> subclasses (fs, conf, log dir, etc.). Define a minimal set of test-only methods 
> and add this set to AbstractWAL.
> 3) HLogFactory returns a WALService reference when creating a WAL instance; 
> if a user needs to access impl-specific APIs (there are unit tests which get 
> the WAL from an HRegionServer and then call impl-specific APIs), cast to 
> AbstractWAL.
> 4) Make TestHLog abstract and let all implementors provide their respective 
> test class which extends TestHLog (TestFSHLog, for example).
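
For illustration only, a minimal sketch of the split described above (the interface, 
method, and field names here are assumptions drawn from the proposal text, not the 
API that was eventually committed):

{code}
// Hypothetical sketch of the proposed split; not the committed HBase API.

// User-facing API: what a RegionServer needs from a write-ahead log.
interface WALService {
  long append(byte[] regionName, byte[] entry) throws java.io.IOException;
  void sync() throws java.io.IOException;
  void close() throws java.io.IOException;
}

// Implementor-facing API: lifecycle and file-level details of a concrete WAL.
interface WAL {
  void rollWriter() throws java.io.IOException;
  long getFilenum();  // implementation detail a RegionServer should not depend on
}

// Skeleton base class holding the fields shared by all implementations
// (fs, conf, log dir, ...), plus the minimal set of test-only methods.
abstract class AbstractWAL implements WALService, WAL {
  protected final org.apache.hadoop.fs.FileSystem fs;
  protected final org.apache.hadoop.fs.Path logDir;
  protected final org.apache.hadoop.conf.Configuration conf;

  protected AbstractWAL(org.apache.hadoop.fs.FileSystem fs,
                        org.apache.hadoop.fs.Path logDir,
                        org.apache.hadoop.conf.Configuration conf) {
    this.fs = fs;
    this.logDir = logDir;
    this.conf = conf;
  }
}
{code}

Unit tests would then be written against WALService (or AbstractWAL where they need 
implementation details), rather than casting HLog to FSHLog.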



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-10378) Divide HLog interface into User and Implementor specific interfaces

2014-10-24 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-10378:

Release Note: 
HBase internals for the write ahead log have been refactored. Advanced users of 
HBase should be aware of the following changes.
  - The command for analyzing write ahead logs has been renamed from 'hlog' to 
'wal'. The old usage is deprecated and will be removed in a future version.
  - Some utility methods in HBaseTestingUtility related to testing 
write-ahead-logs were changed in incompatible ways. No functionality has been 
removed, but method names and arguments have changed. See the javadoc for 
HBaseTestingUtility for details.
  - The labeling of server metrics on the region server status pages changed. 
Previously, the number of backing files for the write ahead log was labeled 
'Num. HLog Files'. If you wish to see this statistic now, please look for the 
label 'Num. WAL Files.'  If you rely on JMX for these metrics, their location 
has not changed.

Adding release notes in progress.

> Divide HLog interface into User and Implementor specific interfaces
> ---
>
> Key: HBASE-10378
> URL: https://issues.apache.org/jira/browse/HBASE-10378
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Himanshu Vashishtha
>Assignee: Sean Busbey
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 10378-1.patch, 10378-2.patch
>
>
> HBASE-5937 introduces the HLog interface as a first step to support multiple 
> WAL implementations. This interface is a good start, but has some 
> limitations/drawbacks in its current state, such as:
> 1) There is no clear distinction between User and Implementor APIs: it 
> provides APIs both for WAL users (append, sync, etc.) and for WAL 
> implementors (Reader/Writer interfaces, etc.). Some APIs are very much 
> implementation specific (getFileNum, etc.), and a user such as a 
> RegionServer shouldn't need to know about them.
> 2) There are about 14 methods in FSHLog which are not present in HLog 
> interface but are used at several places in the unit test code. These tests 
> typecast HLog to FSHLog, which makes it very difficult to test multiple WAL 
> implementations without doing some ugly checks.
> I'd like to propose some changes in HLog interface that would ease the multi 
> WAL story:
> 1) Have two interfaces WAL and WALService. WAL provides APIs for 
> implementors. WALService provides APIs for users (such as RegionServer).
> 2) A skeleton implementation of the above two interfaces as the base class for 
> other WAL implementations (AbstractWAL). It provides the fields required by all 
> subclasses (fs, conf, log dir, etc.). Define a minimal set of test-only methods 
> and add this set to AbstractWAL.
> 3) HLogFactory returns a WALService reference when creating a WAL instance; 
> if a user needs to access impl-specific APIs (there are unit tests which get 
> the WAL from an HRegionServer and then call impl-specific APIs), cast to 
> AbstractWAL.
> 4) Make TestHLog abstract and let all implementors provide their respective 
> test class which extends TestHLog (TestFSHLog, for example).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183980#comment-14183980
 ] 

Hudson commented on HBASE-12142:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #601 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/601/])
HBASE-12142 Truncate command does not preserve ACLs table (Vandana 
Ayyalasomayajula) (apurtell: rev 5b1380c3c691007625d624096f57559124518288)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/ModifyRegionUtils.java
* hbase-protocol/src/main/protobuf/Master.proto
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/MasterProtos.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityController.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/TruncateTableHandler.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* hbase-shell/src/main/ruby/hbase/admin.rb
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java


> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_branch_1.patch, 
> HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-10378) Divide HLog interface into User and Implementor specific interfaces

2014-10-24 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-10378:

Status: Open  (was: Patch Available)

> Divide HLog interface into User and Implementor specific interfaces
> ---
>
> Key: HBASE-10378
> URL: https://issues.apache.org/jira/browse/HBASE-10378
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Himanshu Vashishtha
>Assignee: Sean Busbey
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 10378-1.patch, 10378-2.patch
>
>
> HBASE-5937 introduces the HLog interface as a first step to support multiple 
> WAL implementations. This interface is a good start, but has some 
> limitations/drawbacks in its current state, such as:
> 1) There is no clear distinction between User and Implementor APIs: it 
> provides APIs both for WAL users (append, sync, etc.) and for WAL 
> implementors (Reader/Writer interfaces, etc.). Some APIs are very much 
> implementation specific (getFileNum, etc.), and a user such as a 
> RegionServer shouldn't need to know about them.
> 2) There are about 14 methods in FSHLog which are not present in HLog 
> interface but are used at several places in the unit test code. These tests 
> typecast HLog to FSHLog, which makes it very difficult to test multiple WAL 
> implementations without doing some ugly checks.
> I'd like to propose some changes in HLog interface that would ease the multi 
> WAL story:
> 1) Have two interfaces WAL and WALService. WAL provides APIs for 
> implementors. WALService provides APIs for users (such as RegionServer).
> 2) A skeleton implementation of the above two interfaces as the base class for 
> other WAL implementations (AbstractWAL). It provides the fields required by all 
> subclasses (fs, conf, log dir, etc.). Define a minimal set of test-only methods 
> and add this set to AbstractWAL.
> 3) HLogFactory returns a WALService reference when creating a WAL instance; 
> if a user needs to access impl-specific APIs (there are unit tests which get 
> the WAL from an HRegionServer and then call impl-specific APIs), cast to 
> AbstractWAL.
> 4) Make TestHLog abstract and let all implementors provide their respective 
> test class which extends TestHLog (TestFSHLog, for example).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11912) Catch some bad practices at compile time with error-prone

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183890#comment-14183890
 ] 

Hudson commented on HBASE-11912:


FAILURE: Integrated in HBase-TRUNK #5699 (See 
[https://builds.apache.org/job/HBase-TRUNK/5699/])
HBASE-11912 Catch some bad practices at compile time with error-prone 
(apurtell: rev 7ed0260eff425e7e7a57193a67419fae29aa4f30)
* hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
* hbase-shell/pom.xml
* hbase-prefix-tree/pom.xml
* 
hbase-prefix-tree/src/test/java/org/apache/hadoop/hbase/codec/keyvalue/TestKeyValueTool.java
* hbase-client/pom.xml
* 
hbase-prefix-tree/src/test/java/org/apache/hadoop/hbase/codec/prefixtree/row/TestPrefixTreeSearcher.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTableWrapper.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
* hbase-thrift/pom.xml
* 
hbase-prefix-tree/src/test/java/org/apache/hadoop/hbase/codec/prefixtree/row/TestRowData.java
* pom.xml
* hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/HTablePool.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java
* hbase-hadoop2-compat/pom.xml
* hbase-common/pom.xml
* hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableSnapshotInputFormat.java
* hbase-hadoop-compat/pom.xml
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
* hbase-examples/pom.xml
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
* hbase-it/pom.xml
* hbase-server/pom.xml


> Catch some bad practices at compile time with error-prone
> -
>
> Key: HBASE-11912
> URL: https://issues.apache.org/jira/browse/HBASE-11912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
> Attachments: HBASE-11912.patch, HBASE-11912.patch, HBASE-11912.patch
>
>
> Google's error-prone (https://code.google.com/p/error-prone/) wraps javac 
> with some additional static analysis that will generate additional warnings 
> or errors at compile time if certain bug patterns 
> (https://code.google.com/p/error-prone/wiki/BugPatterns) are detected. What's 
> nice about this approach, as opposed to findbugs, is the compile time 
> detection and erroring out prevent the detected problems from getting into 
> the codebase up front.
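
For context, a hedged example of the kind of bug pattern error-prone reports as a 
compile-time error (SelfAssignment is one of the documented patterns); the class here 
is made up for illustration:

{code}
public class SelfAssignmentExample {
  private int count;

  // error-prone's SelfAssignment check flags the line below at compile time:
  // the parameter is assigned to itself and the field is never updated.
  public void setCount(int count) {
    count = count;  // should be: this.count = count;
  }

  public int getCount() {
    return count;
  }
}
{code}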



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183889#comment-14183889
 ] 

Hudson commented on HBASE-12142:


FAILURE: Integrated in HBase-0.98 #631 (See 
[https://builds.apache.org/job/HBase-0.98/631/])
HBASE-12142 Truncate command does not preserve ACLs table (Vandana 
Ayyalasomayajula) (apurtell: rev 5b1380c3c691007625d624096f57559124518288)
* hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* hbase-shell/src/main/ruby/hbase/admin.rb
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityController.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* 
hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/MasterProtos.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java
* hbase-protocol/src/main/protobuf/Master.proto
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/ModifyRegionUtils.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/TruncateTableHandler.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java


> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_branch_1.patch, 
> HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12334) Handling of DeserializationException causes needless retry on failure

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183885#comment-14183885
 ] 

Hudson commented on HBASE-12334:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #600 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/600/])
HBASE-12334 Handling of DeserializationException causes needless retry on 
failure. (larsh: rev 714e8368c33aab14dcc461180cff14fe65d9cdd3)
* hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java


> Handling of DeserializationException causes needless retry on failure
> -
>
> Key: HBASE-12334
> URL: https://issues.apache.org/jira/browse/HBASE-12334
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.7
>Reporter: James Taylor
>Assignee: Lars Hofhansl
>  Labels: Phoenix
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: 12334-0.98.txt
>
>
> If an unexpected exception occurs while deserializing a custom filter, the 
> exception gets wrapped in a DeserializationException. Since this exception is 
> in turn wrapped in an IOException, the retry loop logic kicks in. The net 
> effect is that the same deserialization error occurs again and again as the 
> retries happen, just causing the client to wait needlessly.
> IMO, either the parseFrom methods should be allowed to throw whatever type of 
> IOException they'd like, in which case they could throw a 
> DoNotRetryIOException, or a DeserializationException should be wrapped in a 
> DoNotRetryIOException.
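
As a rough illustration of the second option above (wrapping the 
DeserializationException so the client gives up immediately); this is a sketch, not 
the committed fix, and the parseFilter/deserialize names are made up:

{code}
import java.io.IOException;

import org.apache.hadoop.hbase.DoNotRetryIOException;
import org.apache.hadoop.hbase.exceptions.DeserializationException;
import org.apache.hadoop.hbase.filter.Filter;

public class FilterDeserializer {
  public static Filter parseFilter(byte[] bytes) throws IOException {
    try {
      return deserialize(bytes);
    } catch (DeserializationException e) {
      // DoNotRetryIOException tells the client retry logic to give up at once
      // instead of retrying a deterministic deserialization failure.
      throw new DoNotRetryIOException("Failed to deserialize custom filter", e);
    }
  }

  // Stand-in for the real protobuf-based parsing; always fails in this sketch.
  private static Filter deserialize(byte[] bytes) throws DeserializationException {
    throw new DeserializationException("corrupt filter bytes");
  }
}
{code}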



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183875#comment-14183875
 ] 

Andrew Purtell commented on HBASE-12142:


No it did not apply. TestAccessController needed a small fix up. 

> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_branch_1.patch, 
> HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Vandana Ayyalasomayajula (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183872#comment-14183872
 ] 

Vandana Ayyalasomayajula commented on HBASE-12142:
--

[~apurtell] The patch for branch-1 is already attached to the jira. I think you 
should be able to apply it directly.

> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_branch_1.patch, 
> HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183865#comment-14183865
 ] 

Nick Dimiduk commented on HBASE-12335:
--

bq. Do replicas have a better 99th percentile in your testing? 

Usually, but not always. Will need to dig deeper into logs on a failed run. You 
can see some recent results in 
https://docs.google.com/spreadsheets/d/1oapgHAtliDlH73rNH-BTme4ETxkpADFRabtvP9dZ2o0/edit?usp=sharing
 .

[~enis] also suggested skipping the server holding meta in 
RestartRsHoldingTableAction. I'll give that a go in another round as well.

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-12335.00-0.99.patch, HBASE-12335.00.patch, 
> HBASE-12335.00.patch, HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.
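
A minimal sketch of the changed assertion (made-up numbers, plain Java rather than the 
actual IntegrationTestRegionReplicaPerf code):

{code}
// Compare the mean of per-run latency standard deviations with replicas enabled
// against the baseline runs, instead of comparing raw completion times.
public class VarianceAssertionSketch {
  static double mean(double[] values) {
    double sum = 0;
    for (double v : values) {
      sum += v;
    }
    return sum / values.length;
  }

  public static void main(String[] args) {
    double[] stdevWithReplicas = { 12.0, 15.5, 11.2 };  // per-run stdevs (made up)
    double[] stdevBaseline     = { 25.1, 19.7, 30.4 };

    if (!(mean(stdevWithReplicas) < mean(stdevBaseline))) {
      throw new AssertionError("read replicas did not reduce latency variance");
    }
  }
}
{code}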



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12338) Client side scanning prefetching.

2014-10-24 Thread Yi Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183853#comment-14183853
 ] 

Yi Deng commented on HBASE-12338:
-

[~eclark] That just needs a change in `TableInputFormat.java`

> Client side scanning prefetching.
> -
>
> Key: HBASE-12338
> URL: https://issues.apache.org/jira/browse/HBASE-12338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 1.0.0, 2.0.0, 0.98.6.1
>Reporter: Yi Deng
>Assignee: Yi Deng
>  Labels: prefetch, results, scanner
> Attachments: 
> 0001-Add-ScanPrefetcher-for-client-side-scanning-prefetch.patch, 
> 0001-ScanPrefetcher.patch
>
>
> Since server-side prefetching did not prove to be a good way to prefetch, we 
> need to do it on the client side.
> This is a wrapper class that takes any instance of `ResultScanner` as the 
> underlying scanning component. The class schedules the scanning in a 
> background thread. There is a buffering queue storing prefetched results, 
> whose length is configurable. The prefetcher releases the thread when the 
> queue is full and waits for results to be consumed.
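
A minimal sketch of such a wrapper, assuming a bounded queue and a single background 
thread; the class and method names are illustrative, not those of the attached patch:

{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;

// Prefetches results from any ResultScanner on a background thread. The bounded
// queue throttles the producer: put() blocks while the queue is full and resumes
// once the consumer drains results.
public class PrefetchingScanner {
  private final BlockingQueue<Result> queue;
  private volatile boolean done;

  public PrefetchingScanner(final ResultScanner scanner, int queueLength) {
    this.queue = new ArrayBlockingQueue<Result>(queueLength);
    Thread fetcher = new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          Result r;
          while ((r = scanner.next()) != null) {
            queue.put(r);  // blocks while the queue is full
          }
        } catch (Exception e) {
          // A real implementation would hand this exception to the consumer.
        } finally {
          done = true;
        }
      }
    });
    fetcher.setDaemon(true);
    fetcher.start();
  }

  /** Returns the next prefetched Result, or null once the underlying scan is exhausted. */
  public Result next() throws InterruptedException {
    while (true) {
      Result r = queue.poll(10, TimeUnit.MILLISECONDS);
      if (r != null) {
        return r;
      }
      if (done && queue.isEmpty()) {
        return null;
      }
    }
  }
}
{code}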



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12334) Handling of DeserializationException causes needless retry on failure

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183842#comment-14183842
 ] 

Hudson commented on HBASE-12334:


SUCCESS: Integrated in HBase-TRUNK #5698 (See 
[https://builds.apache.org/job/HBase-TRUNK/5698/])
HBASE-12334 Handling of DeserializationException causes needless retry on 
failure. (larsh: rev 6c7543c9c7d88b6d77e8d0d52ddad260fb487ae4)
* hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java


> Handling of DeserializationException causes needless retry on failure
> -
>
> Key: HBASE-12334
> URL: https://issues.apache.org/jira/browse/HBASE-12334
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.7
>Reporter: James Taylor
>Assignee: Lars Hofhansl
>  Labels: Phoenix
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: 12334-0.98.txt
>
>
> If an unexpected exception occurs while deserializing a custom filter, the 
> exception gets wrapped in a DeserializationException. Since this exception is 
> in turn wrapped in an IOException, the retry loop logic kicks in. The net 
> effect is that the same deserialization error occurs again and again as the 
> retries happen, just causing the client to wait needlessly.
> IMO, either the parseFrom methods should be allowed to throw whatever type of 
> IOException they'd like, in which case they could throw a 
> DoNotRetryIOException, or a DeserializationException should be wrapped in a 
> DoNotRetryIOException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11912) Catch some bad practices at compile time with error-prone

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183840#comment-14183840
 ] 

Andrew Purtell commented on HBASE-11912:


Pushed to master. Can revert if something breaks

> Catch some bad practices at compile time with error-prone
> -
>
> Key: HBASE-11912
> URL: https://issues.apache.org/jira/browse/HBASE-11912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
> Attachments: HBASE-11912.patch, HBASE-11912.patch, HBASE-11912.patch
>
>
> Google's error-prone (https://code.google.com/p/error-prone/) wraps javac 
> with some additional static analysis that will generate additional warnings 
> or errors at compile time if certain bug patterns 
> (https://code.google.com/p/error-prone/wiki/BugPatterns) are detected. What's 
> nice about this approach, as opposed to findbugs, is the compile time 
> detection and erroring out prevent the detected problems from getting into 
> the codebase up front.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12334) Handling of DeserializationException causes needless retry on failure

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183836#comment-14183836
 ] 

Hudson commented on HBASE-12334:


SUCCESS: Integrated in HBase-1.0 #357 (See 
[https://builds.apache.org/job/HBase-1.0/357/])
HBASE-12334 Handling of DeserializationException causes needless retry on 
failure. (larsh: rev 7ec441ca4fbcb89ef2fbe955ed19eca60cc7c860)
* hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java


> Handling of DeserializationException causes needless retry on failure
> -
>
> Key: HBASE-12334
> URL: https://issues.apache.org/jira/browse/HBASE-12334
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.7
>Reporter: James Taylor
>Assignee: Lars Hofhansl
>  Labels: Phoenix
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: 12334-0.98.txt
>
>
> If an unexpected exception occurs while deserializing a custom filter, the 
> exception gets wrapped in a DeserializationException. Since this exception is 
> in turn wrapped in an IOException, the retry loop logic kicks in. The net 
> effect is that the same deserialization error occurs again and again as the 
> retries happen, just causing the client to wait needlessly.
> IMO, either the parseFrom methods should be allowed to throw whatever type of 
> IOException they'd like, in which case they could throw a 
> DoNotRetryIOException, or a DeserializationException should be wrapped in a 
> DoNotRetryIOException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183834#comment-14183834
 ] 

Hadoop QA commented on HBASE-12142:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12677072/HBASE-12142_branch_1.patch
  against trunk revision .
  ATTACHMENT ID: 12677072

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11465//console

This message is automatically generated.

> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_branch_1.patch, 
> HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12142:
---
Attachment: HBASE-12142_branch_1.patch

Pushed to 0.98.

There is an impact to the running time of TestAdmin so I filed HBASE-12344 for 
followup. 

We're still missing a commit to branch-1 before this issue can be resolved. 
Refreshed the branch-1 patch. The test passes locally for me.  Ping [~enis], 
for real this time. 

> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_branch_1.patch, 
> HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-2609) Harmonize the Get and Delete operations

2014-10-24 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183830#comment-14183830
 ] 

Enis Soztutar commented on HBASE-2609:
--

Wow, an old one. Patch looks good for branch-1. Do we need an interface to hold 
the addColumn() methods? We have Mutate and Query for write and read ops 
respectively, but this covers both Get/Scan and Delete. Can do in another issue. 

> Harmonize the Get and Delete operations
> ---
>
> Key: HBASE-2609
> URL: https://issues.apache.org/jira/browse/HBASE-2609
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Jeff Hammerbacher
>Assignee: stack
> Fix For: 0.99.2
>
> Attachments: 2609.txt, 2609v2.txt
>
>
> In my work on HBASE-2400, implementing deletes for the Avro server felt quite 
> awkward. Rather than the clean API of the Get object, which allows 
> restrictions on the result set from a row to be expressed with addColumn, 
> addFamily, setTimeStamp, setTimeRange, setMaxVersions, and setFilters, the 
> Delete object hides these semantics behind various constructors to 
> deleteColumn[s] and deleteFamily. From my naive vantage point, I see no reason 
> why it would be a bad idea to mimic the Get API exactly, though I could quite 
> possibly be missing something. Thoughts?
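
To make the asymmetry concrete, a small example of the two APIs as they looked around 
the 0.98/0.99 timeframe (in later releases Delete gained addColumn[s]/addFamily and the 
deleteXxx methods were deprecated):

{code}
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.util.Bytes;

public class GetDeleteContrast {
  public static void main(String[] args) throws Exception {
    byte[] row = Bytes.toBytes("row1");
    byte[] family = Bytes.toBytes("cf");
    byte[] qualifier = Bytes.toBytes("q");

    // Get: restrictions on the result set are expressed with dedicated methods.
    Get get = new Get(row);
    get.addColumn(family, qualifier);
    get.setTimeRange(0L, Long.MAX_VALUE);
    get.setMaxVersions(3);

    // Delete: the equivalent semantics sit behind deleteColumn[s]/deleteFamily
    // (and timestamp-taking constructors) rather than mirroring the Get API.
    Delete delete = new Delete(row);
    delete.deleteColumns(family, qualifier);
    delete.deleteFamily(family);
  }
}
{code}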



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12334) Handling of DeserializationException causes needless retry on failure

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183829#comment-14183829
 ] 

Hudson commented on HBASE-12334:


FAILURE: Integrated in HBase-0.98 #630 (See 
[https://builds.apache.org/job/HBase-0.98/630/])
HBASE-12334 Handling of DeserializationException causes needless retry on 
failure. (larsh: rev 714e8368c33aab14dcc461180cff14fe65d9cdd3)
* hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java


> Handling of DeserializationException causes needless retry on failure
> -
>
> Key: HBASE-12334
> URL: https://issues.apache.org/jira/browse/HBASE-12334
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.7
>Reporter: James Taylor
>Assignee: Lars Hofhansl
>  Labels: Phoenix
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: 12334-0.98.txt
>
>
> If an unexpected exception occurs while deserializing a custom filter, the 
> exception gets wrapped in a DeserializationException. Since this exception is 
> in turn wrapped in an IOException, the retry loop logic kicks in. The net 
> effect is that the same deserialization error occurs again and again as the 
> retries happen, just causing the client to wait needlessly.
> IMO, either the parseFrom methods should be allowed to throw whatever type of 
> IOException they'd like, in which case they could throw a 
> DoNotRetryIOException, or a DeserializationException should be wrapped in a 
> DoNotRetryIOException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12338) Client side scanning prefetching.

2014-10-24 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183825#comment-14183825
 ] 

Elliott Clark commented on HBASE-12338:
---

If the pre-fetching is useful, should we have a way to turn it on for MR?

> Client side scanning prefetching.
> -
>
> Key: HBASE-12338
> URL: https://issues.apache.org/jira/browse/HBASE-12338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 1.0.0, 2.0.0, 0.98.6.1
>Reporter: Yi Deng
>Assignee: Yi Deng
>  Labels: prefetch, results, scanner
> Attachments: 
> 0001-Add-ScanPrefetcher-for-client-side-scanning-prefetch.patch, 
> 0001-ScanPrefetcher.patch
>
>
> Since server-side prefetching did not prove to be a good way to prefetch, we 
> need to do it on the client side.
> This is a wrapper class that takes any instance of `ResultScanner` as the 
> underlying scanning component. The class schedules the scanning in a 
> background thread. There is a buffering queue storing prefetched results, 
> whose length is configurable. The prefetcher releases the thread when the 
> queue is full and waits for results to be consumed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12344) Split up TestAdmin

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183796#comment-14183796
 ] 

Andrew Purtell commented on HBASE-12344:


Running time above measured on 0.98 branch

> Split up TestAdmin
> --
>
> Key: HBASE-12344
> URL: https://issues.apache.org/jira/browse/HBASE-12344
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
>
> Running time for TestAdmin on a dev box is about 400 seconds before 
> HBASE-12142, 500 seconds after.  Split it up. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12344) Split up TestAdmin

2014-10-24 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-12344:
--

 Summary: Split up TestAdmin
 Key: HBASE-12344
 URL: https://issues.apache.org/jira/browse/HBASE-12344
 Project: HBase
  Issue Type: Task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0, 0.98.8, 0.99.2


Running time for TestAdmin on a dev box is about 400 seconds before 
HBASE-12142, 500 seconds after.  Split it up. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12075) Preemptive Fast Fail

2014-10-24 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183788#comment-14183788
 ] 

Elliott Clark commented on HBASE-12075:
---

So this looks good to me. [~stack] [~tedyu] You up for another review? This is 
a pretty useful feature for making sure that bad machines aren't a cancer that 
grows through your whole application.

> Preemptive Fast Fail
> 
>
> Key: HBASE-12075
> URL: https://issues.apache.org/jira/browse/HBASE-12075
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 0.99.0, 2.0.0, 0.98.6.1
>Reporter: Manukranth Kolloju
>Assignee: Manukranth Kolloju
> Attachments: 0001-Add-a-test-case-for-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-HBASE-12075-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch, 
> 0001-Implement-Preemptive-Fast-Fail.patch
>
>
> In multi-threaded clients, we use a feature developed on the 0.89-fb branch 
> called Preemptive Fast Fail. It allows client threads that would potentially 
> fail to fail fast. The idea behind this feature is that, among the hundreds 
> of client threads, we allow one thread to try to establish a connection with 
> the regionserver; if that succeeds, we mark the server as a live node again. 
> Meanwhile, the other threads trying to establish a connection to the same 
> server would otherwise just run into timeouts, which is effectively 
> unfruitful. In those cases we can return appropriate exceptions to those 
> clients instead of letting them retry.
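
A very rough sketch of the idea (the attached patches are far more involved; the names 
here are made-up illustrations):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;

// For a server that recently failed, exactly one thread is allowed to probe it;
// every other thread fails fast instead of piling onto connection timeouts.
public class FastFailRegistry {
  static class FailureInfo {
    final AtomicBoolean retryInProgress = new AtomicBoolean(false);
  }

  private final ConcurrentMap<String, FailureInfo> failedServers =
      new ConcurrentHashMap<String, FailureInfo>();

  /** Record that an operation against this server failed. */
  public void markFailed(String serverName) {
    failedServers.putIfAbsent(serverName, new FailureInfo());
  }

  /** Record that the server answered again; clear the fast-fail state. */
  public void markAlive(String serverName) {
    failedServers.remove(serverName);
  }

  /**
   * Returns true if this thread should attempt the connection: either the server
   * is not in fast-fail mode, or this thread won the race to be the single prober.
   * All other callers should throw a fast-fail exception to the client instead of
   * retrying.
   */
  public boolean shouldAttempt(String serverName) {
    FailureInfo info = failedServers.get(serverName);
    if (info == null) {
      return true;  // server is not known to be bad
    }
    return info.retryInProgress.compareAndSet(false, true);  // single prober wins
  }
}
{code}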



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12311) Version stats in HFiles?

2014-10-24 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-12311:
--
Attachment: 12311.txt

W.I.P. patch. Just parking it. Does not work.
Need to find a good way of passing this from Store to SQM. Could use ScanInfo, 
but that's used in some coprocs - then again it's marked with 
InterfaceAudience.Private.

> Version stats in HFiles?
> 
>
> Key: HBASE-12311
> URL: https://issues.apache.org/jira/browse/HBASE-12311
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
> Attachments: 12311.txt, CellStatTracker.java
>
>
> In HBASE-9778 I basically punted to the user the decision on whether to do 
> repeated scanner.next() calls instead of issuing (re)seeks.
> I think we can do better.
> One way do that is maintain simple stats of what the maximum number of 
> versions we've seen for any row/col combination and store these in the 
> HFile's metadata (just like the timerange, oldest Put, etc).
> Then we estimate fairly accurately whether we have to expect lots of versions 
> (i.e. seek between columns is better) or not (in which case we'd issue 
> repeated next()'s).
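
A hedged sketch of the kind of tracker described above (maximum versions seen for any 
row/column while cells are written in sorted order); the names are assumptions, not the 
attached CellStatTracker:

{code}
import java.util.Arrays;

// While cells are appended in sorted order, count how many consecutive cells share
// the same row/family/qualifier key and remember the maximum. That maximum would be
// stored in the HFile metadata, like the time range, and read back to decide whether
// repeated next() calls or seeks between columns are the better strategy.
public class MaxVersionsTracker {
  private byte[] lastKey;        // row+family+qualifier of the previous cell
  private int currentVersions;
  private int maxVersions;

  /** Feed the row/column key of each cell as it is appended to the HFile. */
  public void track(byte[] rowColKey) {
    if (lastKey != null && Arrays.equals(lastKey, rowColKey)) {
      currentVersions++;
    } else {
      currentVersions = 1;
      lastKey = rowColKey.clone();
    }
    maxVersions = Math.max(maxVersions, currentVersions);
  }

  /** Value to persist in the HFile metadata. */
  public int getMaxVersions() {
    return maxVersions;
  }
}
{code}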



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183753#comment-14183753
 ] 

stack commented on HBASE-12335:
---

Do replicas have a better 99th percentile in your testing?

Patch lgtm.

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-12335.00-0.99.patch, HBASE-12335.00.patch, 
> HBASE-12335.00.patch, HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12338) Client side scanning prefetching.

2014-10-24 Thread Yi Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183729#comment-14183729
 ] 

Yi Deng commented on HBASE-12338:
-

[~stack] I'll make a benchmark testing and paste the result.

> Client side scanning prefetching.
> -
>
> Key: HBASE-12338
> URL: https://issues.apache.org/jira/browse/HBASE-12338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 1.0.0, 2.0.0, 0.98.6.1
>Reporter: Yi Deng
>Assignee: Yi Deng
>  Labels: prefetch, results, scanner
> Attachments: 
> 0001-Add-ScanPrefetcher-for-client-side-scanning-prefetch.patch, 
> 0001-ScanPrefetcher.patch
>
>
> Since server-side prefetching did not prove to be a good way to prefetch, we 
> need to do it on the client side.
> This is a wrapper class that takes any instance of `ResultScanner` as the 
> underlying scanning component. The class schedules the scanning in a 
> background thread. There is a buffering queue storing prefetched results, 
> whose length is configurable. The prefetcher releases the thread when the 
> queue is full and waits for results to be consumed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11912) Catch some bad practices at compile time with error-prone

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183720#comment-14183720
 ] 

Andrew Purtell commented on HBASE-11912:


Sure, let me rebase and do that later today.

> Catch some bad practices at compile time with error-prone
> -
>
> Key: HBASE-11912
> URL: https://issues.apache.org/jira/browse/HBASE-11912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
> Attachments: HBASE-11912.patch, HBASE-11912.patch, HBASE-11912.patch
>
>
> Google's error-prone (https://code.google.com/p/error-prone/) wraps javac 
> with some additional static analysis that will generate additional warnings 
> or errors at compile time if certain bug patterns 
> (https://code.google.com/p/error-prone/wiki/BugPatterns) are detected. What's 
> nice about this approach, as opposed to findbugs, is the compile time 
> detection and erroring out prevent the detected problems from getting into 
> the codebase up front.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-12334) Handling of DeserializationException causes needless retry on failure

2014-10-24 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-12334.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

Pushed to 0.98+

> Handling of DeserializationException causes needless retry on failure
> -
>
> Key: HBASE-12334
> URL: https://issues.apache.org/jira/browse/HBASE-12334
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.7
>Reporter: James Taylor
>Assignee: Lars Hofhansl
>  Labels: Phoenix
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: 12334-0.98.txt
>
>
> If an unexpected exception occurs while deserializing a custom filter, the 
> exception gets wrapped in a DeserializationException. Since this exception is 
> in turn wrapped in an IOException, the retry loop logic kicks in. The net 
> effect is that the same deserialization error occurs again and again as the 
> retries happen, just causing the client to wait needlessly.
> IMO, either the parseFrom methods should be allowed to throw whatever type of 
> IOException they'd like, in which case they could throw a 
> DoNotRetryIOException, or a DeserializationException should be wrapped in a 
> DoNotRetryIOException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11964) Improve spreading replication load from failed regionservers

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183713#comment-14183713
 ] 

Hudson commented on HBASE-11964:


SUCCESS: Integrated in HBase-TRUNK #5697 (See 
[https://builds.apache.org/job/HBase-TRUNK/5697/])
HBASE-11964 Improve spreading replication load from failed regionservers 
(apurtell: rev 97acb9ef24094042548a2981bdd99767156caeb3)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationBase.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> Improve spreading replication load from failed regionservers
> 
>
> Key: HBASE-11964
> URL: https://issues.apache.org/jira/browse/HBASE-11964
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-11964.patch, HBASE-11964.patch, HBASE-11964.patch
>
>
> Improve replication source thread handling. Improve fanout when transferring 
> queues. Ensure replication sources terminate properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183696#comment-14183696
 ] 

Hadoop QA commented on HBASE-12343:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677018/HBASE-12343.patch
  against trunk revision .
  ATTACHMENT ID: 12677018

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.util.TestBytes.testToStringBytesBinaryReversible(TestBytes.java:296)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11463//console

This message is automatically generated.

> Document recommended configuration for 0.98 from HBASE-11964
> 
>
> Key: HBASE-12343
> URL: https://issues.apache.org/jira/browse/HBASE-12343
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0
>
> Attachments: HBASE-12343.patch
>
>
> We're not committing the configuration changes from HBASE-11964 to 0.98 but 
> they should be the recommended configuration for replication. Add a paragraph 
> to the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183697#comment-14183697
 ] 

Hadoop QA commented on HBASE-12335:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677034/HBASE-12335.00.patch
  against trunk revision .
  ATTACHMENT ID: 12677034

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
3789 checkstyle errors (more than the trunk's current 3788 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11464//console

This message is automatically generated.

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-12335.00-0.99.patch, HBASE-12335.00.patch, 
> HBASE-12335.00.patch, HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11983) HRegion constructors should not create HLog

2014-10-24 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-11983:
-
Labels: beginner  (was: )

> HRegion constructors should not create HLog 
> 
>
> Key: HBASE-11983
> URL: https://issues.apache.org/jira/browse/HBASE-11983
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Enis Soztutar
>Assignee: Sean Busbey
>  Labels: beginner
>
> We should get rid of HRegion creating its own HLog. It should ALWAYS get the 
> log from outside. 
> I think this was added for unit tests, but we should refrain from such 
> practice in the future (adding UT constructors always leads to weird and 
> critical bugs down the road). See recent: HBASE-11982, HBASE-11654. 
> Get rid of weird things like ignoreHLog:
> {code}
>   /**
>* @param ignoreHLog - true to skip generate new hlog if it is null, mostly 
> for createTable
>*/
>   public static HRegion createHRegion(final HRegionInfo info, final Path 
> rootDir,
>   final Configuration conf,
>   final HTableDescriptor hTableDescriptor,
>   final HLog hlog,
>   final boolean initialize, final boolean 
> ignoreHLog)
> {code}
> We can unify all the createXX and newXX methods and separate creating a 
> region in the file system vs opening a region. 
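To make the proposal concrete, here is a minimal sketch of the shape being asked for, with filesystem creation separated from opening and the WAL always injected by the caller. All type and method names below are illustrative stand-ins, not the actual HBase classes.

{code}
// Hypothetical sketch (not the real HBase API): a region never constructs its own
// WAL, and creating the region layout is separate from opening the region.
public final class RegionFactorySketch {

  interface WAL { void append(String regionName, String edit); }

  static final class RegionInfo {
    final String name;
    RegionInfo(String name) { this.name = name; }
  }

  static final class Region {
    private final RegionInfo info;
    private final WAL wal; // always supplied by the caller, never created internally
    Region(RegionInfo info, WAL wal) {
      if (wal == null) {
        throw new IllegalArgumentException("a WAL must be provided by the caller");
      }
      this.info = info;
      this.wal = wal;
    }
    void put(String edit) { wal.append(info.name, edit); }
  }

  /** Step 1: lay out the region in the filesystem; no WAL involved. */
  static RegionInfo createRegionOnFilesystem(String tableName, String startKey) {
    // In a real implementation this would create the region directory and descriptor.
    return new RegionInfo(tableName + "," + startKey);
  }

  /** Step 2: open the region, always with an externally managed WAL. */
  static Region openRegion(RegionInfo info, WAL wal) {
    return new Region(info, wal);
  }

  public static void main(String[] args) {
    WAL sharedWal = (regionName, edit) -> System.out.println(regionName + " -> " + edit);
    Region region = openRegion(createRegionOnFilesystem("t1", ""), sharedWal);
    region.put("row1");
  }
}
{code}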



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11983) HRegion constructors should not create HLog

2014-10-24 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183680#comment-14183680
 ] 

Nick Dimiduk commented on HBASE-11983:
--

I noticed this is an issue causing our tests to leak file descriptors. I had 
some time for cleanup, but that time has passed. No specific hurry, just 
hygiene at this point.

> HRegion constructors should not create HLog 
> 
>
> Key: HBASE-11983
> URL: https://issues.apache.org/jira/browse/HBASE-11983
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Enis Soztutar
>Assignee: Sean Busbey
>  Labels: beginner
>
> We should get rid of HRegion creating its own HLog. It should ALWAYS get the 
> log from outside. 
> I think this was added for unit tests, but we should refrain from such 
> practice in the future (adding UT constructors always leads to weird and 
> critical bugs down the road). See recent: HBASE-11982, HBASE-11654. 
> Get rid of weird things like ignoreHLog:
> {code}
>   /**
>* @param ignoreHLog - true to skip generate new hlog if it is null, mostly 
> for createTable
>*/
>   public static HRegion createHRegion(final HRegionInfo info, final Path 
> rootDir,
>   final Configuration conf,
>   final HTableDescriptor hTableDescriptor,
>   final HLog hlog,
>   final boolean initialize, final boolean 
> ignoreHLog)
> {code}
> We can unify all the createXX and newXX methods and separate creating a 
> region in the file system vs opening a region. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11964) Improve spreading replication load from failed regionservers

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183673#comment-14183673
 ] 

Hudson commented on HBASE-11964:


SUCCESS: Integrated in HBase-1.0 #355 (See 
[https://builds.apache.org/job/HBase-1.0/355/])
HBASE-11964 Improve spreading replication load from failed regionservers 
(apurtell: rev 54fdd965168f671e981703e9069de485ec8b148e)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationBase.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java


> Improve spreading replication load from failed regionservers
> 
>
> Key: HBASE-11964
> URL: https://issues.apache.org/jira/browse/HBASE-11964
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-11964.patch, HBASE-11964.patch, HBASE-11964.patch
>
>
> Improve replication source thread handling. Improve fanout when transferring 
> queues. Ensure replication sources terminate properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12335:
-
Attachment: HBASE-12335.00.patch

Reattaching for BuildBot.

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-12335.00-0.99.patch, HBASE-12335.00.patch, 
> HBASE-12335.00.patch, HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11912) Catch some bad practices at compile time with error-prone

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183573#comment-14183573
 ] 

stack commented on HBASE-11912:
---

Want to try committing to master [~apurtell]?  (Patch looks good)

> Catch some bad practices at compile time with error-prone
> -
>
> Key: HBASE-11912
> URL: https://issues.apache.org/jira/browse/HBASE-11912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
> Attachments: HBASE-11912.patch, HBASE-11912.patch, HBASE-11912.patch
>
>
> Google's error-prone (https://code.google.com/p/error-prone/) wraps javac 
> with some additional static analysis that will generate additional warnings 
> or errors at compile time if certain bug patterns 
> (https://code.google.com/p/error-prone/wiki/BugPatterns) are detected. What's 
> nice about this approach, as opposed to findbugs, is the compile time 
> detection and erroring out prevent the detected problems from getting into 
> the codebase up front.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12338) Client side scanning prefetching.

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183569#comment-14183569
 ] 

stack commented on HBASE-12338:
---

Yeah, I suppose I'm not clear on how/why I'd use it.  Does it speed up stuff 
[~daviddengfb]?

> Client side scanning prefetching.
> -
>
> Key: HBASE-12338
> URL: https://issues.apache.org/jira/browse/HBASE-12338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 1.0.0, 2.0.0, 0.98.6.1
>Reporter: Yi Deng
>Assignee: Yi Deng
>  Labels: prefetch, results, scanner
> Attachments: 
> 0001-Add-ScanPrefetcher-for-client-side-scanning-prefetch.patch, 
> 0001-ScanPrefetcher.patch
>
>
> Since server side prefetching was not proved to be a good way to prefetch, we 
> need to do it on the client side.
> This is a wrapper class that takes any instance of `ResultScanner` as the 
> underlying scanning component. The class will schedule the scanning in a 
> background thread. There is a buffering queue storing prefetched results, 
> whose length is configurable. The prefetcher will release the thread if the 
> queue is full and wait for results to be consumed.
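As a rough illustration of the wrapper described above (a background thread filling a bounded queue that the caller drains), here is a minimal, self-contained sketch. The SimpleScanner interface is a stand-in for ResultScanner, and none of this is the attached patch.

{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal sketch of client-side scan prefetching; names are illustrative.
public final class ScanPrefetcherSketch {

  /** Stand-in for the real ResultScanner; returns null when exhausted. */
  interface SimpleScanner<T> extends AutoCloseable {
    T next() throws Exception;
    @Override default void close() {}
  }

  /** Wraps any scanner and prefetches results on a background thread. */
  static final class PrefetchingScanner<T> implements SimpleScanner<T> {
    private static final Object END = new Object();
    private final BlockingQueue<Object> buffer; // bounded: blocks the producer when full
    private final Thread producer;

    PrefetchingScanner(SimpleScanner<T> delegate, int capacity) {
      this.buffer = new ArrayBlockingQueue<>(capacity);
      this.producer = new Thread(() -> {
        try {
          T item;
          while ((item = delegate.next()) != null) {
            buffer.put(item); // waits here if the consumer is slow
          }
        } catch (Exception ignored) {
          // a real implementation would propagate this to the consumer
        } finally {
          try { buffer.put(END); } catch (InterruptedException ignored) { }
        }
      }, "scan-prefetcher");
      this.producer.setDaemon(true);
      this.producer.start();
    }

    @SuppressWarnings("unchecked")
    @Override
    public T next() throws InterruptedException {
      Object item = buffer.take();
      return item == END ? null : (T) item;
    }
  }

  public static void main(String[] args) throws Exception {
    // Demo: wrap a scanner over an in-memory list of rows.
    java.util.Iterator<String> rows = java.util.Arrays.asList("r1", "r2", "r3").iterator();
    SimpleScanner<String> raw = () -> rows.hasNext() ? rows.next() : null;
    PrefetchingScanner<String> scanner = new PrefetchingScanner<>(raw, 2);
    for (String r; (r = scanner.next()) != null; ) System.out.println(r);
  }
}
{code}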



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-24 Thread Dima Spivak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dima Spivak updated HBASE-12285:

Status: In Progress  (was: Patch Available)

Trying to track down the stream-breaking culprit tests...

> Builds are failing, possibly because of SUREFIRE-1091
> -
>
> Key: HBASE-12285
> URL: https://issues.apache.org/jira/browse/HBASE-12285
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>Priority: Blocker
> Attachments: HBASE-12285_branch-1_v1.patch
>
>
> Our branch-1 builds on builds.apache.org have been failing in recent days 
> after we switched over to an official version of Surefire a few days back 
> (HBASE-4955). The version we're using, 2.17, is hit by a bug 
> ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
> in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183564#comment-14183564
 ] 

stack commented on HBASE-12285:
---

Since changing to log WARN only, builds passed three times where previously they 
always failed.  See here 
https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/

> Builds are failing, possibly because of SUREFIRE-1091
> -
>
> Key: HBASE-12285
> URL: https://issues.apache.org/jira/browse/HBASE-12285
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>Priority: Blocker
> Attachments: HBASE-12285_branch-1_v1.patch
>
>
> Our branch-1 builds on builds.apache.org have been failing in recent days 
> after we switched over to an official version of Surefire a few days back 
> (HBASE-4955). The version we're using, 2.17, is hit by a bug 
> ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
> in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183557#comment-14183557
 ] 

Nick Dimiduk commented on HBASE-12335:
--

Checking the mean of the stdev is consistently passing on my test cluster, 5/5 
runs. I think this is good enough for now; there may be more improvements to 
make both to the test and the logic for firing queries to replicas.
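For readers following along, a hedged sketch of the reworked assertion (mean of the per-run latency standard deviations, which read replicas are expected to lower); this is illustrative only, not the actual IntegrationTestRegionReplicaPerf code, and the latency numbers are fabricated.

{code}
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of asserting on the mean of per-run latency standard
// deviations instead of asserting that replicas are simply faster.
public final class StdevAssertionSketch {

  static double stdev(List<Double> latenciesMs) {
    double mean = latenciesMs.stream().mapToDouble(Double::doubleValue).average().orElse(0);
    double variance = latenciesMs.stream()
        .mapToDouble(v -> (v - mean) * (v - mean)).average().orElse(0);
    return Math.sqrt(variance);
  }

  static double meanStdev(List<List<Double>> runs) {
    double sum = 0;
    for (List<Double> run : runs) {
      sum += stdev(run);
    }
    return runs.isEmpty() ? 0 : sum / runs.size();
  }

  public static void main(String[] args) {
    // Fabricated per-run latencies (milliseconds), purely for illustration.
    List<List<Double>> withoutReplicas = Arrays.asList(
        Arrays.asList(10.0, 60.0, 35.0), Arrays.asList(12.0, 55.0, 30.0));
    List<List<Double>> withReplicas = Arrays.asList(
        Arrays.asList(20.0, 25.0, 22.0), Arrays.asList(21.0, 24.0, 23.0));
    // The reworked assertion: replicas should reduce latency variance,
    // not necessarily the raw completion time.
    if (!(meanStdev(withReplicas) < meanStdev(withoutReplicas))) {
      throw new AssertionError("read replicas did not reduce latency variance");
    }
    System.out.println("variance reduced as expected");
  }
}
{code}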

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-12335.00-0.99.patch, HBASE-12335.00.patch, 
> HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12343:
---
Status: Patch Available  (was: Open)

> Document recommended configuration for 0.98 from HBASE-11964
> 
>
> Key: HBASE-12343
> URL: https://issues.apache.org/jira/browse/HBASE-12343
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0
>
> Attachments: HBASE-12343.patch
>
>
> We're not committing the configuration changes from HBASE-11964 to 0.98 but 
> they should be the recommended configuration for replication. Add a paragraph 
> to the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12343:
---
Attachment: HBASE-12343.patch

Ping [~misty]

> Document recommended configuration for 0.98 from HBASE-11964
> 
>
> Key: HBASE-12343
> URL: https://issues.apache.org/jira/browse/HBASE-12343
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0
>
> Attachments: HBASE-12343.patch
>
>
> We're not committing the configuration changes from HBASE-11964 to 0.98 but 
> they should be the recommended configuration for replication. Add a paragraph 
> to the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183542#comment-14183542
 ] 

Vladimir Rodionov commented on HBASE-12324:
---

{quote}
BTW, you cannot delete a file under the region using an external tool if the 
region is being served (table enabled, hbase cluster running).
{quote}

I am pretty sure that the system will remain in an operable state. Some 
HFileReaders will fail, of course.

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183532#comment-14183532
 ] 

Hadoop QA commented on HBASE-12335:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12677011/HBASE-12335.00-0.99.patch
  against trunk revision .
  ATTACHMENT ID: 12677011

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11461//console

This message is automatically generated.

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-12335.00-0.99.patch, HBASE-12335.00.patch, 
> HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183520#comment-14183520
 ] 

Enis Soztutar commented on HBASE-12324:
---

This compaction policy makes sense with HBASE-10141, I think. Given the use 
case, it effectively disables compactions but still lets TTL do the job. The 
problem with disabling compactions through the regular configuration is that 
only a compaction will get rid of hfiles, so disabling compactions means no 
files ever expire. With this compaction policy, we still trigger compactions, 
but the compaction selection will not select any files. 
bq. Run periodically utility which purge/archive the oldest HFiles
BTW, you cannot delete a file under the region using an external tool if the 
region is being served (table enabled, hbase cluster running).
bq. It's actually worse than that, because the clock could adjust and we could 
have a file timestamp that is older than the cell timestamps within it. That 
would result in deleting data that isn't yet expired. (presuming the timestamp 
will be set based on when the server calls close())
That is how TTLs work in HBase. The RS compares the max TS of the file / cell 
with the current timestamp. 
bq. You will never read this stale data back unless you have MIN_VERSIONS > 0 
for that CF.
I think HBASE-10141 and MIN_VERSIONS > 0 are incompatible. We may need to 
address / document that. 
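A minimal sketch of the selection behavior being discussed, assuming stand-in types (this is not the attached OnlyDeleteExpiredFilesCompactionPolicy.java): compactions still run, but selection only picks files whose newest data is past the TTL, so live files are never rewritten.

{code}
import java.util.ArrayList;
import java.util.List;

// Illustrative expired-only selection; types are stand-ins, not HBase internals.
public final class ExpiredOnlySelectionSketch {

  static final class StoreFileInfo {
    final String path;
    final long maxCellTimestampMs; // newest cell in the file
    StoreFileInfo(String path, long maxCellTimestampMs) {
      this.path = path;
      this.maxCellTimestampMs = maxCellTimestampMs;
    }
  }

  /** Select only files that are entirely expired; never merge live files. */
  static List<StoreFileInfo> selectExpired(List<StoreFileInfo> candidates, long ttlMs, long nowMs) {
    List<StoreFileInfo> expired = new ArrayList<>();
    for (StoreFileInfo f : candidates) {
      if (nowMs - f.maxCellTimestampMs > ttlMs) {
        expired.add(f);
      }
    }
    return expired; // an empty list means the "compaction" is a no-op for live data
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long ttl = 3L * 24 * 60 * 60 * 1000; // three days
    List<StoreFileInfo> files = new ArrayList<>();
    files.add(new StoreFileInfo("hfile-old", now - 5L * 24 * 60 * 60 * 1000));
    files.add(new StoreFileInfo("hfile-new", now - 60L * 1000));
    selectExpired(files, ttl, now).forEach(f -> System.out.println("drop " + f.path));
  }
}
{code}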


> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11964) Improve spreading replication load from failed regionservers

2014-10-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11964:
---
Fix Version/s: (was: 0.94.25)
   (was: 0.98.8)

> Improve spreading replication load from failed regionservers
> 
>
> Key: HBASE-11964
> URL: https://issues.apache.org/jira/browse/HBASE-11964
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-11964.patch, HBASE-11964.patch, HBASE-11964.patch
>
>
> Improve replication source thread handling. Improve fanout when transferring 
> queues. Ensure replication sources terminate properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12335:
-
Attachment: HBASE-12335.00-0.99.patch

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-12335.00-0.99.patch, HBASE-12335.00.patch, 
> HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11964) Improve spreading replication load from failed regionservers

2014-10-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11964:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to master and branch-1

> Improve spreading replication load from failed regionservers
> 
>
> Key: HBASE-11964
> URL: https://issues.apache.org/jira/browse/HBASE-11964
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-11964.patch, HBASE-11964.patch, HBASE-11964.patch
>
>
> Improve replication source thread handling. Improve fanout when transferring 
> queues. Ensure replication sources terminate properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-24 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-12343:
--

 Summary: Document recommended configuration for 0.98 from 
HBASE-11964
 Key: HBASE-12343
 URL: https://issues.apache.org/jira/browse/HBASE-12343
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell


We're not committing the configuration changes from HBASE-11964 to 0.98 but 
they should be the recommended configuration for replication. Add a paragraph to 
the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned HBASE-12343:
--

Assignee: Andrew Purtell

> Document recommended configuration for 0.98 from HBASE-11964
> 
>
> Key: HBASE-12343
> URL: https://issues.apache.org/jira/browse/HBASE-12343
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0
>
>
> We're not committing the configuration changes from HBASE-11964 to 0.98 but 
> they should be the recommended configuration for replication. Add a paragraph 
> to the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12335:
-
Fix Version/s: 0.99.2
   2.0.0
Affects Version/s: 2.0.0
   0.99.0
   Status: Patch Available  (was: Open)

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 0.99.2
>
> Attachments: HBASE-12335.00.patch, HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183518#comment-14183518
 ] 

Vladimir Rodionov commented on HBASE-12324:
---

[~ndimiduk]:
{quote}
What I'm hearing is that it could be interesting to be able to flag a table or 
column family as "immutable". This combined with a TTL setting can allow for 
significant compaction optimizations. Next steps would be to enable such a 
configuration and write an integration test that demonstrates the benefit of 
this configuration.
{quote}

Benefits:

* A constant sustained write rate that does not decrease over time. All DB 
systems that do not update data in place and rely on a compaction to merge 
deletes/updates suffer badly as the data set grows larger and larger; the 
sustained write rate decreases significantly over time (anti-logarithmically).
* A lot of optimizations can be done on the read path as well (no need to keep 
track of deleted cells, for example).

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12343) Document recommended configuration for 0.98 from HBASE-11964

2014-10-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12343:
---
Fix Version/s: 2.0.0

> Document recommended configuration for 0.98 from HBASE-11964
> 
>
> Key: HBASE-12343
> URL: https://issues.apache.org/jira/browse/HBASE-12343
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0
>
>
> We're not committing the configuration changes from HBASE-11964 to 0.98 but 
> they should be the recommended configuration for replication. Add a paragraph 
> to the replication section of the manual on this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12335:
-
Attachment: HBASE-12335.00.patch

Updating patch, I missed a git-add.

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Attachments: HBASE-12335.00.patch, HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183511#comment-14183511
 ] 

Andrew Purtell commented on HBASE-12142:


Going to commit to 0.98 shortly unless objection

> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183508#comment-14183508
 ] 

Andrew Purtell commented on HBASE-12142:


Hey, look at that, you already did :-)

> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12142) Truncate command does not preserve ACLs table

2014-10-24 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12142:
---
Release Note: 
Prior to this change, the truncate shell command could not preserve ACLs on the 
table being truncated. In the 0.98 branch, this change also backports 
HBASE-8332, which adds a master handler for table truncation and new HBaseAdmin 
APIs for same. 


Looks good. Let me put up another 0.98 patch in a sec that adds the tests from 
HBASE-8332 also. 

> Truncate command does not preserve ACLs table
> -
>
> Key: HBASE-12142
> URL: https://issues.apache.org/jira/browse/HBASE-12142
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.6
>Reporter: Vandana Ayyalasomayajula
>Assignee: Vandana Ayyalasomayajula
>Priority: Minor
>  Labels: security
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: HBASE-12142_0.patch, HBASE-12142_1.patch, 
> HBASE-12142_2.patch, HBASE-12142_98.patch, HBASE-12142_98_2.patch, 
> HBASE-12142_branch_1.patch, HBASE-12142_master_addendum.patch
>
>
> The current truncate command does not preserve acls on a table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-2609) Harmonize the Get and Delete operations

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183499#comment-14183499
 ] 

stack commented on HBASE-2609:
--

Thanks [~apurtell]. Not for 0.98.  Waiting on @enis's input for branch-1.
St.Ack

> Harmonize the Get and Delete operations
> ---
>
> Key: HBASE-2609
> URL: https://issues.apache.org/jira/browse/HBASE-2609
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Jeff Hammerbacher
>Assignee: stack
> Fix For: 0.99.2
>
> Attachments: 2609.txt, 2609v2.txt
>
>
> In my work on HBASE-2400, implementing deletes for the Avro server felt quite 
> awkward. Rather than the clean API of the Get object, which allows 
> restrictions on the result set from a row to be expressed with addColumn, 
> addFamily, setTimeStamp, setTimeRange, setMaxVersions, and setFilters, the 
> Delete object hides these semantics behind various constructors to 
> deleteColumn[s] and deleteFamily. From my naive vantage point, I see no reason 
> why it would be a bad idea to mimic the Get API exactly, though I could quite 
> possibly be missing something. Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183494#comment-14183494
 ] 

Nick Dimiduk commented on HBASE-12324:
--

What I'm hearing is that it could be interesting to be able to flag a table or 
column family as "immutable". This combined with a TTL setting can allow for 
significant compaction optimizations. Next steps would be to enable such a 
configuration and write an integration test that demonstrates the benefit of 
this configuration.
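As a concrete starting point, here is a short sketch of a short-TTL column family for immutable data using the client API of this era; the TTL call is the real API, while the "IMMUTABLE" table attribute below is purely hypothetical, standing in for the flag being discussed.

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

// Sketch: create a table whose data expires after three days. The setTimeToLive
// call is the real client API; the "IMMUTABLE" attribute is a hypothetical hint,
// not an existing HBase flag.
public class ImmutableTtlTableSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("metrics"));
    HColumnDescriptor cf = new HColumnDescriptor("d");
    cf.setTimeToLive(3 * 24 * 60 * 60);   // TTL in seconds: three days
    htd.addFamily(cf);
    htd.setValue("IMMUTABLE", "true");    // hypothetical marker, for illustration only
    try (HBaseAdmin admin = new HBaseAdmin(conf)) {
      admin.createTable(htd);
    }
  }
}
{code}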

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12334) Handling of DeserializationException causes needless retry on failure

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183487#comment-14183487
 ] 

Andrew Purtell commented on HBASE-12334:


lgtm, as long as it's committed to master and branch-1 too. 

> Handling of DeserializationException causes needless retry on failure
> -
>
> Key: HBASE-12334
> URL: https://issues.apache.org/jira/browse/HBASE-12334
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.7
>Reporter: James Taylor
>Assignee: Lars Hofhansl
>  Labels: Phoenix
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
> Attachments: 12334-0.98.txt
>
>
> If an unexpected exception occurs when deserialization occurs for a custom 
> filter, the exception gets wrapped in a DeserializationException. Since this 
> exception is in turn wrapped in an IOException, the many loop retry logic 
> kicks in. The net effect is that this same deserialization error occurs again 
> and again as the retries occur, just causing the client to wait needlessly.
> IMO, either the parseFrom methods should be allowed to throw whatever type of 
> IOException they'd like, in which case they could throw a 
> DoNotRetryIOException, or a DeserializationException should be wrapped in a 
> DoNotRetryIOException.
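To illustrate the proposed handling, a small sketch of the fail-fast translation: catch the deserialization failure where the filter is parsed and rethrow it as a non-retriable IOException so the client fails immediately instead of retrying. The exception classes below are local stand-ins for the HBase types named in the description.

{code}
import java.io.IOException;

// Stand-ins for the HBase exception types discussed above; illustration only.
class DeserializationFailure extends Exception {
  DeserializationFailure(String msg, Throwable cause) { super(msg, cause); }
}
class NonRetriableIOException extends IOException {
  NonRetriableIOException(String msg, Throwable cause) { super(msg, cause); }
}

public final class FilterParseSketch {

  /** Pretend custom-filter parser; a real one would decode protobuf bytes. */
  static Object parseFilter(byte[] bytes) throws DeserializationFailure {
    try {
      if (bytes == null || bytes.length == 0) {
        throw new IllegalArgumentException("empty filter bytes");
      }
      return new Object();
    } catch (RuntimeException e) {
      throw new DeserializationFailure("could not parse filter", e);
    }
  }

  /** Server-side call path: translate to a non-retriable failure so clients fail fast. */
  static Object parseFilterOrFailFast(byte[] bytes) throws IOException {
    try {
      return parseFilter(bytes);
    } catch (DeserializationFailure e) {
      // Without this translation the error is wrapped in a plain IOException and retried.
      throw new NonRetriableIOException("filter deserialization failed; not retrying", e);
    }
  }

  public static void main(String[] args) {
    try {
      parseFilterOrFailFast(new byte[0]);
    } catch (IOException e) {
      System.out.println("fails fast: " + e.getMessage());
    }
  }
}
{code}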



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-2609) Harmonize the Get and Delete operations

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183482#comment-14183482
 ] 

Andrew Purtell edited comment on HBASE-2609 at 10/24/14 9:03 PM:
-

Looks fine but if you put it in 0.98 the javadoc needs a small update to the 
@deprecated text. 


was (Author: apurtell):
Looks fine but if you put it in 0.98 the javadoc needs a small update to 
@since. 

> Harmonize the Get and Delete operations
> ---
>
> Key: HBASE-2609
> URL: https://issues.apache.org/jira/browse/HBASE-2609
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Jeff Hammerbacher
>Assignee: stack
> Fix For: 0.99.2
>
> Attachments: 2609.txt, 2609v2.txt
>
>
> In my work on HBASE-2400, implementing deletes for the Avro server felt quite 
> awkward. Rather than the clean API of the Get object, which allows 
> restrictions on the result set from a row to be expressed with addColumn, 
> addFamily, setTimeStamp, setTimeRange, setMaxVersions, and setFilters, the 
> Delete object hides these semantics behind various constructors to 
> deleteColumn[s] and deleteFamily. From my naive vantage point, I see no reason 
> why it would be a bad idea to mimic the Get API exactly, though I could quite 
> possibly be missing something. Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183485#comment-14183485
 ] 

Vladimir Rodionov commented on HBASE-12324:
---

{quote}
The only issue I see with TS is if old data come late
{quote}

There is no issue here. You will never read this stale data back unless you have 
MIN_VERSIONS > 0 for that CF.

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-2609) Harmonize the Get and Delete operations

2014-10-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183482#comment-14183482
 ] 

Andrew Purtell commented on HBASE-2609:
---

Looks fine but if you put it in 0.98 the javadoc needs a small update to 
@since. 

> Harmonize the Get and Delete operations
> ---
>
> Key: HBASE-2609
> URL: https://issues.apache.org/jira/browse/HBASE-2609
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Reporter: Jeff Hammerbacher
>Assignee: stack
> Fix For: 0.99.2
>
> Attachments: 2609.txt, 2609v2.txt
>
>
> In my work on HBASE-2400, implementing deletes for the Avro server felt quite 
> awkward. Rather than the clean API of the Get object, which allows 
> restrictions on the result set from a row to be expressed with addColumn, 
> addFamily, setTimeStamp, setTimeRange, setMaxVersions, and setFilters, the 
> Delete object hides these semantics behind various constructors to 
> deleteColumn[s] and deleteFamily. From my naive vantage point, I see no reason 
> why it would be a bad idea to mimic the Get API exactly, though I could quite 
> possibly be missing something. Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183477#comment-14183477
 ] 

Vladimir Rodionov commented on HBASE-12324:
---

{quote}
IMO Adding external utilities is error prone and operational overhead.
{quote}

Maybe (it will take a couple of hours to write and debug such utils); I just 
described how I would approach this use case without changing a single line of 
code in HBase.  But, in general, I agree that immutable data (tables, CFs) must 
be treated as a separate class of citizen in HBase.

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk reassigned HBASE-12335:


Assignee: Nick Dimiduk

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Attachments: HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Sheetal Dolas (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183370#comment-14183370
 ] 

Sheetal Dolas commented on HBASE-12324:
---

Makes sense to me. Some cases may still need some sort of compaction (though 
ours did not, since we had already tuned other parameters because we could not 
afford frequent minor compactions while supporting a million ingests per second).
So a more informed decision based on the number of files at hand (and possibly 
configuration for advanced users to delay or completely turn compactions off 
while enabling only deletes) can make this usable in multiple situations.

In our case we did not have the TS in the key; the TS was only in the cell version.

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12335) IntegrationTestRegionReplicaPerf is flaky

2014-10-24 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12335:
-
Attachment: HBASE-12335.00.patch

Here's a patch for master. It's ported over from my 0.98 + replicas branch. I 
don't have a master cluster to test with at the moment; I've been testing this 
on a 5-node cluster with that other build.

> IntegrationTestRegionReplicaPerf is flaky
> -
>
> Key: HBASE-12335
> URL: https://issues.apache.org/jira/browse/HBASE-12335
> Project: HBase
>  Issue Type: Test
>  Components: test
>Reporter: Nick Dimiduk
> Attachments: HBASE-12335.00.patch
>
>
> I find that this test often fails; the assertion that running with read 
> replicas should complete faster than without is usually false. I need to 
> investigate further as to why this is the case and how we should tune it.
> In the mean time, I'd like to change the test to assert instead on the 
> average of the stdev across all the test runs in each category. Meaning, 
> enabling this feature should reduce the overall latency variance experienced 
> by the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183355#comment-14183355
 ] 

Nick Dimiduk commented on HBASE-12324:
--

I believe OpenTSDB is commonly used as a metrics archival tool as well, so 
retaining data for months or years will quickly accumulate small HFiles under 
this scheme. I believe its data is otherwise consistent with your assumptions. 
You need to be very careful with your flush sizes to avoid a small-file 
problem. As Sean says, I'd prefer to see less operational overhead pushed to 
users, not more. It would be interesting to see an 
"ImmutableRealtimeTimeSeriesCompactionPolicy" that compacts small files when 
some threshold is exceeded but otherwise defers to simply expiring files as 
you do here.

Another question: in this schema, does the rowkey contain the data's timestamp? 
Or are you just using the HBase cell version for storing your temporal attribute? 
StripeCompactionPolicy explicitly addresses the former case (because stripe 
boundaries are identified by rowkey ranges).
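A rough sketch of what such a hybrid policy could look like, assuming illustrative stand-in types and thresholds (this is not HBase code): expire files past TTL as before, but also merge small live files once their count crosses a threshold.

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical hybrid of expire-only and small-file compaction; illustration only.
public final class HybridExpiryCompactionSketch {

  static final class FileInfo {
    final String name;
    final long sizeBytes;
    final long maxCellTimestampMs;
    FileInfo(String name, long sizeBytes, long maxCellTimestampMs) {
      this.name = name; this.sizeBytes = sizeBytes; this.maxCellTimestampMs = maxCellTimestampMs;
    }
  }

  static final class Selection {
    final List<FileInfo> toDelete;  // fully expired: just archive, no rewrite
    final List<FileInfo> toCompact; // small live files worth merging
    Selection(List<FileInfo> toDelete, List<FileInfo> toCompact) {
      this.toDelete = toDelete; this.toCompact = toCompact;
    }
  }

  static Selection select(List<FileInfo> files, long ttlMs, long nowMs,
                          int smallFileCountThreshold, long smallFileSizeBytes) {
    List<FileInfo> expired = new ArrayList<>();
    List<FileInfo> small = new ArrayList<>();
    for (FileInfo f : files) {
      if (nowMs - f.maxCellTimestampMs > ttlMs) {
        expired.add(f);                      // past TTL: drop without rewriting
      } else if (f.sizeBytes < smallFileSizeBytes) {
        small.add(f);                        // live but tiny: candidate for merging
      }
    }
    // Only pay the compaction cost once enough small files have piled up.
    List<FileInfo> toCompact =
        small.size() >= smallFileCountThreshold ? small : Collections.<FileInfo>emptyList();
    return new Selection(expired, toCompact);
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    List<FileInfo> files = Arrays.asList(
        new FileInfo("old", 64L << 20, now - 5L * 24 * 3600 * 1000),
        new FileInfo("tiny-live-1", 1L << 20, now),
        new FileInfo("tiny-live-2", 1L << 20, now));
    Selection s = select(files, 3L * 24 * 3600 * 1000, now, 2, 8L << 20);
    System.out.println("delete=" + s.toDelete.size() + " compact=" + s.toCompact.size());
  }
}
{code}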

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12006) [JDK 8] KeyStoreTestUtil#generateCertificate fails due to "subject class type invalid"

2014-10-24 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183352#comment-14183352
 ] 

Robert Kanter commented on HBASE-12006:
---

I've put up a patch to fix this at HADOOP-11230

> [JDK 8] KeyStoreTestUtil#generateCertificate fails due to "subject class type 
> invalid"
> --
>
> Key: HBASE-12006
> URL: https://issues.apache.org/jira/browse/HBASE-12006
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.99.0, 2.0.0
>Reporter: Andrew Purtell
>Priority: Minor
>
> Running tests on Java 8. All unit tests for branch 0.98 pass. On master 
> branch some variation in the security API is causing a failure in 
> TestSSLHttpServer:
> {noformat}
> Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.181 sec <<< 
> FAILURE! - in org.apache.hadoop.hbase.http.TestSSLHttpServer
> org.apache.hadoop.hbase.http.TestSSLHttpServer  Time elapsed: 0.181 sec  <<< 
> ERROR!
> java.security.cert.CertificateException: Subject class type invalid.
>   at sun.security.x509.X509CertInfo.setSubject(X509CertInfo.java:888)
>   at sun.security.x509.X509CertInfo.set(X509CertInfo.java:415)
>   at 
> org.apache.hadoop.hbase.http.ssl.KeyStoreTestUtil.generateCertificate(KeyStoreTestUtil.java:94)
>   at 
> org.apache.hadoop.hbase.http.ssl.KeyStoreTestUtil.setupSSLConfig(KeyStoreTestUtil.java:246)
>   at 
> org.apache.hadoop.hbase.http.TestSSLHttpServer.setup(TestSSLHttpServer.java:72)
> org.apache.hadoop.hbase.http.TestSSLHttpServer  Time elapsed: 0.181 sec  <<< 
> ERROR!
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hbase.http.TestSSLHttpServer.cleanup(TestSSLHttpServer.java:100)
> Tests in error: 
>   TestSSLHttpServer.setup:72 » Certificate Subject class type invalid.
>   TestSSLHttpServer.cleanup:100 NullPointer
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183342#comment-14183342
 ] 

Sean Busbey commented on HBASE-12324:
-

{quote}
The only issue I see with TS is if old data come late. But in those cases, the 
data will get deleted later, which seems the same as running major compaction late.
{quote}

It's actually worse than that, because the clock could adjust and we could have 
a file timestamp that is older than the cell timestamps within it. That would 
result in deleting data that isn't yet expired. (presuming the timestamp will 
be set based on when the server calls close())

{quote}
Do you mean to say that every file will have the latest timestamp of any cell in 
it, and we could use that TS to identify files to delete instead of looking at 
the file timestamp? That sounds interesting.
{quote}

Yes, exactly: we use protobufs and have a bunch of padded space in the fixed 
trailer so that we can make optimizations without having to increment the file 
version. We already track some other cell stats as we write a file, so it seems 
like adding info about the timestamps inside the file should be straightforward.
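A hedged sketch of the tracking side of that idea, with illustrative types (not the real HFile writer): keep the newest cell timestamp up to date on every append and persist it alongside the file on close.

{code}
import java.util.HashMap;
import java.util.Map;

// Illustration only: track the newest cell timestamp while writing a file and
// record it as file metadata (in HBase this would live in the trailer/file info).
public final class MaxTimestampTrackerSketch {

  static final class TrackingWriter {
    private long maxCellTimestampMs = Long.MIN_VALUE;
    private final Map<String, String> fileMetadata = new HashMap<>();

    void append(byte[] row, long cellTimestampMs, byte[] value) {
      // ... write the cell itself ...
      if (cellTimestampMs > maxCellTimestampMs) {
        maxCellTimestampMs = cellTimestampMs;   // cheap to maintain per append
      }
    }

    /** Called on close: persist the stat so expiry checks need not trust file mtime. */
    Map<String, String> close() {
      fileMetadata.put("MAX_CELL_TIMESTAMP", Long.toString(maxCellTimestampMs));
      return fileMetadata;
    }
  }

  public static void main(String[] args) {
    TrackingWriter w = new TrackingWriter();
    w.append("row1".getBytes(), 1000L, "v1".getBytes());
    w.append("row2".getBytes(), 5000L, "v2".getBytes());
    System.out.println(w.close()); // {MAX_CELL_TIMESTAMP=5000}
  }
}
{code}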

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for short period of time (few days)
> On very high volume systems, major compactions become very costly and 
> slowdown ingestion rates.
> In all such use cases (immutable data, high write rate and moderate read 
> rates and shorter ttl), avoiding any compactions and just deleting old data 
> brings lot of performance benefits.
> We should have a compaction policy that can only delete/archive files older 
> than TTL and not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Sheetal Dolas (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183343#comment-14183343
 ] 

Sheetal Dolas commented on HBASE-12324:
---

So the removeUnneededFiles method of HStore could probably be modified to check 
for the trailer TS and use it to decide which files to delete, falling back on 
the file TS when it is not present.
This way it will be compatible with older data as well.
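A small sketch of that fallback, with an illustrative metadata key (not the real HStore code): prefer the recorded max cell timestamp when the file carries it, otherwise use the file's modification time so files written before the new stat existed still expire.

{code}
import java.util.Collections;
import java.util.Map;

// Illustration only: expiry decision that prefers per-file metadata and falls
// back to the file modification time for older files.
public final class ExpiryFallbackSketch {

  static boolean isExpired(Map<String, String> fileMetadata, long fileModificationTimeMs,
                           long ttlMs, long nowMs) {
    String recorded = fileMetadata.get("MAX_CELL_TIMESTAMP");
    long newestDataMs = recorded != null
        ? Long.parseLong(recorded)      // trust the per-file stat when present
        : fileModificationTimeMs;       // compatibility path for older files
    return nowMs - newestDataMs > ttlMs;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long ttl = 3L * 24 * 60 * 60 * 1000;
    Map<String, String> withStat = Collections.singletonMap(
        "MAX_CELL_TIMESTAMP", Long.toString(now - 4L * 24 * 60 * 60 * 1000));
    System.out.println(isExpired(withStat, now, ttl, now));                                   // true
    System.out.println(isExpired(Collections.<String, String>emptyMap(), now - 1000L, ttl, now)); // false
  }
}
{code}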

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for a short period of time (a few days).
> On very high volume systems, major compactions become very costly and 
> slow down ingestion rates.
> In all such use cases (immutable data, high write rate, moderate read 
> rates, and short TTL), avoiding compactions and just deleting old data 
> brings a lot of performance benefits.
> We should have a compaction policy that only deletes/archives files older 
> than the TTL and does not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-12340) Fix release audit warning in master

2014-10-24 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark resolved HBASE-12340.
---
Resolution: Invalid

Seems like stack already got to this last night.

> Fix release audit warning in master
> ---
>
> Key: HBASE-12340
> URL: https://issues.apache.org/jira/browse/HBASE-12340
> Project: HBase
>  Issue Type: Bug
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Sheetal Dolas (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183337#comment-14183337
 ] 

Sheetal Dolas commented on HBASE-12324:
---

The only issue I see with TS is if old data comes late. But in those cases, the 
data will get deleted later, which seems the same as running a major compaction late.

Do you mean to say that every file will have the latest timestamp of any cell in 
it, and we could use that TS to identify files to delete instead of looking at the 
file timestamp? That sounds interesting.

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for a short period of time (a few days).
> On very high volume systems, major compactions become very costly and 
> slow down ingestion rates.
> In all such use cases (immutable data, high write rate, moderate read 
> rates, and short TTL), avoiding compactions and just deleting old data 
> brings a lot of performance benefits.
> We should have a compaction policy that only deletes/archives files older 
> than the TTL and does not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183314#comment-14183314
 ] 

stack commented on HBASE-7912:
--

Moved out of hbase 1.0.  No progress.

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Richard Ding
> Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We are leveraging the existing hbase snapshot feature and providing a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and exportsnapshot to move data to another cluster; 
> the incremental backup uses an offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and flush to improve performance; other 
> added-on value includes convert, merge, progress reporting, and CLI commands, so 
> that a common user can back up hbase data without in-depth knowledge of hbase. 
>  Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. We 
> plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup*: provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one(like merge weekly into monthly)
> * *add and remove* table to and from Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a massive 
> amount of data.  The combination of full backups with incremental backups has 
> tremendous benefits for HBase as well.  The following is a typical scenario 
> for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CF to backup set without interruption 
> of incremental backup schedule.
> * Support rollup/combining of incremental backups into longer period and 
> bigger incremental backups.
> * Unified command line interface for all the above.
> The solution will support HBase backup to FileSystem, either on the same 
> cluster or across clusters.  It has the flexibility to support backup to 
> other devices and servers in the future.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2014-10-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7912:
-
Parent Issue: HBASE-12342  (was: HBASE-10856)

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Richard Ding
> Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We are leveraging the existing hbase snapshot feature and providing a general 
> solution for common users. Our full backup uses snapshots to capture 
> metadata locally and exportsnapshot to move data to another cluster; 
> the incremental backup uses an offline WALPlayer to back up HLogs; we also 
> leverage distributed log roll and flush to improve performance; other 
> added-on value includes convert, merge, progress reporting, and CLI commands, so 
> that a common user can back up hbase data without in-depth knowledge of hbase. 
>  Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. We 
> plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup*: provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one(like merge weekly into monthly)
> * *add and remove* table to and from Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a massive 
> amount of data.  The combination of full backups with incremental backups has 
> tremendous benefits for HBase as well.  The following is a typical scenario 
> for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CF to backup set without interruption 
> of incremental backup schedule.
> * Support rollup/combining of incremental backups into longer period and 
> bigger incremental backups.
> * Unified command line interface for all the above.
> The solution will support HBase backup to FileSystem, either on the same 
> cluster or across clusters.  It has the flexibility to support backup to 
> other devices and servers in the future.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183311#comment-14183311
 ] 

Sean Busbey commented on HBASE-12324:
-

Relying on the file timestamp seems problematic.

Do you have a specific concern with adding the optional trailer item about the 
cell timestamps present in the file? It would allow us to generalize that part 
of the optimization to other policies.

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for a short period of time (a few days).
> On very high volume systems, major compactions become very costly and 
> slow down ingestion rates.
> In all such use cases (immutable data, high write rate, moderate read 
> rates, and short TTL), avoiding compactions and just deleting old data 
> brings a lot of performance benefits.
> We should have a compaction policy that only deletes/archives files older 
> than the TTL and does not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12341) Overhaul logging; log4j2, machine-readable, etc.

2014-10-24 Thread stack (JIRA)
stack created HBASE-12341:
-

 Summary: Overhaul logging; log4j2, machine-readable, etc.
 Key: HBASE-12341
 URL: https://issues.apache.org/jira/browse/HBASE-12341
 Project: HBase
  Issue Type: Umbrella
Reporter: stack
 Fix For: 2.0.0


This is a general umbrella issue for 2.x logging improvements. Hang related 
work off this one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12342) HBase Backup Options

2014-10-24 Thread stack (JIRA)
stack created HBASE-12342:
-

 Summary: HBase Backup Options
 Key: HBASE-12342
 URL: https://issues.apache.org/jira/browse/HBASE-12342
 Project: HBase
  Issue Type: Umbrella
Reporter: stack


Umbrella issue for hbase backup options.  Hang all related issues off this one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Sheetal Dolas (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183287#comment-14183287
 ] 

Sheetal Dolas commented on HBASE-12324:
---

Sean, Vlad,

Thanks for your inputs.

[~vrodionov], in our case we already had all those params tuned; however, the 
expired data must get deleted. Which utility are you referring to? Can one run 
it while tables are active and data is being ingested?
IMO, adding external utilities is error prone and adds operational overhead, so 
it would be nice if this were inside HBase. Also, as [~busbey] pointed out, 
tuning these parameters needs careful evaluation and niche expertise.

It would be nice if HBase itself could take care of the complexity and make it 
easy for users/operators. I can see multiple use cases, including Open TSDB, 
which need this to be handled elegantly.

Let me add some more details to the use case and proposed solution.
Use case:
* Very high ingest rate.
* Immutable data
* Data life is short (few days)
* Read rates are low to moderate (in comparison to ingest rates)

Issues with the default major compaction (even when compactions are done rarely):
* Lots of data IO just to get expired data out
* No other significant benefit than expired-data deletion

Proposed solution
* During major (or even minor) compactions, do not compact any data
* Just delete files whose timestamp is older than the TTL
* Add a new compaction policy class, say 
"OnlyDeleteExpiredFilesCompactionPolicy", and set these configurations while 
creating the table (see the example after this comment):
'hbase.hstore.defaultengine.compactionpolicy.class' => 
'org.apache.hadoop.hbase.regionserver.compactions.OnlyDeleteExpiredFilesCompactionPolicy',
 'hbase.store.delete.expired.storefile' => 'true' 

Benefits
* Significant reduction in IO during compaction
* Automatically get rid of expired data

Assumptions and applicability
* TTL is defined at the table level or for all CFs in the table
* Cells use the system timestamp for versioning, or if overwritten, the 
overwritten timestamp is close to the system timestamp

Attached is the proposed compaction policy. It appears trivially simple. Thoughts?
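
For illustration, a table could be created with those settings along these lines; 
this is only a sketch that assumes the attached policy class is on the region 
servers' classpath and uses the 0.98-era client API, with made-up table and 
family names:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateTtlOnlyTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("events"));
    // Point the default store engine at the attached policy and let expired files be dropped.
    desc.setConfiguration("hbase.hstore.defaultengine.compactionpolicy.class",
        "org.apache.hadoop.hbase.regionserver.compactions.OnlyDeleteExpiredFilesCompactionPolicy");
    desc.setConfiguration("hbase.store.delete.expired.storefile", "true");
    HColumnDescriptor cf = new HColumnDescriptor("d");
    cf.setTimeToLive(3 * 24 * 60 * 60);  // e.g. a three-day TTL, in seconds
    desc.addFamily(cf);
    try (HBaseAdmin admin = new HBaseAdmin(conf)) {
      admin.createTable(desc);
    }
  }
}
{code}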


> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for a short period of time (a few days).
> On very high volume systems, major compactions become very costly and 
> slow down ingestion rates.
> In all such use cases (immutable data, high write rate, moderate read 
> rates, and short TTL), avoiding compactions and just deleting old data 
> brings a lot of performance benefits.
> We should have a compaction policy that only deletes/archives files older 
> than the TTL and does not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Sheetal Dolas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sheetal Dolas updated HBASE-12324:
--
Attachment: OnlyDeleteExpiredFilesCompactionPolicy.java

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
> Attachments: OnlyDeleteExpiredFilesCompactionPolicy.java
>
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for a short period of time (a few days).
> On very high volume systems, major compactions become very costly and 
> slow down ingestion rates.
> In all such use cases (immutable data, high write rate, moderate read 
> rates, and short TTL), avoiding compactions and just deleting old data 
> brings a lot of performance benefits.
> We should have a compaction policy that only deletes/archives files older 
> than the TTL and does not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10092) Move up on to log4j2

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183286#comment-14183286
 ] 

stack commented on HBASE-10092:
---

I moved this out of hbase 1.0.  Move it back if a viable patch lands in time.

> Move up on to log4j2
> 
>
> Key: HBASE-10092
> URL: https://issues.apache.org/jira/browse/HBASE-10092
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: Alex Newman
> Fix For: 2.0.0
>
> Attachments: 10092.txt, 10092v2.txt, HBASE-10092-preview-v0.patch, 
> HBASE-10092.patch
>
>
> Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
> This rather radical transition can be done w/ minor change given they have an 
> adapter for apache's logging, the one we use.  They also have an adapter for 
> slf4j, so we likely can remove at least some of the 4 versions of this module 
> our dependencies make use of.
> I made a start in the attached patch but am currently stuck in maven dependency 
> resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
> a good net connection, an item I currently lack.  Other TODOs are that we will 
> need to fix our little log level setting jsp page -- will likely have to undo 
> our use of hadoop's tool here -- and the config system changes a little.
> I will return to this project soon.  Will bring numbers.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-10092) Move up on to log4j2

2014-10-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10092:
--
Parent Issue: HBASE-12341  (was: HBASE-10856)

> Move up on to log4j2
> 
>
> Key: HBASE-10092
> URL: https://issues.apache.org/jira/browse/HBASE-10092
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: Alex Newman
> Fix For: 2.0.0
>
> Attachments: 10092.txt, 10092v2.txt, HBASE-10092-preview-v0.patch, 
> HBASE-10092.patch
>
>
> Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
> This rather radical transition can be done w/ minor change given they have an 
> adapter for apache's logging, the one we use.  They also have an adapter for 
> slf4j, so we likely can remove at least some of the 4 versions of this module 
> our dependencies make use of.
> I made a start in the attached patch but am currently stuck in maven dependency 
> resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
> a good net connection, an item I currently lack.  Other TODOs are that we will 
> need to fix our little log level setting jsp page -- will likely have to undo 
> our use of hadoop's tool here -- and the config system changes a little.
> I will return to this project soon.  Will bring numbers.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12338) Client side scanning prefetching.

2014-10-24 Thread Yi Deng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Deng updated HBASE-12338:

Attachment: 0001-Add-ScanPrefetcher-for-client-side-scanning-prefetch.patch

Addressed [~tedyu]'s and [~stack]'s comments and did some small refactoring.

> Client side scanning prefetching.
> -
>
> Key: HBASE-12338
> URL: https://issues.apache.org/jira/browse/HBASE-12338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 1.0.0, 2.0.0, 0.98.6.1
>Reporter: Yi Deng
>Assignee: Yi Deng
>  Labels: prefetch, results, scanner
> Attachments: 
> 0001-Add-ScanPrefetcher-for-client-side-scanning-prefetch.patch, 
> 0001-ScanPrefetcher.patch
>
>
> Since server-side prefetching did not prove to be a good way to prefetch, we 
> need to do it on the client side.
> This is a wrapper class that takes any instance of `ResultScanner` as the 
> underlying scanning component. The class will schedule the scanning in a 
> background thread. There is a buffering queue storing prefetched results, 
> whose length is configurable. The prefetcher will release the thread if the 
> queue is full and wait for results to be consumed.
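
For readers skimming the thread, a minimal sketch of what such a wrapper could 
look like follows; it is illustrative only (the attached patch is the 
authoritative implementation and its names will differ), and error handling is 
elided:

{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;

// Illustrative sketch only, not the attached ScanPrefetcher.
public class PrefetchingScanner implements AutoCloseable {
  private final ResultScanner delegate;
  private final BlockingQueue<Result> buffer;   // bounded prefetch queue
  private final Thread fetcher;
  private volatile boolean finished;            // set once the delegate is exhausted
  private volatile boolean closed;

  public PrefetchingScanner(ResultScanner scanner, int bufferSize) {
    this.delegate = scanner;
    this.buffer = new ArrayBlockingQueue<Result>(bufferSize);
    this.fetcher = new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          Result r;
          while (!closed && (r = delegate.next()) != null) {
            buffer.put(r);                      // blocks while the queue is full
          }
        } catch (Exception e) {
          // a real implementation would surface the failure to the caller
        } finally {
          finished = true;
        }
      }
    }, "scan-prefetcher");
    this.fetcher.setDaemon(true);
    this.fetcher.start();
  }

  /** Returns the next prefetched Result, or null once the scan is exhausted. */
  public Result next() throws InterruptedException {
    while (true) {
      Result r = buffer.poll(50, TimeUnit.MILLISECONDS);
      if (r != null) {
        return r;
      }
      if (finished && buffer.isEmpty()) {
        return null;
      }
    }
  }

  @Override
  public void close() {
    closed = true;
    fetcher.interrupt();
    delegate.close();
  }
}
{code}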



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11179) API parity between mapred and mapreduce

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183275#comment-14183275
 ] 

stack commented on HBASE-11179:
---

[~ndimiduk] Ok, I moved this out as an hbase 1.0 subtask. It is marked beginner 
and has no one assigned.

> API parity between mapred and mapreduce
> ---
>
> Key: HBASE-11179
> URL: https://issues.apache.org/jira/browse/HBASE-11179
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce
>Reporter: Nick Dimiduk
>  Labels: beginner
> Fix For: 0.99.2
>
>
> This ticket is for bringing the mapred package up to feature parity with 
> mapreduce. Might become an umbrella ticket in and of itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12340) Fix release audit warning in master

2014-10-24 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-12340:
-

 Summary: Fix release audit warning in master
 Key: HBASE-12340
 URL: https://issues.apache.org/jira/browse/HBASE-12340
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12313) Redo the hfile index length optimization so cell-based rather than serialized KV key

2014-10-24 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183266#comment-14183266
 ] 

Anoop Sam John commented on HBASE-12313:


Took a quick pass over the patch; it looks great. Will have a closer look at the new 
Cell-based mid-key calc.



> Redo the hfile index length optimization so cell-based rather than serialized 
> KV key
> 
>
> Key: HBASE-12313
> URL: https://issues.apache.org/jira/browse/HBASE-12313
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: stack
>Assignee: stack
> Attachments: 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt
>
>
> Trying to remove the API that returns the 'key' of a KV serialized into a byte 
> array is thorny.
> I tried to move over the first and last key serializations and the hfile 
> index entries to be Cell-based, but the patch was turning massive.  Here is a 
> smaller patch that just redoes the optimization that tries to find 'short' 
> midpoints between the last key of the last block and the first key of the next 
> block, so it is Cell-based rather than byte array based (presuming Keys are 
> serialized in a certain way).  Adds unit tests which we didn't have before.
> Also removes CellKey.  Not needed... at least not yet.  It's just a utility for 
> toString.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183202#comment-14183202
 ] 

stack commented on HBASE-12285:
---

As [~dimaspivak] noticed, we just had our first blue branch-1 build. The only 
change is the move to WARN and the commit of the bulk load Interface patch.  
Rerunning.

> Builds are failing, possibly because of SUREFIRE-1091
> -
>
> Key: HBASE-12285
> URL: https://issues.apache.org/jira/browse/HBASE-12285
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>Priority: Blocker
> Attachments: HBASE-12285_branch-1_v1.patch
>
>
> Our branch-1 builds on builds.apache.org have been failing in recent days 
> after we switched over to an official version of Surefire a few days back 
> (HBASE-4955). The version we're using, 2.17, is hit by a bug 
> ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
> in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183188#comment-14183188
 ] 

Sean Busbey commented on HBASE-12324:
-

Sure, but I'd rather we have an optimization in place that can improve this 
workload without requiring niche tuning and special operational handling, 
especially if these datasets need to live in an hbase cluster with others that 
don't share the same properties.

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for a short period of time (a few days).
> On very high volume systems, major compactions become very costly and 
> slow down ingestion rates.
> In all such use cases (immutable data, high write rate, moderate read 
> rates, and short TTL), avoiding compactions and just deleting old data 
> brings a lot of performance benefits.
> We should have a compaction policy that only deletes/archives files older 
> than the TTL and does not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12277) Refactor bulkLoad methods in AccessController to its own interface

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183130#comment-14183130
 ] 

Hudson commented on HBASE-12277:


SUCCESS: Integrated in HBase-1.0 #352 (See 
[https://builds.apache.org/job/HBase-1.0/352/])
HBASE-12277 Refactor bulkLoad methods in AccessController to its own interface 
(Madhan Neethiraj) (stack: rev ceffa3c48dc0809e3eb3fd77b99b6104873b4f59)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/SecureBulkLoadEndpoint.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BulkLoadObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java


> Refactor bulkLoad methods in AccessController to its own interface
> --
>
> Key: HBASE-12277
> URL: https://issues.apache.org/jira/browse/HBASE-12277
> Project: HBase
>  Issue Type: Bug
>Reporter: Madhan Neethiraj
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 
> 0001-HBASE-12277-Refactored-bulk-load-methods-from-Access.patch, 
> 0002-HBASE-12277-License-text-added-to-the-newly-created-.patch, 
> HBASE-12277-v2.patch, HBASE-12277-v3.patch, HBASE-12277-v4.patch, 
> HBASE-12277.patch
>
>
> SecureBulkLoadEndPoint references a couple of methods, prePrepareBulkLoad() and 
> preCleanupBulkLoad(), implemented in AccessController, i.e. there is direct 
> coupling between the AccessController and SecureBulkLoadEndPoint classes.
> SecureBulkLoadEndPoint assumes the presence of AccessController in a 
> secure cluster. If HBase is configured with another coprocessor for 
> access control, SecureBulkLoadEndPoint fails with an NPE.
> To remove this direct coupling, bulk-load related methods in AccessController 
> should be refactored into an interface, and AccessController should implement 
> this interface. SecureBulkLoadEndPoint should then look for coprocessors 
> that implement this interface, instead of directly looking for 
> AccessController.
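
As a rough illustration of the decoupling pattern described above (the names here 
are hypothetical; the actual patch introduces BulkLoadObserver and wires it 
through CoprocessorHost):

{code}
import java.io.IOException;
import java.util.List;

// Hypothetical sketch of the lookup pattern only; not code from the patch.
interface BulkLoadChecks {
  void prePrepareBulkLoad(String table) throws IOException;
  void preCleanupBulkLoad(String table) throws IOException;
}

class EndpointSketch {
  // Instead of casting to a concrete AccessController, scan the loaded
  // coprocessors and invoke every one that implements the narrow interface.
  static void firePrePrepareBulkLoad(List<Object> loadedCoprocessors, String table)
      throws IOException {
    for (Object cp : loadedCoprocessors) {
      if (cp instanceof BulkLoadChecks) {
        ((BulkLoadChecks) cp).prePrepareBulkLoad(table);
      }
    }
  }
}
{code}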



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183129#comment-14183129
 ] 

Hudson commented on HBASE-12285:


SUCCESS: Integrated in HBase-1.0 #352 (See 
[https://builds.apache.org/job/HBase-1.0/352/])
HBASE-12285 Builds are failing, possibly because of SUREFIRE-1091 -- trying 
WARN level to see if it makes any difference (stack: rev 
b8ed37b88e515efad2958f9b8d92e93fe6e922ad)
* hbase-server/src/test/resources/log4j.properties


> Builds are failing, possibly because of SUREFIRE-1091
> -
>
> Key: HBASE-12285
> URL: https://issues.apache.org/jira/browse/HBASE-12285
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>Priority: Blocker
> Attachments: HBASE-12285_branch-1_v1.patch
>
>
> Our branch-1 builds on builds.apache.org have been failing in recent days 
> after we switched over to an official version of Surefire a few days back 
> (HBASE-4955). The version we're using, 2.17, is hit by a bug 
> ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
> in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12324) Improve compaction speed and process for immutable short lived datasets

2014-10-24 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183131#comment-14183131
 ] 

Vladimir Rodionov commented on HBASE-12324:
---

Sean,

You can effectively disable compaction by setting the following config:
{code}
conf.setLong("hbase.hregion.max.filesize", Long.MAX_VALUE);
conf.setLong("hbase.hregion.memstore.flush.size", FLUSH_SIZE);
conf.setInt("hbase.hstore.compactionThreshold", Integer.MAX_VALUE);
conf.setInt("hbase.hstore.blockingStoreFiles", Integer.MAX_VALUE); 
conf.setInt("hbase.hstore.compaction.min", Integer.MAX_VALUE);
conf.setInt("hbase.hstore.compaction.max", Integer.MAX_VALUE);
{code}

* If you do not need compaction, you can have only a few (even one) regions per 
server
* Make sure to pre-split your table (a sketch is below)
* Periodically run a utility that purges/archives the oldest HFiles
* FLUSH_SIZE should be large enough but not that extreme. Because you can 
afford to host very few regions per RS, your flush size can be quite large.
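
A sketch of the pre-split step under the same assumptions (0.98-era client API; 
the table name, family, and split points are made-up examples):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("events"));
    desc.addFamily(new HColumnDescriptor("d"));
    // A small, fixed set of explicit split points keeps the region count low,
    // so each region server hosts only a few (large) regions.
    byte[][] splits = new byte[][] {
        Bytes.toBytes("2"), Bytes.toBytes("4"), Bytes.toBytes("6"), Bytes.toBytes("8")
    };
    try (HBaseAdmin admin = new HBaseAdmin(conf)) {
      admin.createTable(desc, splits);
    }
  }
}
{code}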

 

> Improve compaction speed and process for immutable short lived datasets
> ---
>
> Key: HBASE-12324
> URL: https://issues.apache.org/jira/browse/HBASE-12324
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Sheetal Dolas
>
> We have seen multiple cases where HBase is used to store immutable data and 
> the data lives for a short period of time (a few days).
> On very high volume systems, major compactions become very costly and 
> slow down ingestion rates.
> In all such use cases (immutable data, high write rate, moderate read 
> rates, and short TTL), avoiding compactions and just deleting old data 
> brings a lot of performance benefits.
> We should have a compaction policy that only deletes/archives files older 
> than the TTL and does not compact any files.
> Also attaching a patch that can do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HBASE-12339) WAL performance evaluation tool doesn't roll logs

2014-10-24 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-12339 started by Sean Busbey.
---
> WAL performance evaluation tool doesn't roll logs
> -
>
> Key: HBASE-12339
> URL: https://issues.apache.org/jira/browse/HBASE-12339
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.99.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
> Fix For: 0.99.2
>
>
> The perf eval tool for the wal never sets up a log roller and instead used to 
> just call the rollWriter method directly.
> Eventually it changed to call requestLogRoll instead of attempting to do the 
> roll itself. requestLogRoll is the same method used internally by the wal code, 
> and it relies on there being a LogRoller to actually have rolls happen. (The 
> method just notifies all of the listeners for the wal that one of them should 
> call the roll method.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12338) Client side scanning prefetching.

2014-10-24 Thread Yi Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183093#comment-14183093
 ] 

Yi Deng commented on HBASE-12338:
-

https://reviews.facebook.net/D25617

> Client side scanning prefetching.
> -
>
> Key: HBASE-12338
> URL: https://issues.apache.org/jira/browse/HBASE-12338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 1.0.0, 2.0.0, 0.98.6.1
>Reporter: Yi Deng
>Assignee: Yi Deng
>  Labels: prefetch, results, scanner
> Attachments: 0001-ScanPrefetcher.patch
>
>
> Since server-side prefetching did not prove to be a good way to prefetch, we 
> need to do it on the client side.
> This is a wrapper class that takes any instance of `ResultScanner` as the 
> underlying scanning component. The class will schedule the scanning in a 
> background thread. There is a buffering queue storing prefetched results, 
> whose length is configurable. The prefetcher will release the thread if the 
> queue is full and wait for results to be consumed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12338) Client side scanning prefetching.

2014-10-24 Thread Yi Deng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Deng updated HBASE-12338:

Attachment: 0001-ScanPrefetcher.patch

> Client side scanning prefetching.
> -
>
> Key: HBASE-12338
> URL: https://issues.apache.org/jira/browse/HBASE-12338
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Affects Versions: 1.0.0, 2.0.0, 0.98.6.1
>Reporter: Yi Deng
>Assignee: Yi Deng
>  Labels: prefetch, results, scanner
> Attachments: 0001-ScanPrefetcher.patch
>
>
> Since server-side prefetching did not prove to be a good way to prefetch, we 
> need to do it on the client side.
> This is a wrapper class that takes any instance of `ResultScanner` as the 
> underlying scanning component. The class will schedule the scanning in a 
> background thread. There is a buffering queue storing prefetched results, 
> whose length is configurable. The prefetcher will release the thread if the 
> queue is full and wait for results to be consumed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12339) WAL performance evaluation tool doesn't roll logs

2014-10-24 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183070#comment-14183070
 ] 

Sean Busbey commented on HBASE-12339:
-

One impact is that right now, attempts to use the tool on branch-1+ for 
non-trivial workloads result in a single multi-gigabyte log file, missing the 
impact of pipeline setup from rolling in the measurements, and preventing the 
wal from doing some of its error recovery.

> WAL performance evaluation tool doesn't roll logs
> -
>
> Key: HBASE-12339
> URL: https://issues.apache.org/jira/browse/HBASE-12339
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.99.0
>Reporter: Sean Busbey
>Assignee: Sean Busbey
> Fix For: 0.99.2
>
>
> The perf eval tool for the wal never sets up a log roller and instead used to 
> just call the rollWriter method directly.
> Eventually it changed to call requestLogRoll instead of attempting to do the 
> roll itself. requestLogRoll is the same method used internally by the wal code, 
> and it relies on there being a LogRoller to actually have rolls happen. (The 
> method just notifies all of the listeners for the wal that one of them should 
> call the roll method.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12339) WAL performance evaluation tool doesn't roll logs

2014-10-24 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-12339:
---

 Summary: WAL performance evaluation tool doesn't roll logs
 Key: HBASE-12339
 URL: https://issues.apache.org/jira/browse/HBASE-12339
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.99.0
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 0.99.2


The perf eval tool for the wal never sets up a log roller and instead used to 
just call the rollWriter method directly.

Eventually it changed to call requestLogRoll instead of attempting to do the 
roll itself. requestLogRoll is the same method used internally by the wal code, 
and it relies on there being a LogRoller to actually have rolls happen. (The 
method just notifies all of the listeners for the wal that one of them should 
call the roll method.)
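
To spell out the mechanism, a toy sketch of the notify-only behavior (a 
hypothetical simplification of the wal listener wiring, not the actual classes):

{code}
import java.util.ArrayList;
import java.util.List;

// Toy sketch: a roll *request* only notifies listeners; with no roller
// registered, nothing ever calls the actual roll.
interface RollListener {
  void logRollRequested();
}

class WalSketch {
  private final List<RollListener> listeners = new ArrayList<RollListener>();

  void registerListener(RollListener l) {
    listeners.add(l);
  }

  void requestLogRoll() {              // what the perf eval tool ends up calling
    for (RollListener l : listeners) {
      l.logRollRequested();            // a LogRoller would react by rolling the writer
    }
  }
}
{code}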



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12277) Refactor bulkLoad methods in AccessController to its own interface

2014-10-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183066#comment-14183066
 ] 

Hudson commented on HBASE-12277:


SUCCESS: Integrated in HBase-TRUNK #5696 (See 
[https://builds.apache.org/job/HBase-TRUNK/5696/])
HBASE-12277 Refactor bulkLoad methods in AccessController to its own interface 
(Madhan Neethiraj) (stack: rev 2916d4f3568184f92006dba9a1e4ef18492643ea)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/SecureBulkLoadEndpoint.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BulkLoadObserver.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java


> Refactor bulkLoad methods in AccessController to its own interface
> --
>
> Key: HBASE-12277
> URL: https://issues.apache.org/jira/browse/HBASE-12277
> Project: HBase
>  Issue Type: Bug
>Reporter: Madhan Neethiraj
> Fix For: 2.0.0, 0.99.2
>
> Attachments: 
> 0001-HBASE-12277-Refactored-bulk-load-methods-from-Access.patch, 
> 0002-HBASE-12277-License-text-added-to-the-newly-created-.patch, 
> HBASE-12277-v2.patch, HBASE-12277-v3.patch, HBASE-12277-v4.patch, 
> HBASE-12277.patch
>
>
> SecureBulkLoadEndPoint references a couple of methods, prePrepareBulkLoad() and 
> preCleanupBulkLoad(), implemented in AccessController, i.e. there is direct 
> coupling between the AccessController and SecureBulkLoadEndPoint classes.
> SecureBulkLoadEndPoint assumes the presence of AccessController in a 
> secure cluster. If HBase is configured with another coprocessor for 
> access control, SecureBulkLoadEndPoint fails with an NPE.
> To remove this direct coupling, bulk-load related methods in AccessController 
> should be refactored into an interface, and AccessController should implement 
> this interface. SecureBulkLoadEndPoint should then look for coprocessors 
> that implement this interface, instead of directly looking for 
> AccessController.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12262) Make it possible to run hbase-it against a distributed cluster using Maven

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183057#comment-14183057
 ] 

stack commented on HBASE-12262:
---

Given how much you fellows all love maven messing, going the [~eclark] route 
probably makes the most sense.  Let's be sure to update the refguide if we go this 
way.

> Make it possible to run hbase-it against a distributed cluster using Maven
> --
>
> Key: HBASE-12262
> URL: https://issues.apache.org/jira/browse/HBASE-12262
> Project: HBase
>  Issue Type: Task
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>
> hbase-it is pretty awesome, especially when run against a distributed 
> cluster. As I've been trying to develop new tests to add to the module, I got 
> sick of having to run {{mvn package}} and moving JARs to my cluster and 
> wanted to run the tests directly using {{mvn verify}}. Unfortunately, I can't 
> seem to do so; this JIRA is to try to fix that.
> [~enis] and [~ndimiduk], I've seen that you guys are behind a lot of these 
> tests. Was it previously possible to do what I'm trying to do and it just 
> broke at some point, or is this new functionality that I'll need to figure 
> out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183021#comment-14183021
 ] 

stack commented on HBASE-12285:
---

[~dimaspivak] Why does branch-1 fail but trunk usually pass when both use the same 
surefire, do you think?

> Builds are failing, possibly because of SUREFIRE-1091
> -
>
> Key: HBASE-12285
> URL: https://issues.apache.org/jira/browse/HBASE-12285
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>Priority: Blocker
> Attachments: HBASE-12285_branch-1_v1.patch
>
>
> Our branch-1 builds on builds.apache.org have been failing in recent days 
> after we switched over to an official version of Surefire a few days back 
> (HBASE-4955). The version we're using, 2.17, is hit by a bug 
> ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
> in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183019#comment-14183019
 ] 

stack commented on HBASE-11368:
---

Is this meant to be in the patch?

1449LOG.info("###compaction get the closelock, sleep 20s to simulate 
slow compaction");
1450try {
1451  Thread.sleep(2);
1452} catch (InterruptedException e) {
1453  LOG.info("###sleep interrupted");
1454}

What change did you do [~tianq]?

> Multi-column family BulkLoad fails if compactions go on too long
> 
>
> Key: HBASE-11368
> URL: https://issues.apache.org/jira/browse/HBASE-11368
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Qiang Tian
> Attachments: hbase-11368-0.98.5.patch, 
> performance_improvement_verification_98.5.patch
>
>
> Compactions take a read lock.  For a multi-column family region, before bulk 
> loading, we want to take a write lock on the region.  If the compaction takes 
> too long, the bulk load fails.
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in 
> HBASE-10882 as a work around.
> Does the compaction need a read lock for that long?  Does the bulk load need 
> a full write lock when multiple column families?  Can we fail more gracefully 
> at least?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12285) Builds are failing, possibly because of SUREFIRE-1091

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183008#comment-14183008
 ] 

stack commented on HBASE-12285:
---

[~dimaspivak]
I just committed a change that sets logging to WARN to see if it makes a 
difference.

[~eclark] Do you think we should move to a snapshot of surefire -- like we did in 
the old days?


> Builds are failing, possibly because of SUREFIRE-1091
> -
>
> Key: HBASE-12285
> URL: https://issues.apache.org/jira/browse/HBASE-12285
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>Priority: Blocker
> Attachments: HBASE-12285_branch-1_v1.patch
>
>
> Our branch-1 builds on builds.apache.org have been failing in recent days 
> after we switched over to an official version of Surefire a few days back 
> (HBASE-4955). The version we're using, 2.17, is hit by a bug 
> ([SUREFIRE-1091|https://jira.codehaus.org/browse/SUREFIRE-1091]) that results 
> in an IOException, which looks like what we're seeing on Jenkins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

