[jira] [Comment Edited] (HBASE-14383) Compaction improvements

2015-09-15 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746597#comment-14746597
 ] 

Enis Soztutar edited comment on HBASE-14383 at 9/16/15 12:48 AM:
-

bq. Can we retire hbase.regionserver.maxlogs?
I am in favor of that, or of keeping it as a safety net but with a much 
higher default (128?). 

With the default settings
{code}
hbase.regionserver.maxlogs=32
hbase.regionserver.hlog.blocksize=128MB
hbase.regionserver.logroll.multiplier=0.95
{code}
we can only have 32 * 128MB * 0.95 ≈ 3.9GB of WAL entries. So, if you are 
running with a 32GB heap and a 0.4 memstore size, most of the memstore space 
is just left unused. 
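Making the implied comparison explicit (using the 32GB heap and the 0.4 
memstore fraction from the example above):
{noformat}
WAL capacity      = 32 logs * 128MB * 0.95 ≈ 3.9GB
memstore capacity = 32GB heap * 0.4        = 12.8GB
{noformat}
so the maxlogs limit forces flushes while memstores are using less than a 
third of the space reserved for them.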

Also, not related to compactions, but I have seen cases where there are not 
enough regions per region server to fill the whole memstore space with the 
128MB flush size: a few active regions and big heaps. We do not allow a 
memstore to grow beyond the flush limit, to guard against long flushes and 
long MTTR. But my feeling is that maybe we could have a dynamically 
adjustable flush size that takes into account a min and a max flush size, and 
delays triggering the flush if there is more space. 
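A minimal sketch of what such a dynamically adjustable flush size might look 
like (illustrative names only, not actual HBase code; just the min/max-bounded, 
space-aware idea from above):
{code}
// Illustrative only: choose a per-region flush threshold between a
// configured min and max, growing it when few active regions share the
// region server's global memstore space.
public class DynamicFlushSize {
  private final long minFlushSize; // e.g. 128MB, today's fixed default
  private final long maxFlushSize; // cap to keep flush time and MTTR bounded

  public DynamicFlushSize(long minFlushSize, long maxFlushSize) {
    this.minFlushSize = minFlushSize;
    this.maxFlushSize = maxFlushSize;
  }

  // globalMemstoreLimit: heap * memstore fraction; activeRegions: regions
  // currently taking writes on this region server.
  public long flushSizeFor(long globalMemstoreLimit, int activeRegions) {
    if (activeRegions <= 0) {
      return minFlushSize;
    }
    // Even share of the global limit per active region, clamped to [min, max].
    long share = globalMemstoreLimit / activeRegions;
    return Math.max(minFlushSize, Math.min(maxFlushSize, share));
  }
}
{code}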



was (Author: enis):
bq. Can we retire hbase.regionserver.maxlogs?
I am in favor of that, or keeping it as a safety net, but with a much higher 
default (128?). 

With default settings
{code}
hbase.regionserver.maxlogs=32
hbase.regionserver.hlog.blocksize=128MB
hbase.regionserver.logroll.multiplier=0.95
{code}
We can only have 32*128*0.95 = 3.9MB of WAL entries. So, if you are running 
with 32GB heap and 0.4 memstore size, the memstore space is just left unused. 

Also, not related to compactions, but I have seen cases where there are not 
enough regions per region server to fill the whole memstore space with the 
128MB flush size, a few active regions and big heaps. We do not allow a 
memstore to grow beyond the flush limit to guard against long flushes and long 
MTTR times. But my feeling is that, maybe we can have a dynamically adjustable 
flush size taking into account a min and max flush size and delay triggering 
the flush if there is more space. 


> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Compactions are still a major issue in many production environments. The 
> general recommendation is to disable region splitting and major compactions 
> to reduce unpredictable IO/CPU spikes, especially during peak times, and to 
> run them manually during off-peak times. This still does not resolve the 
> issues completely.
> h3. Flush storms
> * WAL-rolling events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushes due to the periodic memstore flusher. 
> Both of the above may produce *flush storms*, which are as bad as *compaction 
> storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit the 
> size of a compaction there is a config parameter one could use: 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? A selection with more files always wins; a selection of smaller 
> total size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746749#comment-14746749
 ] 

stack commented on HBASE-14383:
---

I'd be for upping the max logs number (I have seen cases where it ran away up 
into the thousands, so some guard would be good).

> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Compactions are still a major issue in many production environments. The 
> general recommendation is to disable region splitting and major compactions 
> to reduce unpredictable IO/CPU spikes, especially during peak times, and to 
> run them manually during off-peak times. This still does not resolve the 
> issues completely.
> h3. Flush storms
> * WAL-rolling events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushes due to the periodic memstore flusher. 
> Both of the above may produce *flush storms*, which are as bad as *compaction 
> storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit the 
> size of a compaction there is a config parameter one could use: 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? A selection with more files always wins; a selection of smaller 
> total size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14394) Properly close the connection after reading records from table.

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746407#comment-14746407
 ] 

Hudson commented on HBASE-14394:


FAILURE: Integrated in HBase-1.2 #174 (See 
[https://builds.apache.org/job/HBase-1.2/174/])
HBASE-14394 Properly close the connection after reading records from table. 
(ssrungarapu: rev 0d3f0f64a49cbd33f26d7dfd97ec30aa7d78a752)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java


> Properly close the connection after reading records from table.
> ---
>
> Key: HBASE-14394
> URL: https://issues.apache.org/jira/browse/HBASE-14394
> Project: HBase
>  Issue Type: Bug
>Reporter: Srikanth Srungarapu
>Assignee: Srikanth Srungarapu
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: HBASE-14394.patch, HBASE-14394_v2.patch, 
> HBASE-14394_v3.patch, HBASE-14394_v4.patch
>
>
> This was brought to our notice by one of our observant customers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14411) Fix unit test failures when using multiwal as default WAL provider

2015-09-15 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746494#comment-14746494
 ] 

Ted Yu commented on HBASE-14411:


Mind preparing a patch for branch-1?

Thanks, Yu.

> Fix unit test failures when using multiwal as default WAL provider
> --
>
> Key: HBASE-14411
> URL: https://issues.apache.org/jira/browse/HBASE-14411
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14411.patch, HBASE-14411_v2.patch
>
>
> If we set hbase.wal.provider to multiwal in 
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use 
> BoundedRegionGroupingProvider in UT, we will observe the failures below in 
> the current code base:
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> 
> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 
> expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have 
> more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but 
> was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
> Expected: is 
>  but: was 
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; 
> one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've 
> triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if 
> skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the 
> archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594
>  if skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong 
> number of files in the archive log expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testMovedWALDuringRecovery:810->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testRetryOpenDuringRecovery:838->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
> {noformat}
> While the patch for HBASE-14306 could resolve the failures of 
> TestHLogRecordReader, TestReplicationSourceManager and 
> TestReplicationWALReaderManager, this JIRA will focus on resolving the 
> others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14400) Fix HBase RPC protection documentation

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746791#comment-14746791
 ] 

Hudson commented on HBASE-14400:


FAILURE: Integrated in HBase-1.0 #1052 (See 
[https://builds.apache.org/job/HBase-1.0/1052/])
HBASE-14400 Fix HBase RPC protection documentation (apurtell: rev 
7da2a8f3052a9d50693fcde56beca1e1b582f0f0)
* hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslUtil.java
* hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/TestHBaseSaslRpcClient.java
* src/main/asciidoc/_chapters/security.adoc


> Fix HBase RPC protection documentation
> --
>
> Key: HBASE-14400
> URL: https://issues.apache.org/jira/browse/HBASE-14400
> Project: HBase
>  Issue Type: Bug
>  Components: encryption, rpc, security
>Reporter: Apekshit Sharma
>Assignee: Apekshit Sharma
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-14400-branch-0.98.patch, 
> HBASE-14400-branch-1.0.patch, HBASE-14400-branch-1.1.patch, 
> HBASE-14400-branch-1.2.patch, HBASE-14400-master-v2.patch, 
> HBASE-14400-master.patch
>
>
> HBase configuration 'hbase.rpc.protection' can be set to 'authentication', 
> 'integrity' or 'privacy'.
> "authentication means authentication only and no integrity or privacy; 
> integrity implies
> authentication and integrity are enabled; and privacy implies all of
> authentication, integrity and privacy are enabled."
> However, the HBase ref guide incorrectly suggests in some places setting the 
> value to 'auth-conf' instead of 'privacy'. Setting the value to 'auth-conf' 
> doesn't provide RPC encryption, which is what the user wants.
> This jira will fix:
> - documentation: change 'auth-conf' references to 'privacy'
> - SaslUtil to support both sets of values (privacy/integrity/authentication 
> and auth-conf/auth-int/auth) to be backward compatible with what was being 
> suggested till now.
> - change 'hbase.thrift.security.qop' to be consistent with other similar 
> configurations by using the same set of values 
> (privacy/integrity/authentication).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12751) Allow RowLock to be reader writer

2015-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746902#comment-14746902
 ] 

Hadoop QA commented on HBASE-12751:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12756168/12751.v37.txt
  against master branch at commit fe2c4f630d3b5f3346c9ee9f95c256186c9e6907.
  ATTACHMENT ID: 12756168

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 99 new 
or modified tests.

{color:red}-1 Anti-pattern{color}.  The patch appears to 
have anti-pattern where BYTES_COMPARATOR was omitted:
 -getRegionInfo(), -1, new TreeMap());.

{color:red}-1 javac{color}.  The patch appears to cause mvn compile goal to 
fail with Hadoop version 2.4.0.

Compilation errors resume:
[ERROR] COMPILATION ERROR : 
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationWALReaderManager.java:[82,17]
 cannot find symbol
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationWALReaderManager.java:[82,45]
 cannot find symbol
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.2:testCompile 
(default-testCompile) on project hbase-server: Compilation failure: Compilation 
failure:
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationWALReaderManager.java:[82,17]
 cannot find symbol
[ERROR] symbol:   class AtomicLong
[ERROR] location: class 
org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager
[ERROR] 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationWALReaderManager.java:[82,45]
 cannot find symbol
[ERROR] symbol:   class AtomicLong
[ERROR] location: class 
org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hbase-server


Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15615//console

This message is automatically generated.

> Allow RowLock to be reader writer
> -
>
> Key: HBASE-12751
> URL: https://issues.apache.org/jira/browse/HBASE-12751
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 12751.rebased.v25.txt, 12751.rebased.v26.txt, 
> 12751.rebased.v26.txt, 12751.rebased.v27.txt, 12751.rebased.v29.txt, 
> 12751.rebased.v31.txt, 12751.rebased.v32.txt, 12751.rebased.v32.txt, 
> 12751.rebased.v33.txt, 12751.rebased.v34.txt, 12751.rebased.v35.txt, 
> 12751.rebased.v35.txt, 12751.rebased.v35.txt, 12751.v37.txt, 12751.v38.txt, 
> 12751v22.txt, 12751v23.txt, 12751v23.txt, 12751v23.txt, 12751v23.txt, 
> 12751v36.txt, HBASE-12751-v1.patch, HBASE-12751-v10.patch, 
> HBASE-12751-v10.patch, HBASE-12751-v11.patch, HBASE-12751-v12.patch, 
> HBASE-12751-v13.patch, HBASE-12751-v14.patch, HBASE-12751-v15.patch, 
> HBASE-12751-v16.patch, HBASE-12751-v17.patch, HBASE-12751-v18.patch, 
> HBASE-12751-v19 (1).patch, HBASE-12751-v19.patch, HBASE-12751-v2.patch, 
> HBASE-12751-v20.patch, HBASE-12751-v20.patch, HBASE-12751-v21.patch, 
> HBASE-12751-v3.patch, HBASE-12751-v4.patch, HBASE-12751-v5.patch, 
> HBASE-12751-v6.patch, HBASE-12751-v7.patch, HBASE-12751-v8.patch, 
> HBASE-12751-v9.patch, HBASE-12751.patch
>
>
> Right now every write operation grabs a row lock. This is to prevent values 
> from changing during a read modify write operation (increment or check and 
> put). However it limits parallelism in several different scenarios.
> If there are several puts to the same row but different columns or stores 
> then this is very limiting.
> If there are puts to the same column then mvcc number should ensure a 
> consistent ordering. So locking is not needed.
> However locking for check and 

[jira] [Commented] (HBASE-10449) Wrong execution pool configuration in HConnectionManager

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746905#comment-14746905
 ] 

stack commented on HBASE-10449:
---

Thanks [~nkeywal]. Let me commit HBASE-14433. Let's go with fewer threads 
till we do the test that proves we need more. Thanks for the review, boss.

> Wrong execution pool configuration in HConnectionManager
> 
>
> Key: HBASE-10449
> URL: https://issues.apache.org/jira/browse/HBASE-10449
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
>Reporter: Nicolas Liochon
>Assignee: Nicolas Liochon
>Priority: Critical
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBASE-10449.v1.patch
>
>
> There is a confusion in the configuration of the pool. The attached patch 
> fixes this. This may change client performance, as we were previously using 
> a single thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14433) Set down the client executor core thread count from 256 to number of processors

2015-09-15 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14433:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0
   Status: Resolved  (was: Patch Available)

Committing to the master branch only. Let the patch build run with fewer 
threads. We can backport when we can show this change definitively gives 
better performance.

> Set down the client executor core thread count from 256 to number of 
> processors
> ---
>
> Key: HBASE-14433
> URL: https://issues.apache.org/jira/browse/HBASE-14433
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0
>
> Attachments: 14433 (1).txt, 14433.txt, 14433v2.txt, 14433v3.txt, 
> 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt
>
>
> HBASE-10449 upped our core count from 0 to 256 (max is 256). Looking in a 
> recent test-run thread dump, I see up to 256 threads per client, all of them 
> idle. At a minimum it makes reading test thread dumps hard. Trying to learn 
> more about why we went with a core count of 256 over in HBASE-10449. 
> Meantime, will try setting down the configs for tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14439) New/Improved Filesystem Abstractions

2015-09-15 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-14439:

Issue Type: Sub-task  (was: New Feature)
Parent: HBASE-7806

> New/Improved Filesystem Abstractions
> 
>
> Key: HBASE-14439
> URL: https://issues.apache.org/jira/browse/HBASE-14439
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ben Lau
>Assignee: Matteo Bertozzi
> Attachments: abstraction.patch
>
>
> Ticket for work in progress on new FileSystem abstractions.  Previously, we 
> (Yahoo) submitted a ticket that would add support for humongous (1 million 
> region+) tables via a hierarchical layout (HBASE-13991).  However open source 
> is moving in a similar but not identical direction in the future and so the 
> patch will not be merged into open source.
> We will be working with Cloudera on a different patch now.  It will 
> create/add to two layers: a path abstraction layer and a use-oriented 
> abstraction layer.  The path abstraction layer is epitomized by classes like 
> FsUtils (and in the patch new classes like AFsLayout).  The use-oriented 
> abstraction layer is epitomized by existing classes like 
> MasterFileSystem/HRegionFileSystem (and possibly new classes later) that 
> build on the path abstraction layer and focus on 'doing things' (e.g. 
> creating regions) and less on gritty details like the paths.
> This work on abstracting and isolating the paths from the use cases will help 
> Yahoo not diverge too much from open source with its internal 'Humongous' 
> table hierarchical layout, while also helping open source move further 
> towards the eventual goal of redoing the FS layout in a similar (but 
> different) hierarchical layout later that focuses on data directory 
> uniformity (unlike the humongous patch) and storing hierarchy in the meta 
> table instead which enables new optimizations (see HBASE-14090.)
> Attached to this ticket is some work we've done at Yahoo so far that will be 
> put into an open source HBase branch for further collaboration.  The patch is 
> not meant to be complete yet and is a work in progress.  (Please wait on 
> patch comments/reviews.)  It also includes some Yahoo-specific 'humongous' 
> layout code that will be removed before submission in open source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14439) New/Improved Filesystem Abstractions

2015-09-15 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746393#comment-14746393
 ] 

Matteo Bertozzi commented on HBASE-14439:
-

I opened the hbase-14439 branch and committed Ben's patch: 
https://github.com/apache/hbase/tree/hbase-14439
I plan to do some experiments there and then make an official proposal here, 
but the idea is to have an API that talks about Tables, WALs, Regions, 
Families and Store Files, and not FileSystem paths: basically finishing up 
the work started in HBASE-7806/HBASE-7807, and having an API ready for 
HBASE-14090.
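A rough sketch of the shape such an API could take (hypothetical interface 
and method names, not the actual patch; HRegionInfo and StoreFileInfo are the 
existing HBase classes, used here only for flavor):
{code}
import java.io.IOException;
import java.util.Collection;

// Hypothetical shape only: callers speak in terms of tables, regions,
// families and store files; only the layout implementation (e.g. an
// AFsLayout variant) knows how those map to FileSystem paths.
public interface RegionStorage {
  void createRegion(HRegionInfo region) throws IOException;
  Collection<StoreFileInfo> getStoreFiles(HRegionInfo region, String family)
      throws IOException;
  void archiveStoreFile(HRegionInfo region, String family, StoreFileInfo file)
      throws IOException;
}
{code}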

> New/Improved Filesystem Abstractions
> 
>
> Key: HBASE-14439
> URL: https://issues.apache.org/jira/browse/HBASE-14439
> Project: HBase
>  Issue Type: New Feature
>Reporter: Ben Lau
>Assignee: Matteo Bertozzi
> Attachments: abstraction.patch
>
>
> Ticket for work in progress on new FileSystem abstractions.  Previously, we 
> (Yahoo) submitted a ticket that would add support for humongous (1 million 
> region+) tables via a hierarchical layout (HBASE-13991).  However open source 
> is moving in a similar but not identical direction in the future and so the 
> patch will not be merged into open source.
> We will be working with Cloudera on a different patch now.  It will 
> create/add to two layers: a path abstraction layer and a use-oriented 
> abstraction layer.  The path abstraction layer is epitomized by classes like 
> FsUtils (and in the patch new classes like AFsLayout).  The use-oriented 
> abstraction layer is epitomized by existing classes like 
> MasterFileSystem/HRegionFileSystem (and possibly new classes later) that 
> build on the path abstraction layer and focus on 'doing things' (e.g. 
> creating regions) and less on gritty details like the paths.
> This work on abstracting and isolating the paths from the use cases will help 
> Yahoo not diverge too much from open source with its internal 'Humongous' 
> table hierarchical layout, while also helping open source move further 
> towards the eventual goal of redoing the FS layout in a similar (but 
> different) hierarchical layout later that focuses on data directory 
> uniformity (unlike the humongous patch) and storing hierarchy in the meta 
> table instead which enables new optimizations (see HBASE-14090.)
> Attached to this ticket is some work we've done at Yahoo so far that will be 
> put into an open source HBase branch for further collaboration.  The patch is 
> not meant to be complete yet and is a work in progress.  (Please wait on 
> patch comments/reviews.)  It also includes some Yahoo-specific 'humongous' 
> layout code that will be removed before submission in open source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14411) Fix unit test failures when using multiwal as default WAL provider

2015-09-15 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14411:
---
Summary: Fix unit test failures when using multiwal as default WAL provider 
 (was: Fix UT failures when using multiwal as default provider)

> Fix unit test failures when using multiwal as default WAL provider
> --
>
> Key: HBASE-14411
> URL: https://issues.apache.org/jira/browse/HBASE-14411
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14411.patch, HBASE-14411_v2.patch
>
>
> If we set hbase.wal.provider to multiwal in 
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use 
> BoundedRegionGroupingProvider in UT, we will observe the failures below in 
> the current code base:
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> 
> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 
> expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have 
> more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but 
> was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
> Expected: is 
>  but: was 
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; 
> one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've 
> triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if 
> skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the 
> archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594
>  if skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong 
> number of files in the archive log expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testMovedWALDuringRecovery:810->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testRetryOpenDuringRecovery:838->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
> {noformat}
> While the patch for HBASE-14306 could resolve the failures of 
> TestHLogRecordReader, TestReplicationSourceManager and 
> TestReplicationWALReaderManager, this JIRA will focus on resolving the 
> others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746608#comment-14746608
 ] 

Elliott Clark commented on HBASE-14383:
---

bq. Also, not related to compactions, but I have seen cases where there are 
not enough regions per region server to fill the whole memstore space with 
the 128MB flush size: a few active regions and big heaps.

I'm hopeful that we can up that per-memstore limit a lot and just let the 
auto-tuning and the periodic flusher work together to ensure that more of the 
space is used while MTTR is not too bad.
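For illustration only, the knobs that idea touches (the 512MB value is made 
up; the two keys are the standard per-memstore flush size and 
periodic-flusher interval settings):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Sketch only: raise the per-memstore flush threshold and rely on the
// periodic flusher to bound how stale a memstore can get, instead of
// flushing every region at a fixed 128MB.
Configuration conf = HBaseConfiguration.create();
conf.setLong("hbase.hregion.memstore.flush.size", 512L * 1024 * 1024); // up from 128MB
conf.setInt("hbase.regionserver.optionalcacheflushinterval", 3600000); // 1 hour
{code}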

> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Compactions are still a major issue in many production environments. The 
> general recommendation is to disable region splitting and major compactions 
> to reduce unpredictable IO/CPU spikes, especially during peak times, and to 
> run them manually during off-peak times. This still does not resolve the 
> issues completely.
> h3. Flush storms
> * WAL-rolling events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushes due to the periodic memstore flusher. 
> Both of the above may produce *flush storms*, which are as bad as *compaction 
> storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit the 
> size of a compaction there is a config parameter one could use: 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? A selection with more files always wins; a selection of smaller 
> total size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14145) Allow the Canary in regionserver mode to try all regions on the server, not just one

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746672#comment-14746672
 ] 

Hudson commented on HBASE-14145:


FAILURE: Integrated in HBase-1.2 #175 (See 
[https://builds.apache.org/job/HBase-1.2/175/])
HBASE-14145 added flag to canary to try all regions in regionserver mode 
(eclark: rev a59579c86671f931b249523b5e92d002dfa69446)
* hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java


> Allow the Canary in regionserver mode to try all regions on the server, not 
> just one
> 
>
> Key: HBASE-14145
> URL: https://issues.apache.org/jira/browse/HBASE-14145
> Project: HBase
>  Issue Type: Bug
>  Components: canary, util
>Affects Versions: 2.0.0, 1.1.0.1
>Reporter: Elliott Clark
>Assignee: Sanjeev Srivatsa
>  Labels: beginner
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14145-v1.patch, HBASE-14145-v2.patch, 
> HBASE-14145-v2.patch, testrun.txt, testrun2.txt
>
>
> We want a pretty in-depth canary that will try every region on a cluster. 
> When doing that for the whole cluster, one machine is too slow, so we wanted 
> to split it up and have each regionserver run a canary. That works; however, 
> the canary does less work, as it just tries one random region.
> Let's add a flag that will allow the canary to try all regions on a 
> regionserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12911) Client-side metrics

2015-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746785#comment-14746785
 ] 

Hadoop QA commented on HBASE-12911:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12756130/12911-0.98.00.patch
  against 0.98 branch at commit 903d876f29aeb11a290d0daed6e0778c8f4ac961.
  ATTACHMENT ID: 12756130

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 29 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
23 warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
3874 checkstyle errors (more than the master's current 3869 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15612//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15612//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15612//artifact/patchprocess/checkstyle-aggregate.html

Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15612//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15612//console

This message is automatically generated.

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, 
> 12911-branch-1.00.patch, am.jpg, client metrics RS-Master.jpg, client metrics 
> client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing requests latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex, there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. Being a crucial part of the performance of this distributed system, 
> we should have deeper visibility into the client's function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in an MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746786#comment-14746786
 ] 

Vladimir Rodionov commented on HBASE-14383:
---

{quote}
MTTR depends not on the max number of WAL files but on the current load and 
the PMF interval.
{quote}

and on the memstore size, of course.

> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Compactions are still a major issue in many production environments. The 
> general recommendation is to disable region splitting and major compactions 
> to reduce unpredictable IO/CPU spikes, especially during peak times, and to 
> run them manually during off-peak times. This still does not resolve the 
> issues completely.
> h3. Flush storms
> * WAL-rolling events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushes due to the periodic memstore flusher. 
> Both of the above may produce *flush storms*, which are as bad as *compaction 
> storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit the 
> size of a compaction there is a config parameter one could use: 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? A selection with more files always wins; a selection of smaller 
> total size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14400) Fix HBase RPC protection documentation

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746811#comment-14746811
 ] 

Hudson commented on HBASE-14400:


FAILURE: Integrated in HBase-1.1 #664 (See 
[https://builds.apache.org/job/HBase-1.1/664/])
HBASE-14400 Fix HBase RPC protection documentation (apurtell: rev 
ddf2d4b06ef14998e6eb28c62ef99f463bb4827f)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/TestHBaseSaslRpcClient.java
* src/main/asciidoc/_chapters/security.adoc
* hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslUtil.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslClientHandler.java
* hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java


> Fix HBase RPC protection documentation
> --
>
> Key: HBASE-14400
> URL: https://issues.apache.org/jira/browse/HBASE-14400
> Project: HBase
>  Issue Type: Bug
>  Components: encryption, rpc, security
>Reporter: Apekshit Sharma
>Assignee: Apekshit Sharma
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-14400-branch-0.98.patch, 
> HBASE-14400-branch-1.0.patch, HBASE-14400-branch-1.1.patch, 
> HBASE-14400-branch-1.2.patch, HBASE-14400-master-v2.patch, 
> HBASE-14400-master.patch
>
>
> HBase configuration 'hbase.rpc.protection' can be set to 'authentication', 
> 'integrity' or 'privacy'.
> "authentication means authentication only and no integrity or privacy; 
> integrity implies
> authentication and integrity are enabled; and privacy implies all of
> authentication, integrity and privacy are enabled."
> However, the HBase ref guide incorrectly suggests in some places setting the 
> value to 'auth-conf' instead of 'privacy'. Setting the value to 'auth-conf' 
> doesn't provide RPC encryption, which is what the user wants.
> This jira will fix:
> - documentation: change 'auth-conf' references to 'privacy'
> - SaslUtil to support both sets of values (privacy/integrity/authentication 
> and auth-conf/auth-int/auth) to be backward compatible with what was being 
> suggested till now.
> - change 'hbase.thrift.security.qop' to be consistent with other similar 
> configurations by using the same set of values 
> (privacy/integrity/authentication).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746783#comment-14746783
 ] 

Vladimir Rodionov commented on HBASE-14383:
---

[~saint@gmail.com]:
{quote}
I'd be for upping the max logs number (I have seen cases where it ran away up 
into the thousands, so some guard would be good).
{quote}

That is a very degenerate case. I have thought about this: it is possible to 
have many CFs in a table and very small flush files. By default, the flush 
policy ignores all files less than 15MB. Imagine that all the files in a 
region's memstores selected for flushing are less than 15MB => there will be 
no flush and the WAL count will continue growing (indefinitely, by the way).

We probably need *hbase.regionserver.maxlogs* as a safeguard against runaway 
WALs during prolonged burst load, when the ingested data per RS in a PMF 
flush interval (1h) is much greater than the overall memstore capacity. I 
agree we have to up the default value of *hbase.regionserver.maxlogs*, but it 
should be set during RS init and not statically. We have to make sure that 
the overall WAL capacity is not less than the overall memstore capacity. 
Ideally it should be large enough to make the event (max number exceeded) 
very rare.
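A minimal sketch of that init-time computation (illustrative, not actual 
HBase code; the blocksize and multiplier keys match the defaults quoted 
earlier in this thread, and hbase.regionserver.global.memstore.size is the 
global memstore fraction):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Illustrative: derive maxlogs at RS init so total WAL capacity is at least
// the global memstore capacity, instead of using a static 32.
Configuration conf = HBaseConfiguration.create();
long heapSize = Runtime.getRuntime().maxMemory();
float memstoreFraction =
    conf.getFloat("hbase.regionserver.global.memstore.size", 0.4f);
long logRollSize = (long) (conf.getLong("hbase.regionserver.hlog.blocksize",
    128L * 1024 * 1024)
    * conf.getFloat("hbase.regionserver.logroll.multiplier", 0.95f));
int maxLogs = Math.max(32,  // never below the old default
    (int) Math.ceil(heapSize * memstoreFraction / (double) logRollSize));
{code}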

MTTR depends not on the max number of WAL files but on the current load and 
the PMF interval.

> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Compactions are still a major issue in many production environments. The 
> general recommendation is to disable region splitting and major compactions 
> to reduce unpredictable IO/CPU spikes, especially during peak times, and to 
> run them manually during off-peak times. This still does not resolve the 
> issues completely.
> h3. Flush storms
> * WAL-rolling events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushes due to the periodic memstore flusher. 
> Both of the above may produce *flush storms*, which are as bad as *compaction 
> storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit the 
> size of a compaction there is a config parameter one could use: 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? A selection with more files always wins; a selection of smaller 
> total size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14433) Set down the client executor core thread count from 256 to number of processors

2015-09-15 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14433:
--
Attachment: 14433v3.txt

Retry

> Set down the client executor core thread count from 256 to number of 
> processors
> ---
>
> Key: HBASE-14433
> URL: https://issues.apache.org/jira/browse/HBASE-14433
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 14433 (1).txt, 14433.txt, 14433v2.txt, 14433v3.txt, 
> 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt
>
>
> HBASE-10449 upped our core count from 0 to 256 (max is 256). Looking in a 
> recent test-run thread dump, I see up to 256 threads per client, all of them 
> idle. At a minimum it makes reading test thread dumps hard. Trying to learn 
> more about why we went with a core count of 256 over in HBASE-10449. 
> Meantime, will try setting down the configs for tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14400) Fix HBase RPC protection documentation

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746892#comment-14746892
 ] 

Hudson commented on HBASE-14400:


FAILURE: Integrated in HBase-1.2 #176 (See 
[https://builds.apache.org/job/HBase-1.2/176/])
HBASE-14400 Fix HBase RPC protection documentation (apurtell: rev 
956d4d9ca6ed46f33b67b0f981af05fb8d6e0111)
* hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/TestHBaseSaslRpcClient.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslClientHandler.java
* src/main/asciidoc/_chapters/security.adoc
* hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslUtil.java


> Fix HBase RPC protection documentation
> --
>
> Key: HBASE-14400
> URL: https://issues.apache.org/jira/browse/HBASE-14400
> Project: HBase
>  Issue Type: Bug
>  Components: encryption, rpc, security
>Reporter: Apekshit Sharma
>Assignee: Apekshit Sharma
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-14400-branch-0.98.patch, 
> HBASE-14400-branch-1.0.patch, HBASE-14400-branch-1.1.patch, 
> HBASE-14400-branch-1.2.patch, HBASE-14400-master-v2.patch, 
> HBASE-14400-master.patch
>
>
> HBase configuration 'hbase.rpc.protection' can be set to 'authentication', 
> 'integrity' or 'privacy'.
> "authentication means authentication only and no integrity or privacy; 
> integrity implies
> authentication and integrity are enabled; and privacy implies all of
> authentication, integrity and privacy are enabled."
> However, the HBase ref guide incorrectly suggests in some places setting the 
> value to 'auth-conf' instead of 'privacy'. Setting the value to 'auth-conf' 
> doesn't provide RPC encryption, which is what the user wants.
> This jira will fix:
> - documentation: change 'auth-conf' references to 'privacy'
> - SaslUtil to support both sets of values (privacy/integrity/authentication 
> and auth-conf/auth-int/auth) to be backward compatible with what was being 
> suggested till now.
> - change 'hbase.thrift.security.qop' to be consistent with other similar 
> configurations by using the same set of values 
> (privacy/integrity/authentication).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14429) Checkstyle report is broken

2015-09-15 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14429:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~templedf]

> Checkstyle report is broken
> ---
>
> Key: HBASE-14429
> URL: https://issues.apache.org/jira/browse/HBASE-14429
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 1.1.2
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-14429.001.patch
>
>
> I just happened across this when hunting for a checkstyle reporter.  The 
> checkstyle_report.py script is broken.  The output is garbled because the 
> print_row() method is printing the wrong variables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10449) Wrong execution pool configuration in HConnectionManager

2015-09-15 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746919#comment-14746919
 ] 

Nicolas Liochon commented on HBASE-10449:
-

Actually I'm having two doubts:
- the core threads should already have this timeout, no? We should not see 
256 threads, because they should expire already.
- IIRC, this thread pool is used when connecting to the various 
regionservers, and the threads block until they have an answer. So with 4 
core threads (for example), it means that if we do a multi we contact at most 
4 servers simultaneously. The threads are not really using CPUs, they are 
waiting (old I/O style). But maybe it has changed?
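For reference, the core-thread timeout behavior in question, as a generic 
java.util.concurrent sketch (not the actual HConnection pool code):
{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// With 256 core threads, idle threads never expire unless core-thread
// timeout is enabled; enabling it lets idle core threads die after the
// keep-alive period.
ThreadPoolExecutor pool = new ThreadPoolExecutor(
    256, 256,                        // core and max pool size
    60, TimeUnit.SECONDS,            // keep-alive for idle threads
    new LinkedBlockingQueue<Runnable>());
pool.allowCoreThreadTimeOut(true);   // without this, core threads linger forever
{code}
If allowCoreThreadTimeOut is left at its default of false, all 256 core 
threads stay alive no matter how long they sit idle, which would explain 
seeing all of them in a thread dump.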





> Wrong execution pool configuration in HConnectionManager
> 
>
> Key: HBASE-10449
> URL: https://issues.apache.org/jira/browse/HBASE-10449
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
>Reporter: Nicolas Liochon
>Assignee: Nicolas Liochon
>Priority: Critical
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBASE-10449.v1.patch
>
>
> There is a confusion in the configuration of the pool. The attached patch 
> fixes this. This may change client performance, as we were previously using 
> a single thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746788#comment-14746788
 ] 

Vladimir Rodionov commented on HBASE-14383:
---

{quote}
flush policy ignores all files less than 15MB. 
{quote}

Correction: memstores, not files.

> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Compactions are still a major issue in many production environments. The 
> general recommendation is to disable region splitting and major compactions 
> to reduce unpredictable IO/CPU spikes, especially during peak times, and to 
> run them manually during off-peak times. This still does not resolve the 
> issues completely.
> h3. Flush storms
> * WAL-rolling events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushes due to the periodic memstore flusher. 
> Both of the above may produce *flush storms*, which are as bad as *compaction 
> storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit the 
> size of a compaction there is a config parameter one could use: 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? A selection with more files always wins; a selection of smaller 
> total size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14411) Fix unit test failures when using multiwal as default WAL provider

2015-09-15 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-14411:
--
Attachment: HBASE-14411.branch-1.patch

Uploaded patch for branch-1.

Thanks [~tedyu] for the review!

> Fix unit test failures when using multiwal as default WAL provider
> --
>
> Key: HBASE-14411
> URL: https://issues.apache.org/jira/browse/HBASE-14411
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14411.branch-1.patch, HBASE-14411.patch, 
> HBASE-14411_v2.patch
>
>
> If we set hbase.wal.provider to multiwal in 
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use 
> BoundedRegionGroupingProvider in UT, we will observe the failures below in 
> the current code base:
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> 
> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 
> expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have 
> more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but 
> was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
> Expected: is 
>  but: was 
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; 
> one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've 
> triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if 
> skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the 
> archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594
>  if skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong 
> number of files in the archive log expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testMovedWALDuringRecovery:810->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testRetryOpenDuringRecovery:838->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
> {noformat}
> While the patch for HBASE-14306 could resolve the failures of 
> TestHLogRecordReader, TestReplicationSourceManager and 
> TestReplicationWALReaderManager, this JIRA will focus on resolving the 
> others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14400) Fix HBase RPC protection documentation

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746888#comment-14746888
 ] 

Hudson commented on HBASE-14400:


SUCCESS: Integrated in HBase-1.2-IT #149 (See 
[https://builds.apache.org/job/HBase-1.2-IT/149/])
HBASE-14400 Fix HBase RPC protection documentation (apurtell: rev 
956d4d9ca6ed46f33b67b0f981af05fb8d6e0111)
* src/main/asciidoc/_chapters/security.adoc
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslClientHandler.java
* hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslUtil.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/TestHBaseSaslRpcClient.java


> Fix HBase RPC protection documentation
> --
>
> Key: HBASE-14400
> URL: https://issues.apache.org/jira/browse/HBASE-14400
> Project: HBase
>  Issue Type: Bug
>  Components: encryption, rpc, security
>Reporter: Apekshit Sharma
>Assignee: Apekshit Sharma
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-14400-branch-0.98.patch, 
> HBASE-14400-branch-1.0.patch, HBASE-14400-branch-1.1.patch, 
> HBASE-14400-branch-1.2.patch, HBASE-14400-master-v2.patch, 
> HBASE-14400-master.patch
>
>
> HBase configuration 'hbase.rpc.protection' can be set to 'authentication', 
> 'integrity' or 'privacy'.
> "authentication means authentication only and no integrity or privacy; 
> integrity implies
> authentication and integrity are enabled; and privacy implies all of
> authentication, integrity and privacy are enabled."
> However, the HBase ref guide incorrectly suggests in some places setting the 
> value to 'auth-conf' instead of 'privacy'. Setting the value to 'auth-conf' 
> doesn't provide RPC encryption, which is what the user wants.
> This jira will fix:
> - documentation: change 'auth-conf' references to 'privacy'
> - SaslUtil to support both sets of values (privacy/integrity/authentication 
> and auth-conf/auth-int/auth) to be backward compatible with what was being 
> suggested until now.
> - change 'hbase.thrift.security.qop' to be consistent with other similar 
> configurations by using the same set of values 
> (privacy/integrity/authentication).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10449) Wrong execution pool configuration in HConnectionManager

2015-09-15 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746897#comment-14746897
 ] 

Nicolas Liochon commented on HBASE-10449:
-

As I understand the doc, if we do that we create maxThreads threads and then 
reject all the tasks. Not really useful.
But the patch in HBASE-14433 seems ok:
- we create up to core threads (Runtime.getRuntime().availableProcessors()). If 
we have 10 tasks in parallel we still have 
Runtime.getRuntime().availableProcessors() threads.
- they expire quite quickly (because we do allowCoreThreadTimeOut(true);)

Maybe we should set maxThreads to coreThreads as well and increase 
HConstants.DEFAULT_HBASE_CLIENT_MAX_TOTAL_TASKS.

But I'm +1 with HBASE-14433 as it is now.
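
For reference, a minimal sketch of the pool setup described above (plain JDK; 
the class name and printed line are illustrative, not the exact patch):

{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ClientPoolSketch {
  public static void main(String[] args) {
    int coreThreads = Runtime.getRuntime().availableProcessors();
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        coreThreads, 256,                      // core = nproc; max mostly unreachable
        60L, TimeUnit.SECONDS,                 // idle threads time out after 60s
        new LinkedBlockingQueue<Runnable>());  // unbounded queue
    pool.allowCoreThreadTimeOut(true);         // core threads also expire when idle
    System.out.println("core threads = " + pool.getCorePoolSize());
    pool.shutdown();
  }
}
{code}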

> Wrong execution pool configuration in HConnectionManager
> 
>
> Key: HBASE-10449
> URL: https://issues.apache.org/jira/browse/HBASE-10449
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
>Reporter: Nicolas Liochon
>Assignee: Nicolas Liochon
>Priority: Critical
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBASE-10449.v1.patch
>
>
> There is some confusion in the configuration of the pool. The attached patch 
> fixes this. This may change client performance, as we were using a single 
> thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12751) Allow RowLock to be reader writer

2015-09-15 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-12751:
--
Attachment: 12751.v38.txt

Missed an import.

> Allow RowLock to be reader writer
> -
>
> Key: HBASE-12751
> URL: https://issues.apache.org/jira/browse/HBASE-12751
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 12751.rebased.v25.txt, 12751.rebased.v26.txt, 
> 12751.rebased.v26.txt, 12751.rebased.v27.txt, 12751.rebased.v29.txt, 
> 12751.rebased.v31.txt, 12751.rebased.v32.txt, 12751.rebased.v32.txt, 
> 12751.rebased.v33.txt, 12751.rebased.v34.txt, 12751.rebased.v35.txt, 
> 12751.rebased.v35.txt, 12751.rebased.v35.txt, 12751.v37.txt, 12751.v38.txt, 
> 12751v22.txt, 12751v23.txt, 12751v23.txt, 12751v23.txt, 12751v23.txt, 
> 12751v36.txt, HBASE-12751-v1.patch, HBASE-12751-v10.patch, 
> HBASE-12751-v10.patch, HBASE-12751-v11.patch, HBASE-12751-v12.patch, 
> HBASE-12751-v13.patch, HBASE-12751-v14.patch, HBASE-12751-v15.patch, 
> HBASE-12751-v16.patch, HBASE-12751-v17.patch, HBASE-12751-v18.patch, 
> HBASE-12751-v19 (1).patch, HBASE-12751-v19.patch, HBASE-12751-v2.patch, 
> HBASE-12751-v20.patch, HBASE-12751-v20.patch, HBASE-12751-v21.patch, 
> HBASE-12751-v3.patch, HBASE-12751-v4.patch, HBASE-12751-v5.patch, 
> HBASE-12751-v6.patch, HBASE-12751-v7.patch, HBASE-12751-v8.patch, 
> HBASE-12751-v9.patch, HBASE-12751.patch
>
>
> Right now every write operation grabs a row lock. This is to prevent values 
> from changing during a read-modify-write operation (increment or check and 
> put). However, it limits parallelism in several different scenarios.
> If there are several puts to the same row but different columns or stores 
> then this is very limiting.
> If there are puts to the same column then the mvcc number should ensure a 
> consistent ordering, so locking is not needed.
> However, locking for check-and-put or increment is still needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746847#comment-14746847
 ] 

stack commented on HBASE-14383:
---

bq. That is very degenerate case.

I do not disagree. Bitch to fix after the fact though.

bq. By default, flush policy ignores all files less than 15MB. 

This seems wrong if one of these memstores has an old edit that is holding up 
our freeing/GC'ing of WALs.

bq. I agree we have to up default value of hbase.regionserver.maxlogs but set 
during RS init and not statically. 

That sounds reasonable...

bq. MTTR depends not on a max number of WAL files but on the current load and 
PMF interval.

Disagree. Many WALs have a longer MTTR than few WALs.



> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Still a major issue in many production environments. The general 
> recommendation is to disable region splitting and major compactions to 
> reduce unpredictable IO/CPU spikes, especially during peak times, and to run 
> them manually during off-peak times. Still, this does not resolve the issues 
> completely.
> h3. Flush storms
> * rolling WAL events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushing due to the periodic memstore 
> flusher operation. 
> Both of the above may produce *flush storms*, which are as bad as 
> *compaction storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit 
> the size of a compaction there is a config parameter one could use, 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? The selection with more files always wins; the selection of 
> smaller size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14433) Set down the client executor core thread count from 256 to number of processors

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746855#comment-14746855
 ] 

stack commented on HBASE-14433:
---

Crashed in hbase-common with the below.

Any chance of a +1?


{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on 
project hbase-common: ExecutionException: java.lang.RuntimeException: The 
forked VM terminated without properly saying goodbye. VM crash or System.exit 
called?
[ERROR] Command was /bin/sh -c cd 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase/hbase-common
 && /home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51/jre/bin/java 
-enableassertions -XX:MaxDirectMemorySize=1G -Xmx2800m -XX:MaxPermSize=256m 
-Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true 
-Djava.awt.headless=true -jar 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase/hbase-common/target/surefire/surefirebooter8261544835830113794.jar
 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase/hbase-common/target/surefire/surefire7189337424097884080tmp
 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/hbase/hbase-common/target/surefire/surefire_1745728566103349986tmp
{code}

> Set down the client executor core thread count from 256 to number of 
> processors
> ---
>
> Key: HBASE-14433
> URL: https://issues.apache.org/jira/browse/HBASE-14433
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 14433 (1).txt, 14433.txt, 14433v2.txt, 14433v3.txt, 
> 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt
>
>
> HBASE-10449 upped our core count from 0 to 256 (max is 256). Looking at a 
> recent test run core dump, I see up to 256 threads per client, all idle. At 
> a minimum it makes reading test thread dumps hard. Trying to learn more 
> about why we went with a core of 256 over in HBASE-10449. Meantime, will try 
> setting down the configs for tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10449) Wrong execution pool configuration in HConnectionManager

2015-09-15 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746863#comment-14746863
 ] 

Nicolas Liochon commented on HBASE-10449:
-

Sorry for the delay, I'm only seeing this now.
Let me have a look.

> Wrong execution pool configuration in HConnectionManager
> 
>
> Key: HBASE-10449
> URL: https://issues.apache.org/jira/browse/HBASE-10449
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
>Reporter: Nicolas Liochon
>Assignee: Nicolas Liochon
>Priority: Critical
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBASE-10449.v1.patch
>
>
> There is some confusion in the configuration of the pool. The attached patch 
> fixes this. This may change client performance, as we were using a single 
> thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10449) Wrong execution pool configuration in HConnectionManager

2015-09-15 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746875#comment-14746875
 ] 

Nicolas Liochon commented on HBASE-10449:
-

> Where does 'Create a single thread, queue all the tasks for this thread.' 
> come from?
This is what HBASE-9917 actually implemented: with the ThreadPoolExecutor, if 
the task queue is unbounded, it does not create new threads beyond the core 
count:

From: 
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
If fewer than corePoolSize threads are running, the Executor always prefers 
adding a new thread rather than queuing.
If corePoolSize or more threads are running, the Executor always prefers 
queuing a request rather than adding a new thread.
If a request cannot be queued, a new thread is created unless this would exceed 
maximumPoolSize, in which case, the task will be rejected.

But having fewer than 256 threads is fine. This was just restoring the previous 
value.
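
A runnable sketch of those rules (plain JDK, not HBase code), showing that 
with an unbounded queue the pool never grows past corePoolSize no matter how 
many tasks are submitted:

{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class UnboundedQueueDemo {
  public static void main(String[] args) throws InterruptedException {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        2, 256, 60L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>()); // unbounded: tasks queue, no new threads
    for (int i = 0; i < 100; i++) {
      pool.execute(() -> {
        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
      });
    }
    Thread.sleep(50);
    // Only corePoolSize (2) threads run; the remaining tasks sit in the queue.
    System.out.println("pool size = " + pool.getPoolSize()); // prints 2
    pool.shutdown();
  }
}
{code}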


> Wrong execution pool configuration in HConnectionManager
> 
>
> Key: HBASE-10449
> URL: https://issues.apache.org/jira/browse/HBASE-10449
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
>Reporter: Nicolas Liochon
>Assignee: Nicolas Liochon
>Priority: Critical
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBASE-10449.v1.patch
>
>
> There is some confusion in the configuration of the pool. The attached patch 
> fixes this. This may change client performance, as we were using a single 
> thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14400) Fix HBase RPC protection documentation

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746883#comment-14746883
 ] 

Hudson commented on HBASE-14400:


FAILURE: Integrated in HBase-1.3 #177 (See 
[https://builds.apache.org/job/HBase-1.3/177/])
HBASE-14400 Fix HBase RPC protection documentation (apurtell: rev 
1517deee67fb9cd920faa146237f41049fc2ef60)
* src/main/asciidoc/_chapters/security.adoc
* hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslUtil.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/TestHBaseSaslRpcClient.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslClientHandler.java


> Fix HBase RPC protection documentation
> --
>
> Key: HBASE-14400
> URL: https://issues.apache.org/jira/browse/HBASE-14400
> Project: HBase
>  Issue Type: Bug
>  Components: encryption, rpc, security
>Reporter: Apekshit Sharma
>Assignee: Apekshit Sharma
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-14400-branch-0.98.patch, 
> HBASE-14400-branch-1.0.patch, HBASE-14400-branch-1.1.patch, 
> HBASE-14400-branch-1.2.patch, HBASE-14400-master-v2.patch, 
> HBASE-14400-master.patch
>
>
> HBase configuration 'hbase.rpc.protection' can be set to 'authentication', 
> 'integrity' or 'privacy'.
> "authentication means authentication only and no integrity or privacy; 
> integrity implies
> authentication and integrity are enabled; and privacy implies all of
> authentication, integrity and privacy are enabled."
> However, the HBase ref guide incorrectly suggests in some places setting the 
> value to 'auth-conf' instead of 'privacy'. Setting the value to 'auth-conf' 
> doesn't provide RPC encryption, which is what the user wants.
> This jira will fix:
> - documentation: change 'auth-conf' references to 'privacy'
> - SaslUtil to support both sets of values (privacy/integrity/authentication 
> and auth-conf/auth-int/auth) to be backward compatible with what was being 
> suggested until now.
> - change 'hbase.thrift.security.qop' to be consistent with other similar 
> configurations by using the same set of values 
> (privacy/integrity/authentication).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10449) Wrong execution pool configuration in HConnectionManager

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746886#comment-14746886
 ] 

stack commented on HBASE-10449:
---

Thanks [~nkeywal]

Our queue is unbounded then, so we do not create new threads once we hit core? 
Rather, we just queue? Can we make the queue size zero?

I suppose I should test.
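
For what it's worth, a zero-capacity queue in this API would be a 
SynchronousQueue; with one, the pool grows toward maxThreads instead of 
queueing. A small sketch (plain JDK, not HBase code):

{code}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ZeroQueueDemo {
  public static void main(String[] args) {
    // A SynchronousQueue holds no tasks: each submission either hands off to
    // an idle thread, starts a new one (up to max), or is rejected.
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        0, 8, 60L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
    for (int i = 0; i < 8; i++) {
      pool.execute(() -> {
        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
      });
    }
    System.out.println("pool size = " + pool.getPoolSize()); // 8: one thread per task
    pool.shutdown();
  }
}
{code}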

> Wrong execution pool configuration in HConnectionManager
> 
>
> Key: HBASE-10449
> URL: https://issues.apache.org/jira/browse/HBASE-10449
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
>Reporter: Nicolas Liochon
>Assignee: Nicolas Liochon
>Priority: Critical
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBASE-10449.v1.patch
>
>
> There is some confusion in the configuration of the pool. The attached patch 
> fixes this. This may change client performance, as we were using a single 
> thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12751) Allow RowLock to be reader writer

2015-09-15 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-12751:
--
Attachment: 12751.v37.txt

Retry

> Allow RowLock to be reader writer
> -
>
> Key: HBASE-12751
> URL: https://issues.apache.org/jira/browse/HBASE-12751
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 12751.rebased.v25.txt, 12751.rebased.v26.txt, 
> 12751.rebased.v26.txt, 12751.rebased.v27.txt, 12751.rebased.v29.txt, 
> 12751.rebased.v31.txt, 12751.rebased.v32.txt, 12751.rebased.v32.txt, 
> 12751.rebased.v33.txt, 12751.rebased.v34.txt, 12751.rebased.v35.txt, 
> 12751.rebased.v35.txt, 12751.rebased.v35.txt, 12751.v37.txt, 12751v22.txt, 
> 12751v23.txt, 12751v23.txt, 12751v23.txt, 12751v23.txt, 12751v36.txt, 
> HBASE-12751-v1.patch, HBASE-12751-v10.patch, HBASE-12751-v10.patch, 
> HBASE-12751-v11.patch, HBASE-12751-v12.patch, HBASE-12751-v13.patch, 
> HBASE-12751-v14.patch, HBASE-12751-v15.patch, HBASE-12751-v16.patch, 
> HBASE-12751-v17.patch, HBASE-12751-v18.patch, HBASE-12751-v19 (1).patch, 
> HBASE-12751-v19.patch, HBASE-12751-v2.patch, HBASE-12751-v20.patch, 
> HBASE-12751-v20.patch, HBASE-12751-v21.patch, HBASE-12751-v3.patch, 
> HBASE-12751-v4.patch, HBASE-12751-v5.patch, HBASE-12751-v6.patch, 
> HBASE-12751-v7.patch, HBASE-12751-v8.patch, HBASE-12751-v9.patch, 
> HBASE-12751.patch
>
>
> Right now every write operation grabs a row lock. This is to prevent values 
> from changing during a read-modify-write operation (increment or check and 
> put). However, it limits parallelism in several different scenarios.
> If there are several puts to the same row but different columns or stores 
> then this is very limiting.
> If there are puts to the same column then the mvcc number should ensure a 
> consistent ordering, so locking is not needed.
> However, locking for check-and-put or increment is still needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14438) admin.rb#alter should invoke HTableDescriptor#addCoprocessorWithSpec instead of HTableDescriptor#addCoprocessorWith

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746895#comment-14746895
 ] 

stack commented on HBASE-14438:
---

Looks good to me. Does anyone who knows this better wish to comment?

> admin.rb#alter should invoke HTableDescriptor#addCoprocessorWithSpec instead 
> of HTableDescriptor#addCoprocessorWith
> ---
>
> Key: HBASE-14438
> URL: https://issues.apache.org/jira/browse/HBASE-14438
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Priority: Minor
> Attachments: HBASE-14438-trunk-v1.patch
>
>
> The help info of 'alter' in hbase shell includes the following usage:
> {code}
> You can add a table coprocessor by setting a table coprocessor attribute:
>   hbase> alter 't1',
> 
> 'coprocessor'=>'hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2'
> {code}
> However, the admin.rb#alter method will invoke 
> HTableDescriptor#addCoprocessorWith, in which the coprocessor will be 
> incorrectly formatted as:
> {code}
> |hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2|1073741823|
> {code}
> It seems the admin.rb#alter should use 
> HTableDescriptor#addCoprocessorWithSpec instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746910#comment-14746910
 ] 

Hudson commented on HBASE-14207:


FAILURE: Integrated in HBase-0.98 #1122 (See 
[https://builds.apache.org/job/HBase-0.98/1122/])
HBASE-14207 Region was hijacked and remained in transition when RS failed to 
open a region and later regionplan changed to new RS on retry (apurtell: rev 
7344676074f3e9a57693d77558b432a188d76cee)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


> Region was hijacked and remained in transition when RS failed to open a 
> region and later regionplan changed to new RS on retry
> --
>
> Key: HBASE-14207
> URL: https://issues.apache.org/jira/browse/HBASE-14207
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.98.6
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 0.98.15
>
> Attachments: HBASE-14207-0.98-V2.patch, HBASE-14207-0.98-V2.patch, 
> HBASE-14207-0.98.patch
>
>
> On a production environment, the following events happened:
> 1. The master was trying to assign a region to an RS, but due to 
> KeeperException$SessionExpiredException the RS failed to open the region.
>   In the RS log, we saw multiple WARN logs related to 
> KeeperException$SessionExpiredException 
>   > KeeperErrorCode = Session expired for 
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
>   > Unable to get data of znode 
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
> 2. The master retried to assign the region to the same RS, but the RS failed 
> again.
> 3. On the second retry a new plan formed, and this time the plan destination 
> (RS) was different, so the master sent the request to the new RS to open the 
> region. But the new RS failed to open the region as the server in the ZNODE 
> did not match the expected current server name. 
> Logs Snippet:
> {noformat}
> HM
> 2015-07-14 03:50:29,759 | INFO  | master:T101PC03VM13:21300 | Processing 
> 08f1935d652e5dbdac09b423b8f9401b in state: M_ZK_REGION_OFFLINE | 
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:644)
> 2015-07-14 03:50:29,759 | INFO  | master:T101PC03VM13:21300 | Transitioned 
> {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817029679, 
> server=null} to {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, 
> ts=1436817029759, server=T101PC03VM13,21302,1436816690692} | 
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:29,760 | INFO  | master:T101PC03VM13:21300 | Processed 
> region 08f1935d652e5dbdac09b423b8f9401b in state M_ZK_REGION_OFFLINE, on 
> server: T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:768)
> 2015-07-14 03:50:29,800 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:29,801 | WARN  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=1 
> of 10 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:29,802 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Trying to re-assign 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> the same failed server. | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2123)
> 2015-07-14 03:50:31,804 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:31,806 | WARN  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=2 
> of 10 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:31,807 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned 
> {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031804, 
> 

[jira] [Commented] (HBASE-14433) Set down the client executor core thread count from 256 to number of processors

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746907#comment-14746907
 ] 

stack commented on HBASE-14433:
---

I got a +1 over on HBASE-10449 from the man himself, [~nkeywal].

We need to test, but for now we will live with fewer threads until it's proved 
we need more.

> Set down the client executor core thread count from 256 to number of 
> processors
> ---
>
> Key: HBASE-14433
> URL: https://issues.apache.org/jira/browse/HBASE-14433
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 14433 (1).txt, 14433.txt, 14433v2.txt, 14433v3.txt, 
> 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt
>
>
> HBASE-10449 upped our core count from 0 to 256 (max is 256). Looking at a 
> recent test run core dump, I see up to 256 threads per client, all idle. At 
> a minimum it makes reading test thread dumps hard. Trying to learn more 
> about why we went with a core of 256 over in HBASE-10449. Meantime, will try 
> setting down the configs for tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-7332) [webui] HMaster webui should display the number of regions a table has.

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746911#comment-14746911
 ] 

Hudson commented on HBASE-7332:
---

FAILURE: Integrated in HBase-0.98 #1122 (See 
[https://builds.apache.org/job/HBase-0.98/1122/])
HBASE-14434 Merge of HBASE-7332 to 0.98 dropped a hunk (apurtell: rev 
13af5d2a24deacf30f4665b78d7316718ab47191)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java


> [webui] HMaster webui should display the number of regions a table has.
> ---
>
> Key: HBASE-7332
> URL: https://issues.apache.org/jira/browse/HBASE-7332
> Project: HBase
>  Issue Type: Bug
>  Components: UI
>Affects Versions: 2.0.0, 1.1.0
>Reporter: Jonathan Hsieh
>Assignee: Andrey Stepachev
>Priority: Minor
>  Labels: beginner, operability
> Fix For: 2.0.0, 1.1.0, 0.98.11
>
> Attachments: HBASE-7332-0.98.patch, HBASE-7332.patch, 
> HBASE-7332.patch, Screen Shot 2014-07-28 at 4.10.01 PM.png, Screen Shot 
> 2015-02-03 at 9.23.57 AM.png
>
>
> Pre-0.96/trunk hbase displayed the number of regions per table in the table 
> listing.  It would be good to have this back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14434) Merge of HBASE-7332 to 0.98 dropped a hunk

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746909#comment-14746909
 ] 

Hudson commented on HBASE-14434:


FAILURE: Integrated in HBase-0.98 #1122 (See 
[https://builds.apache.org/job/HBase-0.98/1122/])
HBASE-14434 Merge of HBASE-7332 to 0.98 dropped a hunk (apurtell: rev 
13af5d2a24deacf30f4665b78d7316718ab47191)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java


> Merge of HBASE-7332 to 0.98 dropped a hunk
> --
>
> Key: HBASE-14434
> URL: https://issues.apache.org/jira/browse/HBASE-14434
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.11
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14434-0.98.patch
>
>
> The merge of HBASE-7332 to 0.98 dropped a hunk. Spotted by [~cuijianwei] 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746573#comment-14746573
 ] 

Vladimir Rodionov commented on HBASE-14383:
---

[~enis], [~saint@gmail.com], [~lhofhansl], [~apurtell] what do you think? 
Can we retire *hbase.regionserver.maxlogs*?

> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Still a major issue in many production environments. The general 
> recommendation is to disable region splitting and major compactions to 
> reduce unpredictable IO/CPU spikes, especially during peak times, and to run 
> them manually during off-peak times. Still, this does not resolve the issues 
> completely.
> h3. Flush storms
> * rolling WAL events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushing due to the periodic memstore 
> flusher operation. 
> Both of the above may produce *flush storms*, which are as bad as 
> *compaction storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit 
> the size of a compaction there is a config parameter one could use, 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? The selection with more files always wins; the selection of 
> smaller size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12911) Client-side metrics

2015-09-15 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12911:
-
Attachment: (was: HBASE-14382-0.98.00.patch)

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, 
> 12911-branch-1.00.patch, am.jpg, client metrics RS-Master.jpg, client metrics 
> client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. Being a crucial part of the performance of this distributed system, 
> we should have deeper visibility into the client's function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in an MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12911) Client-side metrics

2015-09-15 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12911:
-
Attachment: 12911-0.98.00.patch

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, 
> 12911-branch-1.00.patch, am.jpg, client metrics RS-Master.jpg, client metrics 
> client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. Being a crucial part of the performance of this distributed system, 
> we should have deeper visibility into the client's function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in an MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12911) Client-side metrics

2015-09-15 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12911:
-
Attachment: HBASE-14382-0.98.00.patch

Here's a WIP backport for 0.98. IPC and TestMetricsConnection both pass with 
hadoop2. I haven't implemented the hadoop1 compat module yet. [~apurtell] Is 
this something you're interested in?

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, 
> 12911-branch-1.00.patch, am.jpg, client metrics RS-Master.jpg, client metrics 
> client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. Being a crucial part of the performance of this distributed system, 
> we should have deeper visibility into the client's function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in an MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14145) Allow the Canary in regionserver mode to try all regions on the server, not just one

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746622#comment-14746622
 ] 

Hudson commented on HBASE-14145:


SUCCESS: Integrated in HBase-1.3-IT #157 (See 
[https://builds.apache.org/job/HBase-1.3-IT/157/])
HBASE-14145 added flag to canary to try all regions in regionserver mode 
(eclark: rev 9e7f9b621ac759d3b6a6569c7a79529a2bf910ee)
* hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java


> Allow the Canary in regionserver mode to try all regions on the server, not 
> just one
> 
>
> Key: HBASE-14145
> URL: https://issues.apache.org/jira/browse/HBASE-14145
> Project: HBase
>  Issue Type: Bug
>  Components: canary, util
>Affects Versions: 2.0.0, 1.1.0.1
>Reporter: Elliott Clark
>Assignee: Sanjeev Srivatsa
>  Labels: beginner
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14145-v1.patch, HBASE-14145-v2.patch, 
> HBASE-14145-v2.patch, testrun.txt, testrun2.txt
>
>
> We want a pretty in-depth canary that will try every region on a cluster. 
> When doing that for the whole cluster, one machine is too slow, so we wanted 
> to split it up and have each regionserver run a canary. That works; however, 
> the canary does less work, as it just tries one random region.
> Let's add a flag that will allow the canary to try all regions on a 
> regionserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14441) HBase Backup/Restore Phase 2: Multiple RS per host support

2015-09-15 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14441:
-

 Summary: HBase Backup/Restore Phase 2: Multiple RS per host support
 Key: HBASE-14441
 URL: https://issues.apache.org/jira/browse/HBASE-14441
 Project: HBase
  Issue Type: New Feature
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14055) Untangle region - region server dependencies

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746394#comment-14746394
 ] 

stack commented on HBASE-14055:
---

Interested in getting this in, [~enis]. I like the WALServices bit... Or would 
it be better to first put up a target for where to take Region long term, to 
make sure this patch and others are driving toward it? (We have a 
RegionServices interface but we also have a Region interface... do we need 
both?)

> Untangle region - region server dependencies
> 
>
> Key: HBASE-14055
> URL: https://issues.apache.org/jira/browse/HBASE-14055
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.3.0
>
> Attachments: hbase-14055_v1.patch
>
>
> We need to untangle the region from the region server, so that regions can 
> be used outside of it. The parent jira is trying to create an embedded 
> library, like leveldb, out of the region so that it should be possible to 
> host a set of regions without being a regionserver. 
> From a layering point of view, region hosting and WAL services should be 
> abstracted. Hosting a region means threads for flushing, compaction, 
> cleanups, hosting a WAL, and a block cache. The RegionServer normally holds 
> these, together with the RpcServer and zk connection, heartbeats, and all 
> the other stuff that is required for running as a daemon inside a cluster. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14283) Reverse scan doesn’t work with HFile inline index/bloom blocks

2015-09-15 Thread Ben Lau (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746484#comment-14746484
 ] 

Ben Lau commented on HBASE-14283:
-

After talking to some committers in HBase, it seems that, unless there is a 
very strong case / no viable alternative, new patches to HBase should not 
require a full cluster restart.  Hence, we will be going with the 
2-rolling-restart approach as described above.  It requires the cluster 
operator to do 2 rolling restarts and set a new config, but that should not be 
too burdensome for a major upgrade.  This rolling-restart-compatible approach 
is a bit more messy/complicated code-wise, so let us look a bit into the best 
way to do this.

> Reverse scan doesn’t work with HFile inline index/bloom blocks
> --
>
> Key: HBASE-14283
> URL: https://issues.apache.org/jira/browse/HBASE-14283
> Project: HBase
>  Issue Type: Bug
>Reporter: Ben Lau
>Assignee: Ben Lau
> Attachments: HBASE-14283-v2.patch, HBASE-14283.patch, 
> hfile-seek-before.patch
>
>
> Reverse scans do not work if an HFile contains inline bloom blocks or leaf 
> level index blocks.  The reason is that the seekBefore() call calculates the 
> previous data block’s size by assuming data blocks are contiguous, which is 
> not the case in HFile V2 and beyond.
> Attached is a first cut patch (targeting 
> bcef28eefaf192b0ad48c8011f98b8e944340da5 on trunk) which includes:
> (1) a unit test which exposes the bug and demonstrates failures for both 
> inline bloom blocks and inline index blocks
> (2) a proposed fix for inline index blocks that does not require a new HFile 
> version change, but is only performant for 1 and 2-level indexes and not 3+.  
> 3+ requires an HFile format update for optimal performance.
> This patch does not fix the bloom filter blocks bug.  But the fix should be 
> similar to the case of inline index blocks.  The reason I haven’t made the 
> change yet is I want to confirm that you guys would be fine with me revising 
> the HFile.Reader interface.
> Specifically, these 2 functions (getGeneralBloomFilterMetadata and 
> getDeleteBloomFilterMetadata) need to return the BloomFilter.  Right now the 
> HFileReader class doesn’t have a reference to the bloom filters (and hence 
> their indices) and only constructs the IO streams and hence has no way to 
> know where the bloom blocks are in the HFile.  It seems that the HFile.Reader 
> bloom method comments state that they “know nothing about how that metadata 
> is structured” but I do not know if that is a requirement of the abstraction 
> (why?) or just an incidental current property. 
> We would like to do 3 things with community approval:
> (1) Update the HFile.Reader interface and implementation to contain and 
> return BloomFilters directly rather than unstructured IO streams
> (2) Merge the fixes for index blocks and bloom blocks into open source
> (3) Create a new Jira ticket for open source HBase to add a ‘prevBlockSize’ 
> field in the block header in the next HFile version, so that seekBefore() 
> calls can not only be correct but performant in all cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14287) Bootstrapping a cluster leaves temporary WAL directory laying around

2015-09-15 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14287:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Bootstrapping a cluster leaves temporary WAL directory laying around
> 
>
> Key: HBASE-14287
> URL: https://issues.apache.org/jira/browse/HBASE-14287
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Affects Versions: 1.0.2, 1.1.1
>Reporter: Lars George
>Priority: Minor
> Fix For: 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14287-branch-1.txt
>
>
> When a new cluster is started, it creates a temporary WAL as {{hbase:meta}} 
> is created while bootstrapping the system. This log is then closed before 
> being properly opened on a region server. The temp WAL file is scheduled for 
> removal, moved to oldWALs and eventually claimed. The issue is that the WAL 
> directory with the temp region is not removed. For example:
> {noformat}
> drwxr-xr-x   - hadoop hadoop  0 2015-05-28 10:21 
> /hbase/WALs/hregion-65589555
> {noformat}
> The directory is empty and does no harm, but on the other hand it is not 
> needed anymore and should be removed. Cosmetic, and good housekeeping.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746597#comment-14746597
 ] 

Enis Soztutar commented on HBASE-14383:
---

bq. Can we retire hbase.regionserver.maxlogs?
I am in favor of that, or keeping it as a safety net, but with a much higher 
default (128?). 

With default settings
{code}
hbase.regionserver.maxlogs=32
hbase.regionserver.hlog.blocksize=128MB
hbase.regionserver.logroll.multiplier=0.95
{code}
We can only have 32*128MB*0.95 = ~3.9GB of WAL entries. So, if you are running 
with 32GB heap and 0.4 memstore size, the memstore space is just left unused. 
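
Spelled out (a back-of-the-envelope sketch; the constants are the defaults 
quoted above):

{code}
public class WalCapacitySketch {
  public static void main(String[] args) {
    double walCapMB = 32 * 128 * 0.95;       // maxlogs * blocksize(MB) * multiplier
    double memstoreCapMB = 32 * 1024 * 0.4;  // 32GB heap * 0.4 global memstore fraction
    System.out.println("WAL capacity (MB): " + walCapMB);           // 3891.2, ~3.9GB
    System.out.println("memstore capacity (MB): " + memstoreCapMB); // 13107.2, ~12.8GB
  }
}
{code}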

Also, not related to compactions, but I have seen cases where there are not 
enough regions per region server to fill the whole memstore space with the 
128MB flush size: a few active regions and big heaps. We do not allow a 
memstore to grow beyond the flush limit, to guard against long flushes and 
long MTTR times. But my feeling is that maybe we can have a dynamically 
adjustable flush size, taking into account a min and a max flush size, and 
delay triggering the flush if there is more space. 
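
One way that idea could look (purely a sketch; the method name and the 
clamping policy here are assumptions, not a concrete proposal):

{code}
public class DynamicFlushSizeSketch {
  // Pick a flush size between a min and a max based on free global memstore space.
  static long chooseFlushSize(long minFlush, long maxFlush, long globalLimit,
      long globalUsed, int activeRegions) {
    long free = globalLimit - globalUsed;
    // Give each active region an equal share of the free space...
    long share = activeRegions > 0 ? free / activeRegions : maxFlush;
    // ...but clamp it: never below min, never above max (guards flush time/MTTR).
    return Math.max(minFlush, Math.min(maxFlush, share));
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    // Few active regions and a big heap: the flush size grows toward the max.
    long size = chooseFlushSize(128 * mb, 512 * mb, 12800 * mb, 1024 * mb, 4);
    System.out.println(size / mb + " MB"); // 512 MB
  }
}
{code}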


> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Still a major issue in many production environments. The general 
> recommendation is to disable region splitting and major compactions to 
> reduce unpredictable IO/CPU spikes, especially during peak times, and to run 
> them manually during off-peak times. Still, this does not resolve the issues 
> completely.
> h3. Flush storms
> * rolling WAL events across the cluster can be highly correlated, hence 
> flushing memstores, hence triggering minor compactions that can be promoted 
> to major ones. These events are highly correlated in time if there is a 
> balanced write load on the regions in a table.
> * the same is true for memstore flushing due to the periodic memstore 
> flusher operation. 
> Both of the above may produce *flush storms*, which are as bad as 
> *compaction storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit 
> the size of a compaction there is a config parameter one could use, 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? The selection with more files always wins; the selection of 
> smaller size wins if the number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13116) Enhance the documentation for usage of "doAs" through REST and Thrift gateways

2015-09-15 Thread Srikanth Srungarapu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746619#comment-14746619
 ] 

Srikanth Srungarapu commented on HBASE-13116:
-

Thanks!

> Enhance the documentation for usage of "doAs" through REST and Thrift gateways
> --
>
> Key: HBASE-13116
> URL: https://issues.apache.org/jira/browse/HBASE-13116
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Srikanth Srungarapu
>Assignee: Jerry He
>Priority: Minor
>
> The existing documentation on the instructions to use the "doAs" feature is 
> a bit misleading. A little more explanation about the overlapping 
> configurations for impersonation and doAs would make things smoother.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14400) Fix HBase RPC protection documentation

2015-09-15 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14400:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: (was: 1.2.1)
   0.98.15
   1.3.0
   1.2.0
   Status: Resolved  (was: Patch Available)

Pushed to 0.98 and up. 

> Fix HBase RPC protection documentation
> --
>
> Key: HBASE-14400
> URL: https://issues.apache.org/jira/browse/HBASE-14400
> Project: HBase
>  Issue Type: Bug
>  Components: encryption, rpc, security
>Reporter: Apekshit Sharma
>Assignee: Apekshit Sharma
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-14400-branch-0.98.patch, 
> HBASE-14400-branch-1.0.patch, HBASE-14400-branch-1.1.patch, 
> HBASE-14400-branch-1.2.patch, HBASE-14400-master-v2.patch, 
> HBASE-14400-master.patch
>
>
> HBase configuration 'hbase.rpc.protection' can be set to 'authentication', 
> 'integrity' or 'privacy'.
> "authentication means authentication only and no integrity or privacy; 
> integrity implies
> authentication and integrity are enabled; and privacy implies all of
> authentication, integrity and privacy are enabled."
> However, the HBase ref guide incorrectly suggests in some places setting the 
> value to 'auth-conf' instead of 'privacy'. Setting the value to 'auth-conf' 
> doesn't provide RPC encryption, which is what the user wants.
> This jira will fix:
> - documentation: change 'auth-conf' references to 'privacy'
> - SaslUtil to support both sets of values (privacy/integrity/authentication 
> and auth-conf/auth-int/auth) to be backward compatible with what was being 
> suggested until now.
> - change 'hbase.thrift.security.qop' to be consistent with other similar 
> configurations by using the same set of values 
> (privacy/integrity/authentication).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14439) New/Improved Filesystem Abstractions

2015-09-15 Thread Ben Lau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Lau updated HBASE-14439:

Description: 
Ticket for work in progress on new FileSystem abstractions.  Previously, we 
(Yahoo) submitted a ticket that would add support for humongous (1 million 
region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
is moving in a similar but not identical direction, and so the patch will not 
be merged into open source.

We will be working on a different patch now with folks from open source.  It 
will create/add to 2 layers-- a path abstraction layer and a use-oriented 
abstraction layer.  The path abstraction layer is epitomized by classes like 
FsUtils (and in the patch new classes like AFsLayout).  The use-oriented 
abstraction layer is epitomized by existing classes like 
MasterFileSystem/HRegionFileSystem (and possibly new classes later) that build 
on the path abstraction layer and focus on 'doing things' (e.g. creating 
regions) and less on the gritty details like the paths.

This work on abstracting and isolating the paths from the use cases will help 
Yahoo not diverge too much from open source with its internal 'Humongous' table 
hierarchical layout, while also helping open source move further towards the 
eventual goal of redoing the FS layout in a similar (but different) 
hierarchical layout later that focuses on data directory uniformity (unlike the 
humongous patch) and storing hierarchy in the meta table instead which enables 
new optimizations (see HBASE-14090.)

Attached to this ticket is some work we've done at Yahoo so far that will be 
put into an open source HBase branch for further collaboration.  The patch is 
not meant to be complete yet and is a work in progress.  (Please wait on patch 
comments/reviews.)  It also includes some Yahoo-specific 'humongous' layout 
code that will be removed before submission in open source.

  was:
Ticket for work in progress on new FileSystem abstractions.  Previously, we 
(Yahoo) submitted a ticket that would add support for humongous (1 million 
region+) tables via a hierarchical layout (HBASE-13991).  However open source 
is moving in a similar but not identical direction in the future and so the 
patch will not be merged into open source.

We will be working with Cloudera on a different patch now.  It will create/add 
to 2 layers-- a path abstraction layer and a use-oriented abstraction layer.  
The path abstraction layer is epitomized by classes like FsUtils (and in the 
patch new classes like AFsLayout).  The use oriented abstraction layer is 
epitomized by existing classes like MasterFileSystem/HRegionFileSystem (and 
possibly new classes later) that build on the path abstraction layer and focus 
on 'doing things' (eg creating regions) and less on the gritty details like the 
paths.

This work on abstracting and isolating the paths from the use cases will help 
Yahoo not diverge too much from open source with its internal 'Humongous' table 
hierarchical layout, while also helping open source move further towards the 
eventual goal of redoing the FS layout in a similar (but different) 
hierarchical layout later that focuses on data directory uniformity (unlike the 
humongous patch) and storing hierarchy in the meta table instead which enables 
new optimizations (see HBASE-14090.)

Attached to this ticket is some work we've done at Yahoo so far that will be 
put into an open source HBase branch for further collaboration.  The patch is 
not meant to be complete yet and is a work in progress.  (Please wait on patch 
comments/reviews.)  It also includes some Yahoo-specific 'humongous' layout 
code that will be removed before submission in open source.


> New/Improved Filesystem Abstractions
> 
>
> Key: HBASE-14439
> URL: https://issues.apache.org/jira/browse/HBASE-14439
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ben Lau
>Assignee: Matteo Bertozzi
> Attachments: abstraction.patch
>
>
> Ticket for work in progress on new FileSystem abstractions.  Previously, we 
> (Yahoo) submitted a ticket that would add support for humongous (1 million 
> region+) tables via a hierarchical layout (HBASE-13991).  However, open 
> source is moving in a similar but not identical direction, so that patch 
> will not be merged into open source.
> We will now be working on a different patch with folks from open source.  It 
> will create or add to two layers: a path abstraction layer and a 
> use-oriented abstraction layer.  The path abstraction layer is epitomized by 
> classes like FsUtils (and, in the patch, new classes like AFsLayout).  The 
> use-oriented abstraction layer is epitomized by existing classes like 
> MasterFileSystem/HRegionFileSystem (and possibly new classes later) that 
> build on 

[jira] [Commented] (HBASE-14411) Fix UT failures when using multiwal as default provider

2015-09-15 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746399#comment-14746399
 ] 

Ted Yu commented on HBASE-14411:


Ran failed tests locally and they passed.

+1

> Fix UT failures when using multiwal as default provider
> ---
>
> Key: HBASE-14411
> URL: https://issues.apache.org/jira/browse/HBASE-14411
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14411.patch, HBASE-14411_v2.patch
>
>
> If we set hbase.wal.provider to multiwal in 
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use 
> BoundedRegionGroupingProvider in UTs, we will observe the failures below in 
> the current code base:
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> 
> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 
> expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have 
> more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but 
> was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
> Expected: is 
>  but: was 
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; 
> one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've 
> triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if 
> skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the 
> archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594
>  if skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong 
> number of files in the archive log expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testMovedWALDuringRecovery:810->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testRetryOpenDuringRecovery:838->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
> {noformat}
> While the patch for HBASE-14306 could resolve the failures of 
> TestHLogRecordReader, TestReplicationSourceManager and 
> TestReplicationWALReaderManager, this JIRA will focus on resolving the 
> others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14394) Properly close the connection after reading records from table.

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746499#comment-14746499
 ] 

Hudson commented on HBASE-14394:


FAILURE: Integrated in HBase-TRUNK #6809 (See 
[https://builds.apache.org/job/HBase-TRUNK/6809/])
HBASE-14394 Properly close the connection after reading records from table. 
(ssrungarapu: rev 938d2a0c9cfa4c033ccc72de490672f151bb0351)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableRecordReader.java


> Properly close the connection after reading records from table.
> ---
>
> Key: HBASE-14394
> URL: https://issues.apache.org/jira/browse/HBASE-14394
> Project: HBase
>  Issue Type: Bug
>Reporter: Srikanth Srungarapu
>Assignee: Srikanth Srungarapu
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: HBASE-14394.patch, HBASE-14394_v2.patch, 
> HBASE-14394_v3.patch, HBASE-14394_v4.patch
>
>
> This was brought to our notice by one of our observant customers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14145) Allow the Canary in regionserver mode to try all regions on the server, not just one

2015-09-15 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-14145:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master, branch-1, and branch-1.2.
Fixed line length nit.

Thanks for the patch.

> Allow the Canary in regionserver mode to try all regions on the server, not 
> just one
> 
>
> Key: HBASE-14145
> URL: https://issues.apache.org/jira/browse/HBASE-14145
> Project: HBase
>  Issue Type: Bug
>  Components: canary, util
>Affects Versions: 2.0.0, 1.1.0.1
>Reporter: Elliott Clark
>Assignee: Sanjeev Srivatsa
>  Labels: beginner
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14145-v1.patch, HBASE-14145-v2.patch, 
> HBASE-14145-v2.patch, testrun.txt, testrun2.txt
>
>
> We want a pretty in-depth canary that will try every region on a cluster. 
> When doing that for the whole cluster, one machine is too slow, so we wanted 
> to split it up and have each regionserver run a canary. That works; however, 
> the canary does less work, as it just tries one random region.
> Let's add a flag that will allow the canary to try all regions on a 
> regionserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-15 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746564#comment-14746564
 ] 

Vladimir Rodionov commented on HBASE-14383:
---

Need feedback on the usefulness of the *hbase.regionserver.maxlogs* 
configuration setting.

LogRoller runs periodically (1h by default) and does two things:

# Archives old logs (WAL files whose WALEdits have all been flushed)
# Checks the number of active WAL files; if it exceeds 
hbase.regionserver.maxlogs, all regions that still have edits in the oldest 
WAL file are flushed. 

rollWriter from FSHLog:
{code}
  @Override
  public byte [][] rollWriter(boolean force) throws FailedLogCloseException, IOException {
    rollWriterLock.lock();
    try {
      // Return if nothing to flush.
      if (!force && (this.writer != null && this.numEntries.get() <= 0)) return null;
      byte [][] regionsToFlush = null;
      if (this.closed) {
        LOG.debug("WAL closed. Skipping rolling of writer");
        return regionsToFlush;
      }
      if (!closeBarrier.beginOp()) {
        LOG.debug("WAL closing. Skipping rolling of writer");
        return regionsToFlush;
      }
      TraceScope scope = Trace.startSpan("FSHLog.rollWriter");
      try {
        Path oldPath = getOldPath();
        Path newPath = getNewPath();
        // Any exception from here on is catastrophic, non-recoverable, so we currently abort.
        Writer nextWriter = this.createWriterInstance(newPath);
        FSDataOutputStream nextHdfsOut = null;
        if (nextWriter instanceof ProtobufLogWriter) {
          nextHdfsOut = ((ProtobufLogWriter) nextWriter).getStream();
          // If a ProtobufLogWriter, go ahead and try and sync to force setup of pipeline.
          // If this fails, we just keep going; it is an optimization, not the end of the world.
          preemptiveSync((ProtobufLogWriter) nextWriter);
        }
        tellListenersAboutPreLogRoll(oldPath, newPath);
        // NewPath could be equal to oldPath if replaceWriter fails.
        newPath = replaceWriter(oldPath, newPath, nextWriter, nextHdfsOut);
        tellListenersAboutPostLogRoll(oldPath, newPath);
        // Can we delete any of the old log files?
        if (getNumRolledLogFiles() > 0) {
          cleanOldLogs();
          regionsToFlush = findRegionsToForceFlush();
        }
      } finally {
        closeBarrier.endOp();
        assert scope == NullScope.INSTANCE || !scope.isDetached();
        scope.close();
      }
      return regionsToFlush;
    } finally {
      rollWriterLock.unlock();
    }
  }
{code}

There is clear duplication in functionality between LogRoller (LR) and 
PeriodicMemstoreFlusher (PMF). PMF already takes care of old memstores and 
flushes them - there is no need to call regionsToFlush = 
findRegionsToForceFlush() in a rollWriter call, and hence no need for the 
*hbase.regionserver.maxlogs* config option. PMF periodically flushes the 
oldest memstores, and LogRoller periodically archives old WAL files. That is it.
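
To make the overlap concrete, a self-contained sketch (all names and values 
hypothetical, heavily simplified from the real code paths) of the two 
mechanisms that can end up flushing the same old memstores:

{code}
import java.util.ArrayList;
import java.util.List;

public class FlushOverlapSketch {
  static class Region {
    final String name;
    final long oldestEditTs;   // timestamp of the oldest unflushed edit
    final int oldestWalIndex;  // index of the oldest WAL file holding its edits
    Region(String name, long oldestEditTs, int oldestWalIndex) {
      this.name = name;
      this.oldestEditTs = oldestEditTs;
      this.oldestWalIndex = oldestWalIndex;
    }
  }

  static final long FLUSH_CHECK_INTERVAL_MS = 3600 * 1000L; // periodic flusher, 1h
  static final int MAX_LOGS = 32; // hbase.regionserver.maxlogs

  // Path 1 (PMF): flush any memstore whose oldest edit is too old.
  static List<Region> periodicFlushCandidates(List<Region> regions, long now) {
    List<Region> out = new ArrayList<Region>();
    for (Region r : regions) {
      if (now - r.oldestEditTs > FLUSH_CHECK_INTERVAL_MS) {
        out.add(r);
      }
    }
    return out;
  }

  // Path 2 (LR): once the WAL count exceeds MAX_LOGS, force-flush every
  // region still holding edits in the oldest WAL so it can be archived.
  static List<Region> maxlogsFlushCandidates(List<Region> regions, int numWals, int oldestWal) {
    List<Region> out = new ArrayList<Region>();
    if (numWals <= MAX_LOGS) {
      return out;
    }
    for (Region r : regions) {
      if (r.oldestWalIndex == oldestWal) {
        out.add(r);
      }
    }
    return out;
  }
}
{code}

Under steady write load, every region the second path selects would eventually 
be selected by the first anyway, which is the duplication argued above.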


> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Still major issue in many production environments. The general recommendation 
> - disabling region splitting and major compactions to reduce unpredictable 
> IO/CPU spikes, especially during peak times and running them manually during 
> off peak times. Still do not resolve the issues completely.
> h3. Flush storms
> * rolling WAL events across cluster can be highly correlated, hence flushing 
> memstores, hence triggering minor compactions, that can be promoted to major 
> ones. These events are highly correlated in time if there is a balanced 
> write-load on the regions in a table.
> *  the same is true for memstore flushing due to periodic memstore flusher 
> operation. 
> Both above may produce *flush storms* which are as bad as *compaction 
> storms*. 
> What can be done here. We can spread these events over time by randomizing 
> (with jitter) several  config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs   
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit 
> size of a compaction there is a config parameter one could use 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: for peak and off peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? Selection with more files always wins, selection of smaller size 
> wins if number of files is the same. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14434) Merge of HBASE-7332 to 0.98 dropped a hunk

2015-09-15 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14434:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to 0.98 

> Merge of HBASE-7332 to 0.98 dropped a hunk
> --
>
> Key: HBASE-14434
> URL: https://issues.apache.org/jira/browse/HBASE-14434
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.11
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14434-0.98.patch
>
>
> The merge of HBASE-7332 to 0.98 dropped a hunk. Spotted by [~cuijianwei] 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry

2015-09-15 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14207:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to 0.98

> Region was hijacked and remained in transition when RS failed to open a 
> region and later regionplan changed to new RS on retry
> --
>
> Key: HBASE-14207
> URL: https://issues.apache.org/jira/browse/HBASE-14207
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.98.6
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 0.98.15
>
> Attachments: HBASE-14207-0.98-V2.patch, HBASE-14207-0.98-V2.patch, 
> HBASE-14207-0.98.patch
>
>
> In a production environment, the following events happened:
> 1. Master was trying to assign a region to an RS, but due to 
> KeeperException$SessionExpiredException the RS failed to open the region.
>   In the RS log, we saw multiple WARN logs related to 
> KeeperException$SessionExpiredException 
>   > KeeperErrorCode = Session expired for 
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
>   > Unable to get data of znode 
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
> 2. Master retried to assign the region to the same RS, but the RS failed again.
> 3. On the second retry a new plan was formed, and this time the plan 
> destination (RS) was different, so the master sent the request to the new RS 
> to open the region. But the new RS failed to open the region, as the server 
> recorded in the znode did not match the expected current server name. 
> Logs Snippet:
> {noformat}
> HM
> 2015-07-14 03:50:29,759 | INFO  | master:T101PC03VM13:21300 | Processing 
> 08f1935d652e5dbdac09b423b8f9401b in state: M_ZK_REGION_OFFLINE | 
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:644)
> 2015-07-14 03:50:29,759 | INFO  | master:T101PC03VM13:21300 | Transitioned 
> {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817029679, 
> server=null} to {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, 
> ts=1436817029759, server=T101PC03VM13,21302,1436816690692} | 
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:29,760 | INFO  | master:T101PC03VM13:21300 | Processed 
> region 08f1935d652e5dbdac09b423b8f9401b in state M_ZK_REGION_OFFLINE, on 
> server: T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:768)
> 2015-07-14 03:50:29,800 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:29,801 | WARN  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=1 
> of 10 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:29,802 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Trying to re-assign 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> the same failed server. | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2123)
> 2015-07-14 03:50:31,804 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:31,806 | WARN  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=2 
> of 10 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:31,807 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned 
> {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031804, 
> server=T101PC03VM13,21302,1436816690692} to {08f1935d652e5dbdac09b423b8f9401b 
> state=OFFLINE, ts=1436817031807, server=T101PC03VM13,21302,1436816690692} | 
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:31,807 | INFO  | 
> 

[jira] [Commented] (HBASE-14145) Allow the Canary in regionserver mode to try all regions on the server, not just one

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746743#comment-14746743
 ] 

Hudson commented on HBASE-14145:


FAILURE: Integrated in HBase-TRUNK #6810 (See 
[https://builds.apache.org/job/HBase-TRUNK/6810/])
HBASE-14145 added flag to canary to try all regions in regionserver mode 
(eclark: rev 903d876f29aeb11a290d0daed6e0778c8f4ac961)
* hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java


> Allow the Canary in regionserver mode to try all regions on the server, not 
> just one
> 
>
> Key: HBASE-14145
> URL: https://issues.apache.org/jira/browse/HBASE-14145
> Project: HBase
>  Issue Type: Bug
>  Components: canary, util
>Affects Versions: 2.0.0, 1.1.0.1
>Reporter: Elliott Clark
>Assignee: Sanjeev Srivatsa
>  Labels: beginner
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14145-v1.patch, HBASE-14145-v2.patch, 
> HBASE-14145-v2.patch, testrun.txt, testrun2.txt
>
>
> We want a pretty in-depth canary that will try every region on a cluster. 
> When doing that for the whole cluster, one machine is too slow, so we wanted 
> to split it up and have each regionserver run a canary. That works; however, 
> the canary does less work, as it just tries one random region.
> Let's add a flag that will allow the canary to try all regions on a 
> regionserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14411) Fix unit test failures when using multiwal as default WAL provider

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746744#comment-14746744
 ] 

Hudson commented on HBASE-14411:


FAILURE: Integrated in HBase-TRUNK #6810 (See 
[https://builds.apache.org/job/HBase-TRUNK/6810/])
HBASE-14411 Fix unit test failures when using multiwal as default WAL provider 
(Yu Li) (tedyu: rev 76f4e157adc12ef8ec6b9fd3eb3c60d3c648cb80)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestWALSplit.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DefaultWALProvider.java


> Fix unit test failures when using multiwal as default WAL provider
> --
>
> Key: HBASE-14411
> URL: https://issues.apache.org/jira/browse/HBASE-14411
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14411.patch, HBASE-14411_v2.patch
>
>
> If we set hbase.wal.provider to multiwal in 
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use 
> BoundedRegionGroupingProvider in UTs, we will observe the failures below in 
> the current code base:
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> 
> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 
> expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have 
> more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but 
> was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
> Expected: is 
>  but: was 
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; 
> one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've 
> triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if 
> skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the 
> archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594
>  if skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong 
> number of files in the archive log expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testMovedWALDuringRecovery:810->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testRetryOpenDuringRecovery:838->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
> {noformat}
> While the patch for HBASE-14306 could resolve the failures of 
> TestHLogRecordReader, TestReplicationSourceManager and 
> TestReplicationWALReaderManager, this JIRA will focus on resolving the 
> others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14145) Allow the Canary in regionserver mode to try all regions on the server, not just one

2015-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746460#comment-14746460
 ] 

Hadoop QA commented on HBASE-14145:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12756091/HBASE-14145-v2.patch
  against master branch at commit 938d2a0c9cfa4c033ccc72de490672f151bb0351.
  ATTACHMENT ID: 12756091

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1837 checkstyle errors (more than the master's current 1835 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+tasks.add(new RegionServerTask(this.connection, serverName, 
region, getSink(), successes));
+  tasks.add(new RegionServerTask(this.connection, serverName, region, 
getSink(), successes));

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.TestChoreService.testForceTrigger(TestChoreService.java:379)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15610//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15610//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15610//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15610//console

This message is automatically generated.

> Allow the Canary in regionserver mode to try all regions on the server, not 
> just one
> 
>
> Key: HBASE-14145
> URL: https://issues.apache.org/jira/browse/HBASE-14145
> Project: HBase
>  Issue Type: Bug
>  Components: canary, util
>Affects Versions: 2.0.0, 1.1.0.1
>Reporter: Elliott Clark
>Assignee: Sanjeev Srivatsa
>  Labels: beginner
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14145-v1.patch, HBASE-14145-v2.patch, 
> HBASE-14145-v2.patch, testrun.txt, testrun2.txt
>
>
> We want a pretty in-depth canary that will try every region on a cluster. 
> When doing that for the whole cluster, one machine is too slow, so we wanted 
> to split it up and have each regionserver run a canary. That works; however, 
> the canary does less work, as it just tries one random region.
> Let's add a flag that will allow the canary to try all regions on a 
> regionserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry

2015-09-15 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746528#comment-14746528
 ] 

Andrew Purtell commented on HBASE-14207:


Sorry this has languished a bit. Committing today assuming some local checks 
pass. 

> Region was hijacked and remained in transition when RS failed to open a 
> region and later regionplan changed to new RS on retry
> --
>
> Key: HBASE-14207
> URL: https://issues.apache.org/jira/browse/HBASE-14207
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.98.6
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 0.98.15
>
> Attachments: HBASE-14207-0.98-V2.patch, HBASE-14207-0.98-V2.patch, 
> HBASE-14207-0.98.patch
>
>
> In a production environment, the following events happened:
> 1. Master was trying to assign a region to an RS, but due to 
> KeeperException$SessionExpiredException the RS failed to open the region.
>   In the RS log, we saw multiple WARN logs related to 
> KeeperException$SessionExpiredException 
>   > KeeperErrorCode = Session expired for 
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
>   > Unable to get data of znode 
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
> 2. Master retried to assign the region to the same RS, but the RS failed again.
> 3. On the second retry a new plan was formed, and this time the plan 
> destination (RS) was different, so the master sent the request to the new RS 
> to open the region. But the new RS failed to open the region, as the server 
> recorded in the znode did not match the expected current server name. 
> Logs Snippet:
> {noformat}
> HM
> 2015-07-14 03:50:29,759 | INFO  | master:T101PC03VM13:21300 | Processing 
> 08f1935d652e5dbdac09b423b8f9401b in state: M_ZK_REGION_OFFLINE | 
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:644)
> 2015-07-14 03:50:29,759 | INFO  | master:T101PC03VM13:21300 | Transitioned 
> {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817029679, 
> server=null} to {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, 
> ts=1436817029759, server=T101PC03VM13,21302,1436816690692} | 
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:29,760 | INFO  | master:T101PC03VM13:21300 | Processed 
> region 08f1935d652e5dbdac09b423b8f9401b in state M_ZK_REGION_OFFLINE, on 
> server: T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:768)
> 2015-07-14 03:50:29,800 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:29,801 | WARN  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=1 
> of 10 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:29,802 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Trying to re-assign 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> the same failed server. | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2123)
> 2015-07-14 03:50:31,804 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:31,806 | WARN  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=2 
> of 10 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:31,807 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned 
> {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031804, 
> server=T101PC03VM13,21302,1436816690692} to {08f1935d652e5dbdac09b423b8f9401b 
> state=OFFLINE, ts=1436817031807, server=T101PC03VM13,21302,1436816690692} | 
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:31,807 | INFO  | 
> 

[jira] [Commented] (HBASE-12911) Client-side metrics

2015-09-15 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746612#comment-14746612
 ] 

Nick Dimiduk commented on HBASE-12911:
--

That's true, we are looking at "quite a few". On master:

{noformat}
--- deps.master.txt 2015-09-15 17:47:43.0 -0700
+++ deps.12911.txt  2015-09-15 17:47:54.0 -0700
@@ -7,13 +7,24 @@
 [INFO] --- maven-dependency-plugin:2.8:list (default-cli) @ hbase-client ---
 [INFO] 
 [INFO] The following files have been resolved:
+[INFO]aopalliance:aopalliance:jar:1.0:compile
+[INFO]asm:asm:jar:3.1:compile
 [INFO]com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1:compile
 [INFO]com.google.code.findbugs:jsr305:jar:1.3.9:compile
 [INFO]com.google.code.gson:gson:jar:2.2.4:compile
 [INFO]com.google.guava:guava:jar:12.0.1:compile
+[INFO]com.google.inject.extensions:guice-servlet:jar:3.0:compile
+[INFO]com.google.inject:guice:jar:3.0:compile
 [INFO]com.google.protobuf:protobuf-java:jar:2.5.0:compile
 [INFO]com.jcraft:jsch:jar:0.1.42:compile
+[INFO]com.sun.jersey.contribs:jersey-guice:jar:1.9:compile
+[INFO]com.sun.jersey:jersey-client:jar:1.9:compile
+[INFO]com.sun.jersey:jersey-core:jar:1.9:compile
+[INFO]com.sun.jersey:jersey-json:jar:1.9:compile
+[INFO]com.sun.jersey:jersey-server:jar:1.9:compile
+[INFO]com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile
 [INFO]com.thoughtworks.paranamer:paranamer:jar:2.3:compile
+[INFO]com.yammer.metrics:metrics-core:jar:2.2.0:compile
 [INFO]commons-beanutils:commons-beanutils-core:jar:1.8.0:compile
 [INFO]commons-beanutils:commons-beanutils:jar:1.7.0:compile
 [INFO]commons-cli:commons-cli:jar:1.2:compile
@@ -27,12 +38,17 @@
 [INFO]commons-logging:commons-logging:jar:1.2:compile
 [INFO]commons-net:commons-net:jar:3.1:compile
 [INFO]io.netty:netty-all:jar:4.0.30.Final:compile
+[INFO]javax.activation:activation:jar:1.1:compile
+[INFO]javax.inject:javax.inject:jar:1:compile
+[INFO]javax.servlet:servlet-api:jar:2.5:compile
+[INFO]javax.xml.bind:jaxb-api:jar:2.2.2:compile
 [INFO]jdk.tools:jdk.tools:jar:1.7:system
 [INFO]junit:junit:jar:4.12:compile
 [INFO]log4j:log4j:jar:1.2.17:test
 [INFO]org.apache.avro:avro:jar:1.7.4:compile
 [INFO]org.apache.commons:commons-compress:jar:1.4.1:compile
 [INFO]org.apache.commons:commons-math3:jar:3.1.1:compile
+[INFO]org.apache.commons:commons-math:jar:2.2:compile
 [INFO]org.apache.curator:curator-client:jar:2.7.1:compile
 [INFO]org.apache.curator:curator-framework:jar:2.7.1:compile
 [INFO]org.apache.curator:curator-recipes:jar:2.7.1:compile
@@ -43,17 +59,27 @@
 [INFO]org.apache.hadoop:hadoop-annotations:jar:2.7.1:compile
 [INFO]org.apache.hadoop:hadoop-auth:jar:2.7.1:compile
 [INFO]org.apache.hadoop:hadoop-common:jar:2.7.1:compile
+[INFO]org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.7.1:compile
+[INFO]org.apache.hadoop:hadoop-yarn-api:jar:2.7.1:compile
+[INFO]org.apache.hadoop:hadoop-yarn-common:jar:2.7.1:compile
 [INFO]org.apache.hbase:hbase-annotations:jar:2.0.0-SNAPSHOT:compile
 [INFO]org.apache.hbase:hbase-annotations:test-jar:tests:2.0.0-SNAPSHOT:test
 [INFO]org.apache.hbase:hbase-common:jar:2.0.0-SNAPSHOT:compile
 [INFO]org.apache.hbase:hbase-common:test-jar:tests:2.0.0-SNAPSHOT:test
+[INFO]org.apache.hbase:hbase-hadoop-compat:jar:2.0.0-SNAPSHOT:compile
+[INFO]
org.apache.hbase:hbase-hadoop-compat:test-jar:tests:2.0.0-SNAPSHOT:test
+[INFO]org.apache.hbase:hbase-hadoop2-compat:jar:2.0.0-SNAPSHOT:compile
+[INFO]
org.apache.hbase:hbase-hadoop2-compat:test-jar:tests:2.0.0-SNAPSHOT:test
 [INFO]org.apache.hbase:hbase-protocol:jar:2.0.0-SNAPSHOT:compile
 [INFO]org.apache.htrace:htrace-core:jar:3.1.0-incubating:compile
 [INFO]org.apache.httpcomponents:httpclient:jar:4.2.5:compile
 [INFO]org.apache.httpcomponents:httpcore:jar:4.2.4:compile
 [INFO]org.apache.zookeeper:zookeeper:jar:3.4.6:compile
 [INFO]org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile
+[INFO]org.codehaus.jackson:jackson-jaxrs:jar:1.9.13:compile
 [INFO]org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile
+[INFO]org.codehaus.jackson:jackson-xc:jar:1.9.13:compile
+[INFO]org.codehaus.jettison:jettison:jar:1.3.3:compile
 [INFO]org.hamcrest:hamcrest-core:jar:1.3:test
 [INFO]org.hamcrest:hamcrest-library:jar:1.1:test
 [INFO]org.jmock:jmock-junit4:jar:2.6.0:test
{noformat}

and on branch-1 backport:

{noformat}
@@ -13,6 +13,7 @@
 [INFO]com.google.protobuf:protobuf-java:jar:2.5.0:compile
 [INFO]com.jcraft:jsch:jar:0.1.42:compile
 [INFO]com.thoughtworks.paranamer:paranamer:jar:2.3:compile
+[INFO]com.yammer.metrics:metrics-core:jar:2.2.0:compile
 [INFO]commons-beanutils:commons-beanutils-core:jar:1.8.0:compile
 [INFO]

[jira] [Commented] (HBASE-14207) Region was hijacked and remained in transition when RS failed to open a region and later regionplan changed to new RS on retry

2015-09-15 Thread Pankaj Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746717#comment-14746717
 ] 

Pankaj Kumar commented on HBASE-14207:
--

Thanks [~apurtell] :)

> Region was hijacked and remained in transition when RS failed to open a 
> region and later regionplan changed to new RS on retry
> --
>
> Key: HBASE-14207
> URL: https://issues.apache.org/jira/browse/HBASE-14207
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.98.6
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 0.98.15
>
> Attachments: HBASE-14207-0.98-V2.patch, HBASE-14207-0.98-V2.patch, 
> HBASE-14207-0.98.patch
>
>
> In a production environment, the following events happened:
> 1. Master was trying to assign a region to an RS, but due to 
> KeeperException$SessionExpiredException the RS failed to open the region.
>   In the RS log, we saw multiple WARN logs related to 
> KeeperException$SessionExpiredException 
>   > KeeperErrorCode = Session expired for 
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
>   > Unable to get data of znode 
> /hbase/region-in-transition/08f1935d652e5dbdac09b423b8f9401b
> 2. Master retried to assign the region to the same RS, but the RS failed again.
> 3. On the second retry a new plan was formed, and this time the plan 
> destination (RS) was different, so the master sent the request to the new RS 
> to open the region. But the new RS failed to open the region, as the server 
> recorded in the znode did not match the expected current server name. 
> Logs Snippet:
> {noformat}
> HM
> 2015-07-14 03:50:29,759 | INFO  | master:T101PC03VM13:21300 | Processing 
> 08f1935d652e5dbdac09b423b8f9401b in state: M_ZK_REGION_OFFLINE | 
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:644)
> 2015-07-14 03:50:29,759 | INFO  | master:T101PC03VM13:21300 | Transitioned 
> {08f1935d652e5dbdac09b423b8f9401b state=OFFLINE, ts=1436817029679, 
> server=null} to {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, 
> ts=1436817029759, server=T101PC03VM13,21302,1436816690692} | 
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:29,760 | INFO  | master:T101PC03VM13:21300 | Processed 
> region 08f1935d652e5dbdac09b423b8f9401b in state M_ZK_REGION_OFFLINE, on 
> server: T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:768)
> 2015-07-14 03:50:29,800 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:29,801 | WARN  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=1 
> of 10 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:29,802 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Trying to re-assign 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> the same failed server. | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2123)
> 2015-07-14 03:50:31,804 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1983)
> 2015-07-14 03:50:31,806 | WARN  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Failed assignment of 
> INTER_CONCURRENCY_SETTING,,1436596137981.08f1935d652e5dbdac09b423b8f9401b. to 
> T101PC03VM13,21302,1436816690692, trying to assign elsewhere instead; try=2 
> of 10 | 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2077)
> 2015-07-14 03:50:31,807 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Transitioned 
> {08f1935d652e5dbdac09b423b8f9401b state=PENDING_OPEN, ts=1436817031804, 
> server=T101PC03VM13,21302,1436816690692} to {08f1935d652e5dbdac09b423b8f9401b 
> state=OFFLINE, ts=1436817031807, server=T101PC03VM13,21302,1436816690692} | 
> org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:327)
> 2015-07-14 03:50:31,807 | INFO  | 
> MASTER_SERVER_OPERATIONS-T101PC03VM13:21300-3 | Assigning 
> 

[jira] [Created] (HBASE-14440) Restore to snapshot

2015-09-15 Thread Vladimir Rodionov (JIRA)
Vladimir Rodionov created HBASE-14440:
-

 Summary: Restore to snapshot
 Key: HBASE-14440
 URL: https://issues.apache.org/jira/browse/HBASE-14440
 Project: HBase
  Issue Type: New Feature
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
 Fix For: 2.0.0


Good-to-have feature: restore a backup to a snapshot. This will allow 
massaging data with a custom M/R job over the snapshot: restore a time range 
only, or a particular key range (or ranges).
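
For the M/R-over-snapshot part, a minimal sketch using the existing 
TableSnapshotInputFormat machinery (snapshot name, paths, and ranges are 
illustrative; the time/key-range restore itself is the new work proposed here):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SnapshotScanJob {

  // Trivial mapper; the actual data "massaging" would go here.
  static class MassageMapper extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context ctx) {
      // inspect/rewrite cells read from the snapshot
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "massage-snapshot");
    job.setJarByClass(SnapshotScanJob.class);

    // Restrict the scan to the time range / key range of interest.
    Scan scan = new Scan();
    scan.setTimeRange(0L, System.currentTimeMillis());
    scan.setStartRow(Bytes.toBytes("row-a"));
    scan.setStopRow(Bytes.toBytes("row-z"));

    // Reads the snapshot's files directly; needs a temp restore dir on HDFS.
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "my-backup-snapshot", scan, MassageMapper.class,
        ImmutableBytesWritable.class, Result.class, job,
        true, new Path("/tmp/snapshot-restore"));
    job.setOutputFormatClass(NullOutputFormat.class);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
{code}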



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14433) Set down the client executor core thread count from 256 to number of processors

2015-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746462#comment-14746462
 ] 

Hadoop QA commented on HBASE-14433:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12756104/14433v3.txt
  against master branch at commit 938d2a0c9cfa4c033ccc72de490672f151bb0351.
  ATTACHMENT ID: 12756104

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15611//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15611//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15611//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15611//console

This message is automatically generated.

> Set down the client executor core thread count from 256 to number of 
> processors
> ---
>
> Key: HBASE-14433
> URL: https://issues.apache.org/jira/browse/HBASE-14433
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 14433 (1).txt, 14433.txt, 14433v2.txt, 14433v3.txt, 
> 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt
>
>
> HBASE-10449 upped our core count from 0 to 256 (max is 256). Looking in a 
> recent test run core dump, I see up to 256 threads per client, and all are 
> idle. At a minimum, it makes reading test thread dumps hard. Trying to learn 
> more about why we went with a core count of 256 over in HBASE-10449. 
> Meantime, will try setting down the configs for tests.
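
A sketch of the direction with plain java.util.concurrent (values 
illustrative, not the actual HBase pool wiring): core size = processor count, 
max kept at 256, idle threads allowed to die.

{code}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ClientPoolSketch {
  public static ThreadPoolExecutor newClientPool() {
    int core = Runtime.getRuntime().availableProcessors(); // instead of 256
    int max = 256;                                         // keep the ceiling
    // SynchronousQueue so the pool actually grows toward max under load
    // instead of queueing behind the core threads.
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        core, max, 60L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
    // Let even core threads time out when idle, so thread dumps stay readable.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }
}
{code}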



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14400) Fix HBase RPC protection documentation

2015-09-15 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746531#comment-14746531
 ] 

Andrew Purtell commented on HBASE-14400:


Thanks for the ping [~appy]. Let me commit this today assuming some quick local 
checks pan out.

> Fix HBase RPC protection documentation
> --
>
> Key: HBASE-14400
> URL: https://issues.apache.org/jira/browse/HBASE-14400
> Project: HBase
>  Issue Type: Bug
>  Components: encryption, rpc, security
>Reporter: Apekshit Sharma
>Assignee: Apekshit Sharma
>Priority: Critical
> Fix For: 2.0.0, 1.2.1, 1.0.3, 1.1.3
>
> Attachments: HBASE-14400-branch-0.98.patch, 
> HBASE-14400-branch-1.0.patch, HBASE-14400-branch-1.1.patch, 
> HBASE-14400-branch-1.2.patch, HBASE-14400-master-v2.patch, 
> HBASE-14400-master.patch
>
>
> HBase configuration 'hbase.rpc.protection' can be set to 'authentication', 
> 'integrity' or 'privacy'.
> "authentication means authentication only and no integrity or privacy; 
> integrity implies
> authentication and integrity are enabled; and privacy implies all of
> authentication, integrity and privacy are enabled."
> However, the hbase ref guide incorrectly suggests in some places setting the 
> value to 'auth-conf' instead of 'privacy'. Setting the value to 'auth-conf' 
> doesn't provide RPC encryption, which is what the user wants.
> This jira will fix:
> - documentation: change 'auth-conf' references to 'privacy'
> - SaslUtil: support both sets of values (privacy/integrity/authentication 
> and auth-conf/auth-int/auth), to be backward compatible with what was being 
> suggested until now.
> - change 'hbase.thrift.security.qop' to be consistent with other similar 
> configurations by using the same set of values 
> (privacy/integrity/authentication).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-13116) Enhance the documentation for usage of "doAs" through REST and Thrift gateways

2015-09-15 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He reassigned HBASE-13116:


Assignee: Jerry He

> Enhance the documentation for usage of "doAs" through REST and Thrift gateways
> --
>
> Key: HBASE-13116
> URL: https://issues.apache.org/jira/browse/HBASE-13116
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Srikanth Srungarapu
>Assignee: Jerry He
>Priority: Minor
>
> The existing documentation of the instructions for using the "doAs" feature 
> is a bit misleading. A little more explanation of the overlapping 
> configurations for impersonation and doAs would make things smoother.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14145) Allow the Canary in regionserver mode to try all regions on the server, not just one

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746602#comment-14746602
 ] 

Hudson commented on HBASE-14145:


SUCCESS: Integrated in HBase-1.2-IT #148 (See 
[https://builds.apache.org/job/HBase-1.2-IT/148/])
HBASE-14145 added flag to canary to try all regions in regionserver mode 
(eclark: rev a59579c86671f931b249523b5e92d002dfa69446)
* hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java


> Allow the Canary in regionserver mode to try all regions on the server, not 
> just one
> 
>
> Key: HBASE-14145
> URL: https://issues.apache.org/jira/browse/HBASE-14145
> Project: HBase
>  Issue Type: Bug
>  Components: canary, util
>Affects Versions: 2.0.0, 1.1.0.1
>Reporter: Elliott Clark
>Assignee: Sanjeev Srivatsa
>  Labels: beginner
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14145-v1.patch, HBASE-14145-v2.patch, 
> HBASE-14145-v2.patch, testrun.txt, testrun2.txt
>
>
> We want a pretty in-depth canary that will try every region on a cluster. 
> When doing that for the whole cluster, one machine is too slow, so we wanted 
> to split it up and have each regionserver run a canary. That works; however, 
> the canary does less work, as it just tries one random region.
> Let's add a flag that will allow the canary to try all regions on a 
> regionserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13116) Enhance the documentation for usage of "doAs" through REST and Thrift gateways

2015-09-15 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746604#comment-14746604
 ] 

Jerry He commented on HBASE-13116:
--

I will find time to post a patch for this one.

> Enhance the documentation for usage of "doAs" through REST and Thrift gateways
> --
>
> Key: HBASE-13116
> URL: https://issues.apache.org/jira/browse/HBASE-13116
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Srikanth Srungarapu
>Assignee: Jerry He
>Priority: Minor
>
> The existing documentation of the instructions for using the "doAs" feature 
> is a bit misleading. A little more explanation of the overlapping 
> configurations for impersonation and doAs would make things smoother.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14145) Allow the Canary in regionserver mode to try all regions on the server, not just one

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746712#comment-14746712
 ] 

Hudson commented on HBASE-14145:


FAILURE: Integrated in HBase-1.3 #176 (See 
[https://builds.apache.org/job/HBase-1.3/176/])
HBASE-14145 added flag to canary to try all regions in regionserver mode 
(eclark: rev 9e7f9b621ac759d3b6a6569c7a79529a2bf910ee)
* hbase-server/src/main/java/org/apache/hadoop/hbase/tool/Canary.java


> Allow the Canary in regionserver mode to try all regions on the server, not 
> just one
> 
>
> Key: HBASE-14145
> URL: https://issues.apache.org/jira/browse/HBASE-14145
> Project: HBase
>  Issue Type: Bug
>  Components: canary, util
>Affects Versions: 2.0.0, 1.1.0.1
>Reporter: Elliott Clark
>Assignee: Sanjeev Srivatsa
>  Labels: beginner
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14145-v1.patch, HBASE-14145-v2.patch, 
> HBASE-14145-v2.patch, testrun.txt, testrun2.txt
>
>
> We want a pretty in-depth canary that will try every region on a cluster. 
> When doing that for the whole cluster, one machine is too slow, so we wanted 
> to split it up and have each regionserver run a canary. That works; however, 
> the canary does less work, as it just tries one random region.
> Let's add a flag that will allow the canary to try all regions on a 
> regionserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12911) Client-side metrics

2015-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746306#comment-14746306
 ] 

Hadoop QA commented on HBASE-12911:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12756082/12911-branch-1.00.patch
  against branch-1.0 branch at commit 938d2a0c9cfa4c033ccc72de490672f151bb0351.
  ATTACHMENT ID: 12756082

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 41 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15609//console

This message is automatically generated.

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-branch-1.00.patch, am.jpg, 
> client metrics RS-Master.jpg, client metrics client.jpg, conn_agg.jpg, 
> connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. The client is a crucial part of the performance of this distributed 
> system, so we should have deeper visibility into its function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in an MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?
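
For a concrete flavor, a minimal sketch of such an interface-based system, 
assuming hypothetical names (MetricsCollector, NoOpCollector, timedCall) that 
are illustrative only and not a committed API:
{code}
import java.util.concurrent.Callable;

// Pluggable sink for client-side metrics.
interface MetricsCollector {
  void updateTimer(String name, long durationMs);
}

// A no-op default keeps the client dependency-free when metrics are unwanted;
// a hadoop-metrics or dropwizard/metrics implementation would be plugged in
// where those dependencies are acceptable.
final class NoOpCollector implements MetricsCollector {
  @Override public void updateTimer(String name, long durationMs) {}
}

final class Timed {
  // The client wraps its own operations, so users no longer have to time
  // Table calls with System.currentTimeMillis() by hand.
  static <T> T timedCall(MetricsCollector c, String name, Callable<T> op)
      throws Exception {
    long start = System.currentTimeMillis();
    try {
      return op.call();
    } finally {
      c.updateTimer(name, System.currentTimeMillis() - start);
    }
  }
}
{code}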



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12911) Client-side metrics

2015-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746339#comment-14746339
 ] 

Hadoop QA commented on HBASE-12911:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12756071/0001-HBASE-12911-Client-side-metrics.patch
  against master branch at commit 938d2a0c9cfa4c033ccc72de490672f151bb0351.
  ATTACHMENT ID: 12756071

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 36 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.ipc.TestRpcClientLeaks

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15607//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15607//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15607//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15607//console

This message is automatically generated.

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-branch-1.00.patch, am.jpg, 
> client metrics RS-Master.jpg, client metrics client.jpg, conn_agg.jpg, 
> connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. The client is a crucial part of the performance of this distributed 
> system, so we should have deeper visibility into its function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in an MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12911) Client-side metrics

2015-09-15 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746341#comment-14746341
 ] 

Elliott Clark commented on HBASE-12911:
---

This is going to pull in a BOAT load of new dependencies to the client. Are we 
sure that we can/should do this on a point release?

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-branch-1.00.patch, am.jpg, 
> client metrics RS-Master.jpg, client metrics client.jpg, conn_agg.jpg, 
> connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. The client is a crucial part of the performance of this distributed 
> system, so we should have deeper visibility into its function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in an MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14439) New Filesystem Abstraction Layer

2015-09-15 Thread Ben Lau (JIRA)
Ben Lau created HBASE-14439:
---

 Summary: New Filesystem Abstraction Layer
 Key: HBASE-14439
 URL: https://issues.apache.org/jira/browse/HBASE-14439
 Project: HBase
  Issue Type: New Feature
Reporter: Ben Lau


Ticket for work in progress on new FileSystem abstractions.  Previously, we 
(Yahoo) submitted a ticket that would add support for humongous (1 million 
region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
is moving in a similar but not identical direction, so the patch will not be 
merged.

We will be working with Cloudera on a different patch now.  It will create/add 
to two layers: a path abstraction layer and a use-oriented abstraction layer.  
The path abstraction layer is epitomized by classes like FsUtils (and, in the 
patch, new classes like AFsLayout).  The use-oriented abstraction layer is 
epitomized by existing classes like MasterFileSystem/HRegionFileSystem (and 
possibly new classes later) that build on the path abstraction layer and focus 
on 'doing things' (e.g. creating regions) and less on gritty details like the 
paths.

This work on abstracting and isolating the paths from the use cases will help 
Yahoo not diverge too much from open source with its internal 'Humongous' table 
hierarchical layout, while also helping open source move further towards the 
eventual goal of redoing the FS layout in a similar (but different) 
hierarchical layout later that focuses on data directory uniformity and stores 
the hierarchy in the meta table instead (see HBASE-14090).

Attached to this ticket is some work we've done at Yahoo so far that will be 
put into an open source HBase branch for further collaboration.  The patch is 
not meant to be complete yet and is a work in progress.  (Please wait on patch 
comments/reviews.)
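
A rough sketch of how the two layers could relate; the interface and method 
names below (FsLayout, getRegionDir, and the two layout classes) are 
hypothetical stand-ins for illustration, not the actual patch contents:
{code}
import org.apache.hadoop.fs.Path;

// Path abstraction layer: knows only how names map to filesystem paths.
interface FsLayout {
  Path getTableDir(Path rootDir, String tableName);
  Path getRegionDir(Path tableDir, String encodedRegionName);
}

// Classic flat layout: every region sits directly under the table dir.
class FlatFsLayout implements FsLayout {
  public Path getTableDir(Path rootDir, String tableName) {
    return new Path(rootDir, tableName);
  }
  public Path getRegionDir(Path tableDir, String encodedRegionName) {
    return new Path(tableDir, encodedRegionName);
  }
}

// Hierarchical layout: bucket regions by a prefix of the encoded name so a
// humongous table does not put a million entries in a single directory.
class HierarchicalFsLayout implements FsLayout {
  public Path getTableDir(Path rootDir, String tableName) {
    return new Path(rootDir, tableName);
  }
  public Path getRegionDir(Path tableDir, String encodedRegionName) {
    return new Path(new Path(tableDir, encodedRegionName.substring(0, 2)),
        encodedRegionName);
  }
}
{code}
The use-oriented layer (MasterFileSystem/HRegionFileSystem) would then resolve 
every path through the layout object and never assemble paths itself.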



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14439) New Filesystem Abstraction Layer

2015-09-15 Thread Ben Lau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Lau updated HBASE-14439:

Attachment: abstraction.patch

> New Filesystem Abstraction Layer
> 
>
> Key: HBASE-14439
> URL: https://issues.apache.org/jira/browse/HBASE-14439
> Project: HBase
>  Issue Type: New Feature
>Reporter: Ben Lau
> Attachments: abstraction.patch
>
>
> Ticket for work in progress on new FileSystem abstractions.  Previously, we 
> (Yahoo) submitted a ticket that would add support for humongous (1 million 
> region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
> is moving in a similar but not identical direction, so the patch will not be 
> merged.
> We will be working with Cloudera on a different patch now.  It will 
> create/add to two layers: a path abstraction layer and a use-oriented 
> abstraction layer.  The path abstraction layer is epitomized by classes like 
> FsUtils (and, in the patch, new classes like AFsLayout).  The use-oriented 
> abstraction layer is epitomized by existing classes like 
> MasterFileSystem/HRegionFileSystem (and possibly new classes later) that 
> build on the path abstraction layer and focus on 'doing things' (e.g. 
> creating regions) and less on gritty details like the paths.
> This work on abstracting and isolating the paths from the use cases will help 
> Yahoo not diverge too much from open source with its internal 'Humongous' 
> table hierarchical layout, while also helping open source move further 
> towards the eventual goal of redoing the FS layout in a similar (but 
> different) hierarchical layout later that focuses on data directory 
> uniformity and stores the hierarchy in the meta table instead (see 
> HBASE-14090).
> Attached to this ticket is some work we've done at Yahoo so far that will be 
> put into an open source HBase branch for further collaboration.  The patch is 
> not meant to be complete yet and is a work in progress.  (Please wait on 
> patch comments/reviews.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14439) New/Improved Filesystem Abstractions

2015-09-15 Thread Ben Lau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Lau updated HBASE-14439:

Description: 
Ticket for work in progress on new FileSystem abstractions.  Previously, we 
(Yahoo) submitted a ticket that would add support for humongous (1 million 
region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
is moving in a similar but not identical direction, so the patch will not be 
merged into open source.

We will be working with Cloudera on a different patch now.  It will create/add 
to two layers: a path abstraction layer and a use-oriented abstraction layer.  
The path abstraction layer is epitomized by classes like FsUtils (and, in the 
patch, new classes like AFsLayout).  The use-oriented abstraction layer is 
epitomized by existing classes like MasterFileSystem/HRegionFileSystem (and 
possibly new classes later) that build on the path abstraction layer and focus 
on 'doing things' (e.g. creating regions) and less on gritty details like the 
paths.

This work on abstracting and isolating the paths from the use cases will help 
Yahoo not diverge too much from open source with its internal 'Humongous' table 
hierarchical layout, while also helping open source move further towards the 
eventual goal of redoing the FS layout in a similar (but different) 
hierarchical layout later that focuses on data directory uniformity (unlike the 
humongous patch) and stores the hierarchy in the meta table instead, which 
enables new optimizations (see HBASE-14090).

Attached to this ticket is some work we've done at Yahoo so far that will be 
put into an open source HBase branch for further collaboration.  The patch is 
not meant to be complete yet and is a work in progress.  (Please wait on patch 
comments/reviews.)

  was:
Ticket for work in progress on new FileSystem abstractions.  Previously, we 
(Yahoo) submitted a ticket that would add support for humongous (1 million 
region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
is moving in a similar but not identical direction, so the patch will not be 
merged.

We will be working with Cloudera on a different patch now.  It will create/add 
to two layers: a path abstraction layer and a use-oriented abstraction layer.  
The path abstraction layer is epitomized by classes like FsUtils (and, in the 
patch, new classes like AFsLayout).  The use-oriented abstraction layer is 
epitomized by existing classes like MasterFileSystem/HRegionFileSystem (and 
possibly new classes later) that build on the path abstraction layer and focus 
on 'doing things' (e.g. creating regions) and less on gritty details like the 
paths.

This work on abstracting and isolating the paths from the use cases will help 
Yahoo not diverge too much from open source with its internal 'Humongous' table 
hierarchical layout, while also helping open source move further towards the 
eventual goal of redoing the FS layout in a similar (but different) 
hierarchical layout later that focuses on data directory uniformity and stores 
the hierarchy in the meta table instead (see HBASE-14090).

Attached to this ticket is some work we've done at Yahoo so far that will be 
put into an open source HBase branch for further collaboration.  The patch is 
not meant to be complete yet and is a work in progress.  (Please wait on patch 
comments/reviews.)

Summary: New/Improved Filesystem Abstractions  (was: New Filesystem 
Abstraction Layer)

> New/Improved Filesystem Abstractions
> 
>
> Key: HBASE-14439
> URL: https://issues.apache.org/jira/browse/HBASE-14439
> Project: HBase
>  Issue Type: New Feature
>Reporter: Ben Lau
> Attachments: abstraction.patch
>
>
> Ticket for work in progress on new FileSystem abstractions.  Previously, we 
> (Yahoo) submitted a ticket that would add support for humongous (1 million 
> region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
> is moving in a similar but not identical direction, so the patch will not be 
> merged into open source.
> We will be working with Cloudera on a different patch now.  It will 
> create/add to two layers: a path abstraction layer and a use-oriented 
> abstraction layer.  The path abstraction layer is epitomized by classes like 
> FsUtils (and, in the patch, new classes like AFsLayout).  The use-oriented 
> abstraction layer is epitomized by existing classes like 
> MasterFileSystem/HRegionFileSystem (and possibly new classes later) that 
> build on the path abstraction layer and focus on 'doing things' (e.g. 
> creating regions) and less on gritty details like the paths.
> This work on abstracting and isolating the paths from the use cases will help 
> Yahoo not diverge too much from open source with its internal 'Humongous' 

[jira] [Assigned] (HBASE-14439) New/Improved Filesystem Abstractions

2015-09-15 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi reassigned HBASE-14439:
---

Assignee: Matteo Bertozzi

> New/Improved Filesystem Abstractions
> 
>
> Key: HBASE-14439
> URL: https://issues.apache.org/jira/browse/HBASE-14439
> Project: HBase
>  Issue Type: New Feature
>Reporter: Ben Lau
>Assignee: Matteo Bertozzi
> Attachments: abstraction.patch
>
>
> Ticket for work in progress on new FileSystem abstractions.  Previously, we 
> (Yahoo) submitted a ticket that would add support for humongous (1 million 
> region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
> is moving in a similar but not identical direction, so the patch will not be 
> merged into open source.
> We will be working with Cloudera on a different patch now.  It will 
> create/add to two layers: a path abstraction layer and a use-oriented 
> abstraction layer.  The path abstraction layer is epitomized by classes like 
> FsUtils (and, in the patch, new classes like AFsLayout).  The use-oriented 
> abstraction layer is epitomized by existing classes like 
> MasterFileSystem/HRegionFileSystem (and possibly new classes later) that 
> build on the path abstraction layer and focus on 'doing things' (e.g. 
> creating regions) and less on gritty details like the paths.
> This work on abstracting and isolating the paths from the use cases will help 
> Yahoo not diverge too much from open source with its internal 'Humongous' 
> table hierarchical layout, while also helping open source move further 
> towards the eventual goal of redoing the FS layout in a similar (but 
> different) hierarchical layout later that focuses on data directory 
> uniformity (unlike the humongous patch) and stores the hierarchy in the meta 
> table instead, which enables new optimizations (see HBASE-14090).
> Attached to this ticket is some work we've done at Yahoo so far that will be 
> put into an open source HBase branch for further collaboration.  The patch is 
> not meant to be complete yet and is a work in progress.  (Please wait on 
> patch comments/reviews.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14439) New/Improved Filesystem Abstractions

2015-09-15 Thread Ben Lau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Lau updated HBASE-14439:

Description: 
Ticket for work in progress on new FileSystem abstractions.  Previously, we 
(Yahoo) submitted a ticket that would add support for humongous (1 million 
region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
is moving in a similar but not identical direction, so the patch will not be 
merged into open source.

We will be working with Cloudera on a different patch now.  It will create/add 
to two layers: a path abstraction layer and a use-oriented abstraction layer.  
The path abstraction layer is epitomized by classes like FsUtils (and, in the 
patch, new classes like AFsLayout).  The use-oriented abstraction layer is 
epitomized by existing classes like MasterFileSystem/HRegionFileSystem (and 
possibly new classes later) that build on the path abstraction layer and focus 
on 'doing things' (e.g. creating regions) and less on gritty details like the 
paths.

This work on abstracting and isolating the paths from the use cases will help 
Yahoo not diverge too much from open source with its internal 'Humongous' table 
hierarchical layout, while also helping open source move further towards the 
eventual goal of redoing the FS layout in a similar (but different) 
hierarchical layout later that focuses on data directory uniformity (unlike the 
humongous patch) and stores the hierarchy in the meta table instead, which 
enables new optimizations (see HBASE-14090).

Attached to this ticket is some work we've done at Yahoo so far that will be 
put into an open source HBase branch for further collaboration.  The patch is 
not meant to be complete yet and is a work in progress.  (Please wait on patch 
comments/reviews.)  It also includes some Yahoo-specific 'humongous' layout 
code that will be removed before submission in open source.

  was:
Ticket for work in progress on new FileSystem abstractions.  Previously, we 
(Yahoo) submitted a ticket that would add support for humongous (1 million 
region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
is moving in a similar but not identical direction, so the patch will not be 
merged into open source.

We will be working with Cloudera on a different patch now.  It will create/add 
to two layers: a path abstraction layer and a use-oriented abstraction layer.  
The path abstraction layer is epitomized by classes like FsUtils (and, in the 
patch, new classes like AFsLayout).  The use-oriented abstraction layer is 
epitomized by existing classes like MasterFileSystem/HRegionFileSystem (and 
possibly new classes later) that build on the path abstraction layer and focus 
on 'doing things' (e.g. creating regions) and less on gritty details like the 
paths.

This work on abstracting and isolating the paths from the use cases will help 
Yahoo not diverge too much from open source with its internal 'Humongous' table 
hierarchical layout, while also helping open source move further towards the 
eventual goal of redoing the FS layout in a similar (but different) 
hierarchical layout later that focuses on data directory uniformity (unlike the 
humongous patch) and stores the hierarchy in the meta table instead, which 
enables new optimizations (see HBASE-14090).

Attached to this ticket is some work we've done at Yahoo so far that will be 
put into an open source HBase branch for further collaboration.  The patch is 
not meant to be complete yet and is a work in progress.  (Please wait on patch 
comments/reviews.)


> New/Improved Filesystem Abstractions
> 
>
> Key: HBASE-14439
> URL: https://issues.apache.org/jira/browse/HBASE-14439
> Project: HBase
>  Issue Type: New Feature
>Reporter: Ben Lau
>Assignee: Matteo Bertozzi
> Attachments: abstraction.patch
>
>
> Ticket for work in progress on new FileSystem abstractions.  Previously, we 
> (Yahoo) submitted a ticket that would add support for humongous (1 million 
> region+) tables via a hierarchical layout (HBASE-13991).  However, open source 
> is moving in a similar but not identical direction, so the patch will not be 
> merged into open source.
> We will be working with Cloudera on a different patch now.  It will 
> create/add to two layers: a path abstraction layer and a use-oriented 
> abstraction layer.  The path abstraction layer is epitomized by classes like 
> FsUtils (and, in the patch, new classes like AFsLayout).  The use-oriented 
> abstraction layer is epitomized by existing classes like 
> MasterFileSystem/HRegionFileSystem (and possibly new classes later) that 
> build on the path abstraction layer and focus on 'doing things' (e.g. 
> creating regions) and less on gritty details like the paths.
> This work on 

[jira] [Updated] (HBASE-14433) Set down the client executor core thread count from 256 to number of processors

2015-09-15 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14433:
--
Attachment: 14433v3.txt

> Set down the client executor core thread count from 256 to number of 
> processors
> ---
>
> Key: HBASE-14433
> URL: https://issues.apache.org/jira/browse/HBASE-14433
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 14433 (1).txt, 14433.txt, 14433v2.txt, 14433v3.txt, 
> 14433v3.txt, 14433v3.txt, 14433v3.txt, 14433v3.txt
>
>
> HBASE-10449 upped our core count from 0 to 256 (max is 256). Looking at a 
> core dump from a recent test run, I see up to 256 threads per client, all of 
> them idle. At a minimum it makes reading test thread dumps hard. Trying to 
> learn more about why we went with a core count of 256 over in HBASE-10449. 
> Meantime, will try setting the configs down for tests.
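
For context, the knobs involved look roughly like this (a sketch of the JDK 
ThreadPoolExecutor settings, not the actual HBase client code):
{code}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizing {
  public static void main(String[] args) {
    // With corePoolSize == 256, the pool keeps up to 256 threads alive even
    // when idle (unless core-thread timeout is on), which is what clutters
    // test thread dumps.
    int cores = Runtime.getRuntime().availableProcessors();
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        cores,             // core threads: scale with the machine, not 256
        256,               // max threads: keep the old ceiling
        60, TimeUnit.SECONDS,
        // SynchronousQueue hands tasks straight to a thread, spawning new
        // ones up to the max when all existing threads are busy.
        new SynchronousQueue<Runnable>());
    pool.allowCoreThreadTimeOut(true); // let idle core threads exit too
    pool.shutdown();
  }
}
{code}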



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12298) Support BB usage in PrefixTree

2015-09-15 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-12298:
---
Attachment: HBASE-12298_6.patch

Updated patch with a test case for the new ByteBuff API.

> Support BB usage in PrefixTree
> --
>
> Key: HBASE-12298
> URL: https://issues.apache.org/jira/browse/HBASE-12298
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-12298.patch, HBASE-12298_1.patch, 
> HBASE-12298_2.patch, HBASE-12298_3.patch, HBASE-12298_4 (1).patch, 
> HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, 
> HBASE-12298_4 (1).patch, HBASE-12298_4.patch, HBASE-12298_4.patch, 
> HBASE-12298_4.patch, HBASE-12298_4.patch, HBASE-12298_5.patch, 
> HBASE-12298_6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12298) Support BB usage in PrefixTree

2015-09-15 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-12298:
---
Status: Patch Available  (was: Open)

> Support BB usage in PrefixTree
> --
>
> Key: HBASE-12298
> URL: https://issues.apache.org/jira/browse/HBASE-12298
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-12298.patch, HBASE-12298_1.patch, 
> HBASE-12298_2.patch, HBASE-12298_3.patch, HBASE-12298_4 (1).patch, 
> HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, 
> HBASE-12298_4 (1).patch, HBASE-12298_4.patch, HBASE-12298_4.patch, 
> HBASE-12298_4.patch, HBASE-12298_4.patch, HBASE-12298_5.patch, 
> HBASE-12298_6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12298) Support BB usage in PrefixTree

2015-09-15 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-12298:
---
Status: Open  (was: Patch Available)

> Support BB usage in PrefixTree
> --
>
> Key: HBASE-12298
> URL: https://issues.apache.org/jira/browse/HBASE-12298
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-12298.patch, HBASE-12298_1.patch, 
> HBASE-12298_2.patch, HBASE-12298_3.patch, HBASE-12298_4 (1).patch, 
> HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, 
> HBASE-12298_4 (1).patch, HBASE-12298_4.patch, HBASE-12298_4.patch, 
> HBASE-12298_4.patch, HBASE-12298_4.patch, HBASE-12298_5.patch, 
> HBASE-12298_6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12298) Support BB usage in PrefixTree

2015-09-15 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744948#comment-14744948
 ] 

Anoop Sam John commented on HBASE-12298:


bq. public abstract void get(byte[] dst, int sourceOffset, int offset, int length);

It would be better to keep the destination-related params together; right now 
the srcOffset comes in between them. So
get(int srcOffset, byte[] dst, int offset, int length) would read better.

Looking at the impl, we do positioning there, then copy, and then set the old 
position back. Why? In SBB it is a direct copy using the exact offset, and no 
positioning is needed at all. In MBB we can also implement it without 
positioning; I believe we can easily do that. Saving the cost of positioning 
twice is why the new API was suggested.
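
A sketch of the suggested shape, with an absolute-offset copy that avoids the 
save-position/copy/restore dance (class names here are illustrative, not the 
real ByteBuff hierarchy):
{code}
import java.nio.ByteBuffer;

abstract class ByteBuffSketch {
  // Suggested order: srcOffset first, then the destination triple together.
  abstract void get(int srcOffset, byte[] dst, int dstOffset, int length);
}

class SingleBuffSketch extends ByteBuffSketch {
  private final ByteBuffer buf;
  SingleBuffSketch(ByteBuffer buf) { this.buf = buf; }

  @Override
  void get(int srcOffset, byte[] dst, int dstOffset, int length) {
    // Absolute reads never touch the buffer's position, so there is no
    // positioning cost at all. (A real impl would bulk-copy for speed.)
    for (int i = 0; i < length; i++) {
      dst[dstOffset + i] = buf.get(srcOffset + i);
    }
  }
}
{code}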

> Support BB usage in PrefixTree
> --
>
> Key: HBASE-12298
> URL: https://issues.apache.org/jira/browse/HBASE-12298
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-12298.patch, HBASE-12298_1.patch, 
> HBASE-12298_2.patch, HBASE-12298_3.patch, HBASE-12298_4 (1).patch, 
> HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, HBASE-12298_4 (1).patch, 
> HBASE-12298_4 (1).patch, HBASE-12298_4.patch, HBASE-12298_4.patch, 
> HBASE-12298_4.patch, HBASE-12298_4.patch, HBASE-12298_5.patch, 
> HBASE-12298_6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14435) thrift tests don't have test-specific hbase-site.xml so 'BindException: Address already in use' because info port is not turned off

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744947#comment-14744947
 ] 

Hudson commented on HBASE-14435:


FAILURE: Integrated in HBase-1.3 #172 (See 
[https://builds.apache.org/job/HBase-1.3/172/])
HBASE-14435 thrift tests don't have test-specific hbase-site.xml so 
'BindException: Address already in use' because info port is not turned off 
(stack: rev 42b37d8c83153e8d8962c15108ada331327f612a)
* hbase-thrift/src/test/resources/hbase-site.xml


> thrift tests don't have test-specific hbase-site.xml so 'BindException: 
> Address already in use' because info port is not turned off
> ---
>
> Key: HBASE-14435
> URL: https://issues.apache.org/jira/browse/HBASE-14435
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14435.txt
>
>
> Running on my test rig:
> {code}
> <testcase name="org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels" 
> classname="org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels" 
> time="0.006">
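
For reference, the fix is a test-specific hbase-site.xml that turns the info 
ports off; assuming HBase's usual *.info.port convention (where -1 disables 
the server), in the key=value shorthand used elsewhere in this thread:
{code}
hbase.master.info.port=-1
hbase.regionserver.info.port=-1
hbase.thrift.info.port=-1
{code}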

[jira] [Commented] (HBASE-14352) Replication is terribly slow with WAL compression

2015-09-15 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744971#comment-14744971
 ] 

ramkrishna.s.vasudevan commented on HBASE-14352:


In your case, do both the source and the peer cluster have WAL compression 
enabled?

> Replication is terribly slow with WAL compression
> -
>
> Key: HBASE-14352
> URL: https://issues.apache.org/jira/browse/HBASE-14352
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.13
>Reporter: Abhishek Singh Chouhan
> Attachments: age_of_last_shipped.png, size_of_log_queue.png
>
>
> For the same load, replication with WAL compression enabled is almost 6x 
> slower than with compression turned off. Age of last shipped operation is 
> also correspondingly much higher when compression is turned on. 
> Observing the size of the log queue, we can see that it takes too long for 
> the queue to clear up.
> Attaching the corresponding graphs.
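
For anyone reproducing this, WAL compression is governed by a single switch 
(key=value shorthand; set it on both clusters being compared):
{code}
hbase.regionserver.wal.enablecompression=true
{code}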



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14428) Upgrade our surefire-plugin from 2.18 to 2.18.1

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744946#comment-14744946
 ] 

Hudson commented on HBASE-14428:


FAILURE: Integrated in HBase-1.3 #172 (See 
[https://builds.apache.org/job/HBase-1.3/172/])
HBASE-14428 Upgrade our surefire-plugin from 2.18 to 2.18.1 (stack: rev 
dedd9d8dd4d61be7e9b99f1eb8c341466ae81849)
* pom.xml


> Upgrade our surefire-plugin from 2.18 to 2.18.1
> ---
>
> Key: HBASE-14428
> URL: https://issues.apache.org/jira/browse/HBASE-14428
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14428.txt
>
>
> Release notes here: 
> https://maven.apache.org/surefire/maven-surefire-plugin/jira-report.html
> I've been seeing NPEs when going to the second phase of running hbase-server 
> tests (SUREFIRE-1121), supposedly fixed in this version.
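
The change itself is just the version bump in pom.xml (a sketch; HBase's pom 
may route this through a version property rather than an inline element):
{code}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>2.18.1</version>
</plugin>
{code}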



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14435) thrift tests don't have test-specific hbase-site.xml so 'BindException: Address already in use' because info port is not turned off

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744998#comment-14744998
 ] 

Hudson commented on HBASE-14435:


SUCCESS: Integrated in HBase-1.2-IT #145 (See 
[https://builds.apache.org/job/HBase-1.2-IT/145/])
HBASE-14435 thrift tests don't have test-specific hbase-site.xml so 
'BindException: Address already in use' because info port is not turned off 
(stack: rev d67f03c49e475e3744cfa780f27ec04bd8c33b70)
* hbase-thrift/src/test/resources/hbase-site.xml


> thrift tests don't have test-specific hbase-site.xml so 'BindException: 
> Address already in use' because info port is not turned off
> ---
>
> Key: HBASE-14435
> URL: https://issues.apache.org/jira/browse/HBASE-14435
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14435.txt
>
>
> Running on my test rig:
> {code}
> <testcase name="org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels" 
> classname="org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels" 
> time="0.006">

[jira] [Commented] (HBASE-14401) Stamp failed appends with sequenceid too.... Cleans up latches

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745025#comment-14745025
 ] 

Hudson commented on HBASE-14401:


FAILURE: Integrated in HBase-1.2 #171 (See 
[https://builds.apache.org/job/HBase-1.2/171/])
HBASE-14401 Stamp failed appends with sequenceid too Cleans up latches 
(stack: rev 481d3f43504114f630d8dc319705f1aecc7a45db)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFailedAppendAndSync.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWALLockup.java


> Stamp failed appends with sequenceid too Cleans up latches
> --
>
> Key: HBASE-14401
> URL: https://issues.apache.org/jira/browse/HBASE-14401
> Project: HBase
>  Issue Type: Sub-task
>  Components: test, wal
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14401.txt, 14401.v7.txt, 14401.v7.txt, 14401.v7.txt, 
> 14401v3.txt, 14401v3.txt, 14401v3.txt, 14401v6.txt
>
>
> Looking in test output I see we can sometimes get stuck waiting on a 
> sequenceid... The parent issue's redo of our semantics means we encounter 
> failed appends more often around a damaged WAL.
> This patch makes it so we stamp the sequenceid always, even if the append 
> fails. This way all sequenceids are accounted for but, more important, the 
> latch on the sequenceid down in WALKey will be cleared... where before it was 
> not being cleared (there is no global list of outstanding WALKeys waiting on 
> sequenceids, so there is no way to clean them up... we don't need such a list 
> if we ALWAYS stamp the sequenceid).
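
A sketch of the idea, using a hypothetical key class with a latch (the real 
FSHLog/WALKey ring-buffer plumbing is far more involved):
{code}
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for WALKey's sequenceid latch.
class KeySketch {
  private final CountDownLatch assigned = new CountDownLatch(1);
  private volatile long seqId = -1;

  void setSeqId(long id) { seqId = id; assigned.countDown(); }
  long getSeqId() throws InterruptedException { assigned.await(); return seqId; }
}

class AppendSketch {
  private long nextSeq = 1;

  void append(KeySketch key, byte[] edit) {
    long id = nextSeq++;
    try {
      writeToWal(id, edit);   // may throw when the WAL is damaged
    } finally {
      key.setSeqId(id);       // stamp even on failure, so any thread blocked
    }                         // in getSeqId() is always released
  }

  private void writeToWal(long id, byte[] edit) { /* elided */ }
}
{code}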



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14435) thrift tests don't have test-specific hbase-site.xml so 'BindException: Address already in use' because info port is not turned off

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745026#comment-14745026
 ] 

Hudson commented on HBASE-14435:


FAILURE: Integrated in HBase-1.2 #171 (See 
[https://builds.apache.org/job/HBase-1.2/171/])
HBASE-14435 thrift tests don't have test-specific hbase-site.xml so 
'BindException: Address already in use' because info port is not turned off 
(stack: rev d67f03c49e475e3744cfa780f27ec04bd8c33b70)
* hbase-thrift/src/test/resources/hbase-site.xml


> thrift tests don't have test-specific hbase-site.xml so 'BindException: 
> Address already in use' because info port is not turned off
> ---
>
> Key: HBASE-14435
> URL: https://issues.apache.org/jira/browse/HBASE-14435
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14435.txt
>
>
> Running on my test rig:
> {code}
> <testcase name="org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels" 
> classname="org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels" 
> time="0.006">

[jira] [Updated] (HBASE-14411) Fix UT failures when using multiwal as default provider

2015-09-15 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-14411:
--
Status: Patch Available  (was: Open)

> Fix UT failures when using multiwal as default provider
> ---
>
> Key: HBASE-14411
> URL: https://issues.apache.org/jira/browse/HBASE-14411
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14411.patch
>
>
> If we set hbase.wal.provider to multiwal in 
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use 
> BoundedRegionGroupingProvider in UTs, we will observe the failures below in 
> the current code base:
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> 
> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 
> expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have 
> more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but 
> was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
> Expected: is 
>  but: was 
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; 
> one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've 
> triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if 
> skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the 
> archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594
>  if skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong 
> number of files in the archive log expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testMovedWALDuringRecovery:810->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testRetryOpenDuringRecovery:838->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
> {noformat}
> While the patch for HBASE-14306 could resolve the failures in 
> TestHLogRecordReader, TestReplicationSourceManager, and 
> TestReplicationWALReaderManager, this JIRA will focus on resolving the others.
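
The setting in question, in key=value shorthand (in the actual test resource 
it is a <property> entry in hbase-site.xml):
{code}
hbase.wal.provider=multiwal
{code}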



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14411) Fix UT failures when using multiwal as default provider

2015-09-15 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-14411:
--
Attachment: HBASE-14411.patch

Uploading the patch to ask HadoopQA to check the UTs.

> Fix UT failures when using multiwal as default provider
> ---
>
> Key: HBASE-14411
> URL: https://issues.apache.org/jira/browse/HBASE-14411
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14411.patch
>
>
> If we set hbase.wal.provider to multiwal in 
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use 
> BoundedRegionGroupingProvider in UTs, we will observe the failures below in 
> the current code base:
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> 
> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 
> expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have 
> more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but 
> was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
> Expected: is 
>  but: was 
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; 
> one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've 
> triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if 
> skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the 
> archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594
>  if skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong 
> number of files in the archive log expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testMovedWALDuringRecovery:810->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testRetryOpenDuringRecovery:838->TestWALSplit.retryOverHdfsProblem:793
>  expected:<11> but was:<12>
> {noformat}
> While the patch for HBASE-14306 could resolve the failures in 
> TestHLogRecordReader, TestReplicationSourceManager, and 
> TestReplicationWALReaderManager, this JIRA will focus on resolving the others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14401) Stamp failed appends with sequenceid too.... Cleans up latches

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744919#comment-14744919
 ] 

Hudson commented on HBASE-14401:


FAILURE: Integrated in HBase-TRUNK #6807 (See 
[https://builds.apache.org/job/HBase-TRUNK/6807/])
HBASE-14401 Stamp failed appends with sequenceid too Cleans up latches 
(stack: rev 72b4c906b806236cd5fcf5a69f12628d00941df9)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFailedAppendAndSync.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWALLockup.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java


> Stamp failed appends with sequenceid too Cleans up latches
> --
>
> Key: HBASE-14401
> URL: https://issues.apache.org/jira/browse/HBASE-14401
> Project: HBase
>  Issue Type: Sub-task
>  Components: test, wal
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14401.txt, 14401.v7.txt, 14401.v7.txt, 14401.v7.txt, 
> 14401v3.txt, 14401v3.txt, 14401v3.txt, 14401v6.txt
>
>
> Looking in test output I see we can sometimes get stuck waiting on a 
> sequenceid... The parent issue's redo of our semantics means we encounter 
> failed appends more often around a damaged WAL.
> This patch makes it so we stamp the sequenceid always, even if the append 
> fails. This way all sequenceids are accounted for but, more important, the 
> latch on the sequenceid down in WALKey will be cleared... where before it was 
> not being cleared (there is no global list of outstanding WALKeys waiting on 
> sequenceids, so there is no way to clean them up... we don't need such a list 
> if we ALWAYS stamp the sequenceid).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13153) enable bulkload to support replication

2015-09-15 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744921#comment-14744921
 ] 

ramkrishna.s.vasudevan commented on HBASE-13153:


Thanks, Matteo, for mentioning the WAL entry for bulk load events. Nice 
write-up, Ashish, with the updated doc. Will wait for the patch. 

> enable bulkload to support replication
> --
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: sunhaitao
>Assignee: Ashish Singhi
> Fix For: 2.0.0
>
> Attachments: HBase Bulk Load Replication-v1-1.pdf, HBase Bulk Load 
> Replication.pdf
>
>
> Currently we plan to use the HBase replication feature to handle a disaster 
> tolerance scenario. But we hit an issue: we use bulkload very frequently, and 
> because bulkload bypasses the write path it does not generate WAL entries, so 
> the data is not replicated to the backup cluster. It is inappropriate to 
> bulkload twice, on both the active and the backup cluster. So I suggest 
> modifying the bulkload feature so that a bulkload reaches both the active 
> cluster and the backup cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14401) Stamp failed appends with sequenceid too.... Cleans up latches

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744996#comment-14744996
 ] 

Hudson commented on HBASE-14401:


SUCCESS: Integrated in HBase-1.2-IT #145 (See 
[https://builds.apache.org/job/HBase-1.2-IT/145/])
HBASE-14401 Stamp failed appends with sequenceid too Cleans up latches 
(stack: rev 481d3f43504114f630d8dc319705f1aecc7a45db)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFailedAppendAndSync.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWALLockup.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java


> Stamp failed appends with sequenceid too Cleans up latches
> --
>
> Key: HBASE-14401
> URL: https://issues.apache.org/jira/browse/HBASE-14401
> Project: HBase
>  Issue Type: Sub-task
>  Components: test, wal
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14401.txt, 14401.v7.txt, 14401.v7.txt, 14401.v7.txt, 
> 14401v3.txt, 14401v3.txt, 14401v3.txt, 14401v6.txt
>
>
> Looking in test output I see we can sometimes get stuck waiting on a 
> sequenceid... The parent issue's redo of our semantics means we encounter 
> failed appends more often around a damaged WAL.
> This patch makes it so we stamp the sequenceid always, even if the append 
> fails. This way all sequenceids are accounted for but, more important, the 
> latch on the sequenceid down in WALKey will be cleared... where before it was 
> not being cleared (there is no global list of outstanding WALKeys waiting on 
> sequenceids, so there is no way to clean them up... we don't need such a list 
> if we ALWAYS stamp the sequenceid).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14428) Upgrade our surefire-plugin from 2.18 to 2.18.1

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744997#comment-14744997
 ] 

Hudson commented on HBASE-14428:


SUCCESS: Integrated in HBase-1.2-IT #145 (See 
[https://builds.apache.org/job/HBase-1.2-IT/145/])
HBASE-14428 Upgrade our surefire-plugin from 2.18 to 2.18.1 (stack: rev 
b738c89637280198ff39e1f49b4d0b17370598f8)
* pom.xml


> Upgrade our surefire-plugin from 2.18 to 2.18.1
> ---
>
> Key: HBASE-14428
> URL: https://issues.apache.org/jira/browse/HBASE-14428
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14428.txt
>
>
> Release notes here: 
> https://maven.apache.org/surefire/maven-surefire-plugin/jira-report.html
> I've been seeing NPEs when going to the second phase of running hbase-server 
> tests (SUREFIRE-1121), supposedly fixed in this version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14435) thrift tests don't have test-specific hbase-site.xml so 'BindException: Address already in use' because info port is not turned off

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745012#comment-14745012
 ] 

Hudson commented on HBASE-14435:


SUCCESS: Integrated in HBase-1.3-IT #154 (See 
[https://builds.apache.org/job/HBase-1.3-IT/154/])
HBASE-14435 thrift tests don't have test-specific hbase-site.xml so 
'BindException: Address already in use' because info port is not turned off 
(stack: rev 42b37d8c83153e8d8962c15108ada331327f612a)
* hbase-thrift/src/test/resources/hbase-site.xml


> thrift tests don't have test-specific hbase-site.xml so 'BindException: 
> Address already in use' because info port is not turned off
> ---
>
> Key: HBASE-14435
> URL: https://issues.apache.org/jira/browse/HBASE-14435
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14435.txt
>
>
> Running on my test rig:
> {code}
> <testcase name="org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels" 
> classname="org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels" 
> time="0.006">

[jira] [Commented] (HBASE-14401) Stamp failed appends with sequenceid too.... Cleans up latches

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745011#comment-14745011
 ] 

Hudson commented on HBASE-14401:


SUCCESS: Integrated in HBase-1.3-IT #154 (See 
[https://builds.apache.org/job/HBase-1.3-IT/154/])
HBASE-14401 Stamp failed appends with sequenceid too Cleans up latches 
(stack: rev 042a63c24d4bad3136202db357c6210a3bd0d6b4)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWALLockup.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFailedAppendAndSync.java


> Stamp failed appends with sequenceid too Cleans up latches
> --
>
> Key: HBASE-14401
> URL: https://issues.apache.org/jira/browse/HBASE-14401
> Project: HBase
>  Issue Type: Sub-task
>  Components: test, wal
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14401.txt, 14401.v7.txt, 14401.v7.txt, 14401.v7.txt, 
> 14401v3.txt, 14401v3.txt, 14401v3.txt, 14401v6.txt
>
>
> Looking in test output I see we can sometimes get stuck waiting on a 
> sequenceid... The parent issue's redo of our semantics means we encounter 
> failed appends more often around a damaged WAL.
> This patch makes it so we stamp the sequenceid always, even if the append 
> fails. This way all sequenceids are accounted for but, more important, the 
> latch on the sequenceid down in WALKey will be cleared... where before it was 
> not being cleared (there is no global list of outstanding WALKeys waiting on 
> sequenceids, so there is no way to clean them up... we don't need such a list 
> if we ALWAYS stamp the sequenceid).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14411) Fix UT failures when using multiwal as default provider

2015-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745031#comment-14745031
 ] 

Hadoop QA commented on HBASE-14411:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12755930/HBASE-14411.patch
  against master branch at commit 72b4c906b806236cd5fcf5a69f12628d00941df9.
  ATTACHMENT ID: 12755930

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:red}-1 findbugs{color}.  The patch appears to cause Findbugs 
(version 2.0.3) to fail.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn post-site goal 
to fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15598//testReport/
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15598//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15598//console

This message is automatically generated.

> Fix UT failures when using multiwal as default provider
> ---
>
> Key: HBASE-14411
> URL: https://issues.apache.org/jira/browse/HBASE-14411
> Project: HBase
>  Issue Type: Bug
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14411.patch
>
>
> If we set hbase.wal.provider to multiwal in 
> hbase-server/src/test/resources/hbase-site.xml, which allows us to use 
> BoundedRegionGroupingProvider in UTs, we will observe the failures below in 
> the current code base (a configuration sketch follows the list):
> {noformat}
> Failed tests:
>   TestHLogRecordReader>TestWALRecordReader.testPartialRead:164 expected:<1> 
> but was:<2>
>   TestHLogRecordReader>TestWALRecordReader.testWALRecordReader:216 
> expected:<2> but was:<3>
>   TestWALRecordReader.testPartialRead:164 expected:<1> but was:<2>
>   TestWALRecordReader.testWALRecordReader:216 expected:<2> but was:<3>
>   TestDistributedLogSplitting.testRecoveredEdits:276 edits dir should have 
> more than a single file in it. instead has 1
>   TestAtomicOperation.testMultiRowMutationMultiThreads:499 expected:<0> but 
> was:<1>
>   TestHRegionServerBulkLoad.testAtomicBulkLoad:307
> Expected: is 
>  but: was 
>   TestLogRolling.testCompactionRecordDoesntBlockRolling:611 Should have WAL; 
> one table is not flushed expected:<1> but was:<0>
>   TestLogRolling.testLogRollOnDatanodeDeath:359 null
>   TestLogRolling.testLogRollOnPipelineRestart:472 Missing datanode should've 
> triggered a log roll
>   TestReplicationSourceManager.testLogRoll:237 expected:<6> but was:<7>
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestReplicationWALReaderManager.test:155 null
>   TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594 if 
> skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong number of files in the 
> archive log expected:<11> but was:<12>
>   TestWALSplit.testMovedWALDuringRecovery:810->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   TestWALSplit.testRetryOpenDuringRecovery:838->retryOverHdfsProblem:793 
> expected:<11> but was:<12>
>   
> TestWALSplitCompressed>TestWALSplit.testCorruptedLogFilesSkipErrorsFalseDoesNotTouchLogs:594
>  if skip.errors is false all files should remain in place expected:<11> but 
> was:<12>
>   TestWALSplitCompressed>TestWALSplit.testLogsGetArchivedAfterSplit:649 wrong 
> number of files in the archive log expected:<11> but was:<12>
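A minimal configuration sketch of the setup described above, done programmatically rather than through the test resources file; only the hbase.wal.provider/multiwal pair comes from the description, the rest is assumed:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MultiWALTestConf {
  public static Configuration create() {
    Configuration conf = HBaseConfiguration.create();
    // Same effect as the hbase-site.xml entry in test resources: route
    // each region server's edits through the multiwal provider so UTs
    // exercise BoundedRegionGroupingProvider.
    conf.set("hbase.wal.provider", "multiwal");
    return conf;
  }
}
{code}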

[jira] [Commented] (HBASE-14380) Correct data gets skipped along with bad data in importTsv bulk load thru TsvImporterTextMapper

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745843#comment-14745843
 ] 

Hudson commented on HBASE-14380:


FAILURE: Integrated in HBase-1.0 #1050 (See 
[https://builds.apache.org/job/HBase-1.0/1050/])
HBASE-14380 Correct data gets skipped along with bad data in importTsv bulk 
load thru TsvImporterTextMapper (Bhupendra Kumar Jain) (tedyu: rev 
f5dd51cea8ee51902d9b502ae7d242bef5a1e43e)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TextSortReducer.java


> Correct data gets skipped along with bad data in importTsv bulk load thru 
> TsvImporterTextMapper
> ---
>
> Key: HBASE-14380
> URL: https://issues.apache.org/jira/browse/HBASE-14380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Bhupendra Kumar Jain
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 0001-HBASE-14380.patch, 14380-v2.txt, 
> HBASE-14380-branch-1.2-v1.patch, HBASE-14380-branch-1.2.patch, 
> HBASE-14380_v1.patch
>
>
> Consider input data as below:
> ROWKEY, TIMESTAMP, Col_Value
> r1,1,v1   >> Correct line
> r1 >> Bad line
> r1,3,v3   >> Correct line
> r1,4,v4   >> Correct line
> When data is bulk loaded using importTsv with TsvImporterTextMapper as the 
> mapper, all the lines are ignored even though skipBadLines is set to true. 
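A minimal sketch of the intended skipBadLines behaviour, with hypothetical names (not the actual TextSortReducer patch): the bad line alone is counted and skipped, and the surrounding correct lines still load.
{code}
import java.util.ArrayList;
import java.util.List;

public class SkipBadLinesSketch {
  static final int EXPECTED_FIELDS = 3; // ROWKEY, TIMESTAMP, Col_Value

  public static List<String[]> parse(List<String> lines, boolean skipBadLines) {
    List<String[]> good = new ArrayList<String[]>();
    int bad = 0;
    for (String line : lines) {
      String[] fields = line.split(",");
      if (fields.length != EXPECTED_FIELDS) { // e.g. the lone "r1" line
        if (!skipBadLines) {
          throw new IllegalArgumentException("Bad line: " + line);
        }
        bad++;            // count and skip ONLY the offending line...
        continue;
      }
      good.add(fields);   // ...so the correct lines for r1 still load
    }
    System.out.println("Skipped " + bad + " bad line(s)");
    return good;
  }
}
{code}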



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14361) ReplicationSink should create Connection instances lazily

2015-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745867#comment-14745867
 ] 

Hudson commented on HBASE-14361:


FAILURE: Integrated in HBase-1.1 #662 (See 
[https://builds.apache.org/job/HBase-1.1/662/])
HBASE-14361 ReplicationSink should create Connection instances lazily (stack: 
rev 7e797d3d47778609dd0f916e4c4f4c11bacc3352)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java


> ReplicationSink should create Connection instances lazily
> -
>
> Key: HBASE-14361
> URL: https://issues.apache.org/jira/browse/HBASE-14361
> Project: HBase
>  Issue Type: Task
>  Components: Replication
>Reporter: Nick Dimiduk
>Assignee: Heng Chen
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14361.v3.txt, 14361.v3.txt, 14361.v3.txt, 14361.v3.txt, 
> HBASE-14361-0.98.patch, HBASE-14361.patch, HBASE-14361_v1.patch, 
> HBASE-14361_v2.patch, hmaster.log
>
>
> Over on HBASE-12911 I have a patch that registers Connection instances with 
> the metrics system. In both standalone server and client applications, I 
> was surprised to see multiple connection objects showing up that are unused. 
> These are pretty heavy objects, including lots of client threads for the 
> batch pool. We should track these down and remove them -- if they're not some 
> kind of phantom artifacts of my WIP patch over there.
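A minimal sketch of the lazy initialization the title asks for, assuming the standard ConnectionFactory API; names other than Connection/ConnectionFactory are hypothetical:
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class LazySinkConnection {
  private final Configuration conf;
  private volatile Connection connection;

  public LazySinkConnection(Configuration conf) {
    this.conf = conf;
  }

  // Double-checked locking: the heavy Connection (client threads, batch
  // pool) is only built the first time the sink actually needs it.
  Connection getConnection() throws IOException {
    Connection c = connection;
    if (c == null) {
      synchronized (this) {
        c = connection;
        if (c == null) {
          connection = c = ConnectionFactory.createConnection(conf);
        }
      }
    }
    return c;
  }
}
{code}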



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

