[jira] [Commented] (HBASE-14082) Add replica id to JMX metrics names

2015-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698572#comment-14698572
 ] 

Hadoop QA commented on HBASE-14082:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12750690/HBASE-14082-v5.patch
  against master branch at commit 737f264509284420e6fa8c14d92fe9fbdb49f67f.
  ATTACHMENT ID: 12750690

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

 {color:red}-1 core zombie tests{color}.  There are 2 zombie test(s):   
at 
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat2.testMRIncrementalLoadWithLocality(TestHFileOutputFormat2.java:399)
at 
org.apache.activemq.broker.RecoveryBrokerTest.testConsumedQueuePersistentMessagesLostOnRestart(RecoveryBrokerTest.java:193)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15120//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15120//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15120//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15120//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15120//console

This message is automatically generated.

 Add replica id to JMX metrics names
 ---

 Key: HBASE-14082
 URL: https://issues.apache.org/jira/browse/HBASE-14082
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Reporter: Lei Chen
Assignee: Lei Chen
 Attachments: HBASE-14082-v1.patch, HBASE-14082-v2.patch, 
 HBASE-14082-v3.patch, HBASE-14082-v4.patch, HBASE-14082-v5.patch


 Today, via JMX, one cannot distinguish a primary region from a replica. A 
 possible solution is to add replica id to JMX metrics names. The benefits may 
 include, for example:
 # Knowing the latency of a read request on a replica region means the first 
 attempt to the primary region has timed out.
 # Write requests on replicas are due to the replication process, while the 
 ones on primary are from clients.
 # In case of looking for hot spots of read operations, replicas should be 
 excluded since TIMELINE reads are sent to all replicas.
 To implement, we can change the format of metrics names found at 
 {code}Hadoop-HBase-RegionServer-Regions-Attributes{code}
 from 
 {code}namespace_namespace_table_tablename_region_regionname_metric_metricname{code}
 to
 {code}namespace_namespace_table_tablename_region_regionname_replicaid_replicaid_metric_metricname{code}
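
 The proposed name layout can be sketched as a small helper. This is only an illustration: the class RegionMetricName and its methods are hypothetical and are not taken from the attached patches; only the underscore-delimited layout comes from the description above. The isPrimary check assumes that replica id 0 denotes the primary region.

```java
// Hypothetical helper, not part of the HBASE-14082 patch.
final class RegionMetricName {
    private RegionMetricName() {}

    // Builds a name in the proposed layout:
    // namespace_<ns>_table_<table>_region_<region>_replicaid_<id>_metric_<metric>
    static String build(String namespace, String table, String region,
                        int replicaId, String metric) {
        return "namespace_" + namespace
            + "_table_" + table
            + "_region_" + region
            + "_replicaid_" + replicaId
            + "_metric_" + metric;
    }

    // Assumption: replica id 0 is the primary; any other id is a replica.
    static boolean isPrimary(int replicaId) {
        return replicaId == 0;
    }
}
```

 With the replica id in the name, a JMX consumer could filter out replica regions when looking for read hot spots, for example by keeping only names built with replica id 0.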



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14190) Assign system tables ahead of user region assignment

2015-08-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698587#comment-14698587
 ] 

Ted Yu commented on HBASE-14190:


The original intention of this JIRA didn't go as far as giving system tables 
their own WALs.
HBASE-13556 can be kept open, in my opinion.

 Assign system tables ahead of user region assignment
 

 Key: HBASE-14190
 URL: https://issues.apache.org/jira/browse/HBASE-14190
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Critical
 Attachments: 14190-v12.txt, 14190-v6.txt, 14190-v7.txt, 14190-v8.txt


 Currently the namespace table region is assigned like user regions.
 I spent several hours working with a customer where master couldn't finish 
 initialization.
 Even though master was restarted quite a few times, it went down with the 
 following:
 {code}
 2015-08-05 17:16:57,530 FATAL [hdpmaster1:6.activeMasterManager] 
 master.HMaster: Master server abort: loaded coprocessors are: []
 2015-08-05 17:16:57,530 FATAL [hdpmaster1:6.activeMasterManager] 
 master.HMaster: Unhandled exception. Starting shutdown.
 java.io.IOException: Timedout 30ms waiting for namespace table to be 
 assigned
   at 
 org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104)
   at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:985)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:779)
   at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
   at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
   at java.lang.Thread.run(Thread.java:744)
 {code}
 During previous run(s), namespace table was created, hence leaving an entry 
 in hbase:meta.
 The following if block in TableNamespaceManager#start() was skipped:
 {code}
 if (!MetaTableAccessor.tableExists(masterServices.getConnection(),
   TableName.NAMESPACE_TABLE_NAME)) {
 {code}
 TableNamespaceManager#start() spins, waiting for namespace region to be 
 assigned.
 There was an issue in master assigning user regions.
 We tried issuing 'assign' command from hbase shell which didn't work because 
 of the following check in MasterRpcServices#assignRegion():
 {code}
   master.checkInitialized();
 {code}
 This scenario can be avoided if we assign hbase:namespace table after 
 hbase:meta is assigned but before user table region assignment.
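
 The ordering proposed above can be sketched as follows. This is a minimal illustration, not the attached patch: the class AssignmentOrder and its methods are hypothetical, and treating every table whose name starts with "hbase:" as a system table is an assumption made for the example. Only the table names hbase:meta and hbase:namespace and the desired order come from the description.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of "meta first, then system tables, then user tables".
final class AssignmentOrder {
    private AssignmentOrder() {}

    // Lower rank is assigned earlier.
    static int rank(String tableName) {
        if (tableName.equals("hbase:meta")) {
            return 0; // meta must be available before anything else
        }
        if (tableName.startsWith("hbase:")) {
            return 1; // assumption: other system tables, e.g. hbase:namespace
        }
        return 2;     // user tables
    }

    // Returns a new list sorted by rank. List.sort is stable, so tables with
    // the same rank keep their original relative order.
    static List<String> order(List<String> tables) {
        List<String> sorted = new ArrayList<>(tables);
        sorted.sort(Comparator.comparingInt(AssignmentOrder::rank));
        return sorted;
    }
}
```

 With this order, a problem assigning user regions would no longer keep the namespace region from being assigned, so TableNamespaceManager#start() would not spin on it.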





[jira] [Commented] (HBASE-13127) Add timeouts on all tests so less zombie sightings

2015-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698559#comment-14698559
 ] 

Hadoop QA commented on HBASE-13127:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12750688/13127.alternate.v3.txt
  against master branch at commit 737f264509284420e6fa8c14d92fe9fbdb49f67f.
  ATTACHMENT ID: 12750688

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.security.visibility.TestDefaultScanLabelGeneratorStack

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15119//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15119//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15119//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15119//console

This message is automatically generated.

 Add timeouts on all tests so less zombie sightings
 --

 Key: HBASE-13127
 URL: https://issues.apache.org/jira/browse/HBASE-13127
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: stack
Assignee: stack
 Attachments: 13127.alternate.txt, 13127.alternate.txt, 
 13127.alternate.txt, 13127.alternate.txt, 13127.alternate.v2.txt, 
 13127.alternate.v3.txt, 13127.alternate.v3.txt, 13127.txt, 13127v2.txt


 [~Apache9] and [~octo47] have been working hard at trying to get our builds 
 passing again. They are almost there. TRUNK just failed with a zombie 
 TestMasterObserver. Help the lads out by adding timeouts on all tests so less 
 zombie incidence... will help identify the frequent failing issues.
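
 In JUnit 4 a per-test limit is usually expressed with @Test(timeout = ...) or a Timeout rule. The standard-library sketch below shows the underlying idea only: run the work on another thread and give up after a deadline. The class DeadlineRunner is hypothetical and is not taken from the attached patches.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper illustrating a deadline around a possibly hanging task.
final class DeadlineRunner {
    private DeadlineRunner() {}

    // Returns true if the task completed within the deadline. On timeout the
    // task is interrupted and false is returned instead of waiting forever.
    static boolean finishesWithin(Callable<?> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "deadline-runner");
            t.setDaemon(true); // a stuck task must not keep the JVM alive
            return t;
        });
        Future<?> future = pool.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the hung task
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        } catch (ExecutionException e) {
            // The task itself failed; surface that as an ordinary failure.
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

 A test that exceeds its deadline then shows up as a plain failure with a name attached, rather than as a zombie process.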





[jira] [Commented] (HBASE-12812) Update Netty dependency to latest release

2015-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698579#comment-14698579
 ] 

Hudson commented on HBASE-12812:


SUCCESS: Integrated in HBase-TRUNK #6731 (See 
[https://builds.apache.org/job/HBase-TRUNK/6731/])
HBASE-12812 Update Netty dependency to latest release (Jurriaan Mous) (stack: 
rev 737f264509284420e6fa8c14d92fe9fbdb49f67f)
* pom.xml


 Update Netty dependency to latest release
 -

 Key: HBASE-12812
 URL: https://issues.apache.org/jira/browse/HBASE-12812
 Project: HBase
  Issue Type: Improvement
Reporter: Jurriaan Mous
Assignee: Jurriaan Mous
 Fix For: 2.0.0

 Attachments: 12812v2.txt, HBASE-12812.patch


 Netty version was 4.0.23.Release of August 15th.
 Let's update to 4.0.25, which contains some performance improvements and bug 
 fixes.
 http://netty.io/news/2014/10/29/4-0-24-Final.html
 http://netty.io/news/2014/12/31/4-0-25-Final.html





[jira] [Commented] (HBASE-13838) Fix shared TaskStatusTmpl.jamon issues (coloring, content, etc.)

2015-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698693#comment-14698693
 ] 

Hadoop QA commented on HBASE-13838:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12750704/hbase-13838_post_put_command.txt
  against master branch at commit 737f264509284420e6fa8c14d92fe9fbdb49f67f.
  ATTACHMENT ID: 12750704

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15121//console

This message is automatically generated.

 Fix shared TaskStatusTmpl.jamon issues (coloring, content, etc.)
 

 Key: HBASE-13838
 URL: https://issues.apache.org/jira/browse/HBASE-13838
 Project: HBase
  Issue Type: Bug
  Components: UI
Affects Versions: 1.1.0
Reporter: Lars George
Assignee: Matt Warhaftig
  Labels: beginner
 Fix For: 2.0.0, 1.3.0

 Attachments: hbase-13838-v1.patch, hbase-13838_post.tiff, 
 hbase-13838_post_put_command.txt, hbase-13838_pre.tiff


 There are a few issues with the shared TaskStatusTmpl:
 - Client operations tab is always empty 
 For Master this is expected, but for RegionServers there is never anything 
 listed either. Fix for RS status page (probably caused by params not 
 containing Operation subclass anymore, but some PB generated classes?)
 - Hide “Client Operations” tab for master UI
 Since operations are RS only. Or we fix this and make other calls show here.
 - The alert-error for aborted tasks is not set in CSS at all
 When a task was aborted it should be amber or red, but the assigned style is 
 not in any of the linked stylesheets (abort-error). Add.





[jira] [Updated] (HBASE-13838) Fix shared TaskStatusTmpl.jamon issues (coloring, content, etc.)

2015-08-16 Thread Matt Warhaftig (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Warhaftig updated HBASE-13838:
---
Attachment: hbase-13838_post_put_command.txt
hbase-13838_post.tiff
hbase-13838_pre.tiff

Attached 'hbase-13838_pre.tiff' and 'hbase-13838_post.tiff' show that the master UI 
page no longer has an Operations tab and that alerts are now properly colored via 
Bootstrap.

Attached 'hbase-13838_post_put_command.txt' shows that the Operations JSON 
lists actual data for a PUT (and any RPC operation).  This displayed data was 
the open question about security concerns.






[jira] [Commented] (HBASE-14078) improve error message when HMaster can't bind to port

2015-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698737#comment-14698737
 ] 

Hadoop QA commented on HBASE-14078:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12750708/hbase-14078_post_stack.txt
  against master branch at commit 737f264509284420e6fa8c14d92fe9fbdb49f67f.
  ATTACHMENT ID: 12750708

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15122//console

This message is automatically generated.

 improve error message when HMaster can't bind to port
 -

 Key: HBASE-14078
 URL: https://issues.apache.org/jira/browse/HBASE-14078
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 2.0.0
Reporter: Sean Busbey
Assignee: Matt Warhaftig
  Labels: beginner
 Fix For: 2.0.0

 Attachments: hbase-14078_post_stack.txt, hbase-14708-v1.patch, 
 hbase-14708-v2.patch


 When the master fails to start because hbase.master.port is already taken, 
 the log messages could make it easier to tell.
 {quote}
 2015-07-14 13:10:02,667 INFO  [main] regionserver.RSRpcServices: 
 master/master01.example.com/10.20.188.121:16000 server-side HConnection 
 retries=350
 2015-07-14 13:10:02,879 INFO  [main] ipc.SimpleRpcScheduler: Using deadline 
 as user call queue, count=3
 2015-07-14 13:10:02,895 ERROR [main] master.HMasterCommandLine: Master exiting
 java.lang.RuntimeException: Failed construction of Master: class 
 org.apache.hadoop.hbase.master.HMaster
 at 
 org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2258)
 at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:234)
 at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
 at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2272)
 Caused by: java.net.BindException: Address already in use
 at sun.nio.ch.Net.bind0(Native Method)
 at sun.nio.ch.Net.bind(Net.java:444)
 at sun.nio.ch.Net.bind(Net.java:436)
 at 
 sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
 at org.apache.hadoop.hbase.ipc.RpcServer.bind(RpcServer.java:2513)
 at 
 org.apache.hadoop.hbase.ipc.RpcServer$Listener.init(RpcServer.java:599)
 at org.apache.hadoop.hbase.ipc.RpcServer.init(RpcServer.java:2000)
 at 
 org.apache.hadoop.hbase.regionserver.RSRpcServices.init(RSRpcServices.java:919)
 at 
 org.apache.hadoop.hbase.master.MasterRpcServices.init(MasterRpcServices.java:211)
 at 
 org.apache.hadoop.hbase.master.HMaster.createRpcServices(HMaster.java:509)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:535)
 at org.apache.hadoop.hbase.master.HMaster.init(HMaster.java:351)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
 at 
 org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2253)
 ... 5 more
 {quote}
 I recognize that the RSRpcServices log message shows port 16000, but I 
 don't know why a new operator would. Additionally, it'd be nice to tell them 
 that the port is controlled by {{hbase.master.port}}. Maybe give a hint on 
 how to see what's using the port. Could be too os-dist specific?
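
 One way to produce a clearer message is to decorate the exception before it is logged. The sketch below is an assumption-laden illustration, not the attached patch: the class PortHint and the wording of the hint are hypothetical. Only the exception type java.net.BindException and the key hbase.master.port come from the report above.

```java
import java.io.IOException;
import java.net.BindException;

// Hypothetical helper that adds a configuration hint to bind failures.
final class PortHint {
    private PortHint() {}

    // Returns an exception whose message names the port and the configuration
    // key that controls it. Only BindException is decorated; any other
    // IOException is returned unchanged.
    static IOException withHint(IOException e, int port, String configKey) {
        if (!(e instanceof BindException)) {
            return e;
        }
        BindException hinted = new BindException(
            e.getMessage() + ": could not bind to port " + port
            + "; the port is configured by '" + configKey + "'");
        hinted.initCause(e); // keep the original stack trace reachable
        return hinted;
    }
}
```

 For the failure quoted above, withHint(e, 16000, "hbase.master.port") would yield a message that states both the port and the setting to change.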





[jira] [Updated] (HBASE-13127) Add timeouts on all tests so less zombie sightings

2015-08-16 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-13127:
--
Attachment: 13127.alternate.v3.txt

Cryptic 'shutting down' message with hung threads in failed 
TestDefaultScanLabelGeneratorStack. 

 Add timeouts on all tests so less zombie sightings
 --

 Key: HBASE-13127
 URL: https://issues.apache.org/jira/browse/HBASE-13127
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: stack
Assignee: stack
 Attachments: 13127.alternate.txt, 13127.alternate.txt, 
 13127.alternate.txt, 13127.alternate.txt, 13127.alternate.v2.txt, 
 13127.alternate.v3.txt, 13127.alternate.v3.txt, 13127.alternate.v3.txt, 
 13127.txt, 13127v2.txt


 [~Apache9] and [~octo47] have been working hard at trying to get our builds 
 passing again. They are almost there. TRUNK just failed with a zombie 
 TestMasterObserver. Help the lads out by adding timeouts on all tests so less 
 zombie incidence... will help identify the frequent failing issues.





[jira] [Commented] (HBASE-14224) Fix coprocessor handling of duplicate classes

2015-08-16 Thread Lars George (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698757#comment-14698757
 ] 

Lars George commented on HBASE-14224:
-

Yes, and I hope I have described that completely in the attached PDF (or the 
linked note). If not, please add here.

 Fix coprocessor handling of duplicate classes
 -

 Key: HBASE-14224
 URL: https://issues.apache.org/jira/browse/HBASE-14224
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 2.0.0, 1.0.1, 1.2.0, 1.1.1
Reporter: Lars George
Priority: Critical
 Attachments: problem.pdf


 While discussing with [~misty] over on HBASE-13907 we noticed some 
 inconsistency when copros are loaded. Sometimes you can load them more than 
 once, sometimes you can not. Need to consolidate.





[jira] [Updated] (HBASE-14078) improve error message when HMaster can't bind to port

2015-08-16 Thread Matt Warhaftig (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Warhaftig updated HBASE-14078:
---
Attachment: hbase-14078_post_stack.txt

Thanks for the feedback [~stack].  Below are responses inline.

{quote}
Formatting is a little off. Please check.
{quote}
Can you point me towards the misformatted code? - I didn't see it when looking 
at the diff.

{quote}
Do you have example of what the emission looks like now?
{quote}
See attached 'hbase-14078_post_stack.txt'.

{quote}
Whatever the IOE exception that comes up out of setting up the rpc server, we 
will always have this suffix about how to config port. Will it always be a port 
issue? Perhaps work on BindException only? And only if 'Address in use'?
{quote}
Good point.  I will tighten the port-issue error logic when making the 
formatting change mentioned earlier.

{quote}
The changes in HMaster.java do not seem to make for the same emission. Is that 
intentional? For example, before patch, if an Exception, we used to 
e.getCause().getMessage() if non-null but now if I read it right, we do 
e.toString
{quote}
Yes, the change was intentional, because errors thrown by HMaster already include 
useful messages that were previously ignored when only e.getCause() was 
displayed.  The e.getCause() message is still displayed after this change, just 
one level down the error stack now.
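
The difference between the two emissions can be seen in a small stand-alone sketch. The class EmissionDemo is hypothetical and does not reproduce HMaster.java; in particular, the fallback used when the cause has no message is an assumption.

```java
// Hypothetical demo of the two ways of rendering a wrapped exception.
final class EmissionDemo {
    private EmissionDemo() {}

    // Earlier style as described in the review: show the cause's message when
    // there is one. Falling back to the outer message is an assumption.
    static String causeMessageOnly(Throwable e) {
        Throwable cause = e.getCause();
        if (cause != null && cause.getMessage() != null) {
            return cause.getMessage();
        }
        return e.getMessage();
    }

    // Later style: render the outer exception itself, which yields
    // "<class name>: <message>" and so keeps the outer message visible.
    static String outer(Throwable e) {
        return e.toString();
    }
}
```

For a RuntimeException("Failed construction of Master") caused by a BindException("Address already in use"), the first method returns only the cause's message, while the second keeps the outer message and leaves the cause to the stack trace.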


[jira] [Commented] (HBASE-14224) Fix coprocessor handling of duplicate classes

2015-08-16 Thread Lars George (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698755#comment-14698755
 ] 

Lars George commented on HBASE-14224:
-

Whoops, here is the online version, if that helps: 
https://www.evernote.com/l/ACFO6OrjlNNHeZDPxhubGw8uDUSwAaOgxQU






[jira] [Updated] (HBASE-13127) Add timeouts on all tests so less zombie sightings

2015-08-16 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-13127:
--
Attachment: 13127.alternate.v4.txt

 Add timeouts on all tests so less zombie sightings
 --

 Key: HBASE-13127
 URL: https://issues.apache.org/jira/browse/HBASE-13127
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: stack
Assignee: stack
 Attachments: 13127.alternate.txt, 13127.alternate.txt, 
 13127.alternate.txt, 13127.alternate.txt, 13127.alternate.v2.txt, 
 13127.alternate.v3.txt, 13127.alternate.v3.txt, 13127.alternate.v3.txt, 
 13127.alternate.v3.txt, 13127.alternate.v4.txt, 13127.txt, 13127v2.txt


 [~Apache9] and [~octo47] have been working hard at trying to get our builds 
 passing again. They are almost there. TRUNK just failed with a zombie 
 TestMasterObserver. Help the lads out by adding timeouts on all tests so less 
 zombie incidence... will help identify the frequent failing issues.





[jira] [Commented] (HBASE-13127) Add timeouts on all tests so less zombie sightings

2015-08-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698887#comment-14698887
 ] 

stack commented on HBASE-13127:
---

Says:

kalashnikov:hbase.git stack$ python ./dev-support/findHangingTests.py  
https://builds.apache.org/job/PreCommit-HBASE-Build/15123/consoleText
Fetching the console output from the URL
Printing hanging tests
Printing Failing tests
Failing test : org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Results :

Failed tests: 
  TestDistributedLogSplitting.testLogReplayTwoSequentialRSDown:653 
expected:1000 but was:896

So, this test looks like it can also report as a zombie.


Tests run: 18, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 263.523 sec 
 FAILURE! - in org.apache.hadoop.hbase.master.TestDistributedLogSplitting
testLogReplayTwoSequentialRSDown(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)
  Time elapsed: 40.392 sec   FAILURE!
java.lang.AssertionError: expected:1000 but was:896
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hbase.master.TestDistributedLogSplitting.testLogReplayTwoSequentialRSDown(TestDistributedLogSplitting.java:653)
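
A report like the one above could be derived by pairing each started test class with its result line. The sketch below is not the logic of findHangingTests.py; it is a hypothetical illustration that assumes the console contains "Running <class>" lines when a test class starts and "Tests run: ... - in <class>" lines when it finishes.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical scanner: a class that started but never reported is suspect.
final class HangingTests {
    private HangingTests() {}

    // Returns the test classes that have a "Running" line but no matching
    // "Tests run: ... - in <class>" line, in the order they started.
    static Set<String> find(List<String> consoleLines) {
        Set<String> started = new LinkedHashSet<>();
        for (String raw : consoleLines) {
            String line = raw.trim();
            if (line.startsWith("Running ")) {
                started.add(line.substring("Running ".length()).trim());
            } else if (line.startsWith("Tests run:")) {
                int idx = line.lastIndexOf(" - in ");
                if (idx >= 0) {
                    started.remove(line.substring(idx + " - in ".length()).trim());
                }
            }
        }
        return started;
    }
}
```

A class such as TestDistributedLogSplitting, whose result line is present, would be dropped from the set; only classes without a result line remain as hanging candidates.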



 Add timeouts on all tests so less zombie sightings
 --

 Key: HBASE-13127
 URL: https://issues.apache.org/jira/browse/HBASE-13127
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: stack
Assignee: stack
 Attachments: 13127.alternate.txt, 13127.alternate.txt, 
 13127.alternate.txt, 13127.alternate.txt, 13127.alternate.v2.txt, 
 13127.alternate.v3.txt, 13127.alternate.v3.txt, 13127.alternate.v3.txt, 
 13127.txt, 13127v2.txt


 [~Apache9] and [~octo47] have been working hard at trying to get our builds 
 passing again. They are almost there. TRUNK just failed with a zombie 
 TestMasterObserver. Help the lads out by adding timeouts on all tests so less 
 zombie incidence... will help identify the frequent failing issues.





[jira] [Updated] (HBASE-13127) Add timeouts on all tests so less zombie sightings

2015-08-16 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-13127:
--
Attachment: 13127.alternate.v3.txt

Cleanup of TDLS (TestDistributedLogSplitting). Make it so it should not be a 
zombie anymore (we were not shutting down zk clusters in a few places). I doubt 
it will fix the flaky test, but hopefully it is no longer a zombie. (This change 
is unrelated to the timeouts, but it is test fixing.)

 Add timeouts on all tests so less zombie sightings
 --

 Key: HBASE-13127
 URL: https://issues.apache.org/jira/browse/HBASE-13127
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: stack
Assignee: stack
 Attachments: 13127.alternate.txt, 13127.alternate.txt, 
 13127.alternate.txt, 13127.alternate.txt, 13127.alternate.v2.txt, 
 13127.alternate.v3.txt, 13127.alternate.v3.txt, 13127.alternate.v3.txt, 
 13127.alternate.v3.txt, 13127.txt, 13127v2.txt







[jira] [Commented] (HBASE-13127) Add timeouts on all tests so less zombie sightings

2015-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698901#comment-14698901
 ] 

Hadoop QA commented on HBASE-13127:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12750729/13127.alternate.v4.txt
  against master branch at commit 737f264509284420e6fa8c14d92fe9fbdb49f67f.
  ATTACHMENT ID: 12750729

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+WALSplitter.getRegionDirRecoveredEditsDir(HRegion.getRegionDir(tdir, hri.getEncodedName()));
+assertTrue("edits dir should have more than a single file in it. instead has " + files.length,
+WALSplitter.getRegionDirRecoveredEditsDir(HRegion.getRegionDir(tdir, hri.getEncodedName()));
+WALSplitter.getRegionDirRecoveredEditsDir(HRegion.getRegionDir(tdir, hri.getEncodedName()));
+new HLogKey(curRegionInfo.getEncodedNameAsBytes(), tableName, System.currentTimeMillis()),
+  NavigableSet<Path> recoveredEdits = WALSplitter.getSplitEditFilesSorted(fs, regionDirs.get(0));

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestTimeout

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15124//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15124//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15124//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15124//console

This message is automatically generated.

 Add timeouts on all tests so less zombie sightings
 --

 Key: HBASE-13127
 URL: https://issues.apache.org/jira/browse/HBASE-13127
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: stack
Assignee: stack
 Attachments: 13127.alternate.txt, 13127.alternate.txt, 
 13127.alternate.txt, 13127.alternate.txt, 13127.alternate.v2.txt, 
 13127.alternate.v3.txt, 13127.alternate.v3.txt, 13127.alternate.v3.txt, 
 13127.alternate.v3.txt, 13127.alternate.v4.txt, 13127.txt, 13127v2.txt







[jira] [Commented] (HBASE-13158) When client supports CellBlock, return the result Cells as controller payload for get(Get) API also

2015-08-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699002#comment-14699002
 ] 

Anoop Sam John commented on HBASE-13158:


I saw a performance degradation of about 2 to 3% after this change.
Debugging it led to some other perf optimizations as well, and I have not 
tested perf since those went in. Will do it ASAP, Stack, and report back. One 
difference: the PB way of cell transfer encodes the Cell bytes into a 
stream of predetermined size, whereas we use a resizable OS. Maybe that 
matters. Will report back more on this in two days' time, Stack.
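
As a rough illustration of the presized versus resizable difference mentioned above, the sketch below counts how often a growable byte sink has to reallocate while the same cells fit a presized ByteBuffer exactly. It is a toy model under stated assumptions, not the HBase RPC code: GrowableSink, its doubling policy, and the sizes used are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only; not the HBase cell block encoder.
public class BufferSizingSketch {

  /** Growable byte sink that counts how often its backing array is reallocated. */
  static final class GrowableSink {
    private byte[] buf;
    private int count;
    int reallocations;

    GrowableSink(int initialCapacity) {
      buf = new byte[initialCapacity];
    }

    void write(byte[] src) {
      if (count + src.length > buf.length) {
        // Assumed policy: double the capacity, or grow to fit if that is larger.
        int newCapacity = Math.max(buf.length * 2, count + src.length);
        buf = Arrays.copyOf(buf, newCapacity); // copies everything written so far
        reallocations++;
      }
      System.arraycopy(src, 0, buf, count, src.length);
      count += src.length;
    }
  }

  /** Number of reallocations needed to write all cells into a growable sink. */
  public static int reallocationsWhenGrowing(byte[][] cells, int initialCapacity) {
    GrowableSink sink = new GrowableSink(initialCapacity);
    for (byte[] cell : cells) {
      sink.write(cell);
    }
    return sink.reallocations;
  }

  /** Writes all cells into a buffer sized up front; returns the unused bytes. */
  public static int unusedBytesWhenPresized(byte[][] cells) {
    int total = 0;
    for (byte[] cell : cells) {
      total += cell.length;
    }
    ByteBuffer presized = ByteBuffer.allocate(total); // single allocation, no copying
    for (byte[] cell : cells) {
      presized.put(cell);
    }
    return presized.remaining();
  }

  public static void main(String[] args) {
    byte[][] cells = new byte[10][100];
    System.out.println("reallocations when growing: " + reallocationsWhenGrowing(cells, 32));
    System.out.println("unused bytes when presized: " + unusedBytesWhenPresized(cells));
  }
}
```

Whether this copying accounts for the observed 2 to 3% is exactly what the follow-up measurement would have to show.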

 When client supports CellBlock, return the result Cells as controller payload 
 for get(Get) API also
 ---

 Key: HBASE-13158
 URL: https://issues.apache.org/jira/browse/HBASE-13158
 Project: HBase
  Issue Type: Improvement
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 2.0.0, 1.3.0

 Attachments: 13158v4.suggestion.txt, HBASE-13158.patch, 
 HBASE-13158_V2.patch, HBASE-13158_V3.patch, HBASE-13158_V4.patch








[jira] [Commented] (HBASE-13127) Add timeouts on all tests so less zombie sightings

2015-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14698837#comment-14698837
 ] 

Hadoop QA commented on HBASE-13127:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12750714/13127.alternate.v3.txt
  against master branch at commit 737f264509284420e6fa8c14d92fe9fbdb49f67f.
  ATTACHMENT ID: 12750714

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15123//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15123//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15123//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15123//console

This message is automatically generated.

 Add timeouts on all tests so less zombie sightings
 --

 Key: HBASE-13127
 URL: https://issues.apache.org/jira/browse/HBASE-13127
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: stack
Assignee: stack
 Attachments: 13127.alternate.txt, 13127.alternate.txt, 
 13127.alternate.txt, 13127.alternate.txt, 13127.alternate.v2.txt, 
 13127.alternate.v3.txt, 13127.alternate.v3.txt, 13127.alternate.v3.txt, 
 13127.txt, 13127v2.txt







[jira] [Created] (HBASE-14229) Flushing canceled by coprocessor still leads to memstoreSize set down

2015-08-16 Thread sunyerui (JIRA)
sunyerui created HBASE-14229:


 Summary: Flushing canceled by coprocessor still leads to 
memstoreSize set down
 Key: HBASE-14229
 URL: https://issues.apache.org/jira/browse/HBASE-14229
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 1.1.1, 0.98.13, 1.0.2, 1.2.0
Reporter: sunyerui


A coprocessor that overrides public InternalScanner preFlush(final Store store, 
final InternalScanner scanner) and returns null from this method will 
cancel the flush request, leaving the snapshot un-flushed and no new storefile 
created. But HRegion.internalFlushCache still sets memstoreSize down to 0 by 
decrementing it by totalFlushableSize. 
If there are no more write requests, memstoreSize will remain 0, and 
no more flush requests will be processed because of the 
memstoreSize.get() <= 0 check at the beginning of internalFlushCache.
This issue may not cause data loss, but it will confuse coprocessor users. If 
we agree on this, I'll supply a patch later.
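
The accounting problem described above can be reproduced in a small toy model. This is a sketch under stated assumptions, not HBase code: ToyRegion, its fields, and the UnaryOperator hook are hypothetical stand-ins for HRegion, the memstore snapshot, and the coprocessor preFlush hook.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; not the HBase HRegion class.
public class FlushAccountingSketch {

  public static final class ToyRegion {
    private final AtomicLong memstoreSize = new AtomicLong();
    private long pendingSnapshotBytes; // snapshotted but never written to a store file
    private int storeFiles;
    private final UnaryOperator<String> preFlushHook; // returning null cancels the flush

    public ToyRegion(UnaryOperator<String> preFlushHook) {
      this.preFlushHook = preFlushHook;
    }

    public void write(long bytes) {
      memstoreSize.addAndGet(bytes);
    }

    /** Models the reported behaviour; returns true only if a store file was written. */
    public boolean flush() {
      if (memstoreSize.get() <= 0) {
        return false; // the counter says there is nothing to flush
      }
      long totalFlushableSize = memstoreSize.get();
      pendingSnapshotBytes += totalFlushableSize; // take the snapshot
      boolean written = preFlushHook.apply("scanner") != null;
      if (written) {
        storeFiles++;
        pendingSnapshotBytes = 0;
      }
      // Reported problem: the counter is decremented on the cancel path too.
      memstoreSize.addAndGet(-totalFlushableSize);
      return written;
    }

    public long memstoreSize() { return memstoreSize.get(); }
    public long pendingSnapshotBytes() { return pendingSnapshotBytes; }
    public int storeFiles() { return storeFiles; }
  }

  public static void main(String[] args) {
    ToyRegion region = new ToyRegion(scanner -> null); // hook always cancels
    region.write(100);
    System.out.println("first flush wrote a file: " + region.flush());
    System.out.println("memstoreSize: " + region.memstoreSize());
    System.out.println("pending snapshot bytes: " + region.pendingSnapshotBytes());
    System.out.println("second flush wrote a file: " + region.flush());
  }
}
```

In this model one possible fix would be to decrement the counter only when a store file was actually written; whether that matches the eventual patch is not stated in this thread.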


