[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.patch

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5040) Secure HBase builds fail

2011-12-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171806#comment-13171806
 ] 

Hudson commented on HBASE-5040:
---

Integrated in HBase-0.92-security #41 (See 
[https://builds.apache.org/job/HBase-0.92-security/41/])
HBASE-5040 Secure HBase builds fail
HBASE-5040 Secure HBase builds fail

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt

stack : 
Files : 
* /hbase/branches/0.92/pom.xml


> Secure HBase builds fail
> 
>
> Key: HBASE-5040
> URL: https://issues.apache.org/jira/browse/HBASE-5040
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Zhihong Yu
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: 5040-v2.txt, 5040.txt
>
>
> I saw the following in HBase-0.92-security build #39:
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> :[590,4]
>  method does not override or implement a method from a supertype
> [ERROR] -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> :[590,4]
>  method does not override or implement a method from a supertype
> {code}
> The above was probably introduced by HBASE-5006





[jira] [Commented] (HBASE-4934) Display Master server and Regionserver start time on respective info servers.

2011-12-18 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171807#comment-13171807
 ] 

Hudson commented on HBASE-4934:
---

Integrated in HBase-0.92-security #41 (See 
[https://builds.apache.org/job/HBase-0.92-security/41/])
HBASE-4934 Display Master server and Regionserver start time on respective 
info servers

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/jamon/org/apache/hbase/tmpl/master/MasterStatusTmpl.jamon
* 
/hbase/branches/0.92/src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> Display Master server and Regionserver start time on respective info servers.
> -
>
> Key: HBASE-4934
> URL: https://issues.apache.org/jira/browse/HBASE-4934
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: hbase-4934.patch, hmaster.png, hregion.png
>
>
> With operations like rolling restart or master failovers, it is difficult to 
> tell if a server is the "old" instance or the "new" restarted instance.  
> Adding a start date stamp on the info web pages would be helpful for 
> determining this.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171810#comment-13171810
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507827/5064.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -152 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestMasterFailover
  org.apache.hadoop.hbase.util.TestMergeTool
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan
  org.apache.hadoop.hbase.client.TestInstantSchemaChange
  org.apache.hadoop.hbase.replication.TestMasterReplication
  org.apache.hadoop.hbase.TestHBaseTestingUtility
  org.apache.hadoop.hbase.util.TestRegionSplitter

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/536//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/536//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/536//console

This message is automatically generated.

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171811#comment-13171811
 ] 

nkeywal commented on HBASE-5064:


"java.lang.OutOfMemoryError: unable to create new native thread": hadoop-qa 
can't execute 3 tests in parallel.
Let's try with 2.

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171815#comment-13171815
 ] 

nkeywal commented on HBASE-5064:


Hmm, it was already set to 2. Let's retry, but hadoop-qa is not exactly 
over-provisioned...

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.patch

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171844#comment-13171844
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507831/5064.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -152 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/537//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/537//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/537//console

This message is automatically generated.

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171847#comment-13171847
 ] 

nkeywal commented on HBASE-5064:


This time it worked: I see the errors above on another patch as well.

The log says: Total time: 1:13:18.269s
To be compared with prebuild #535: Total time: 1:49:04.397s
=> about 36 minutes faster.

However, we have some tests without results: Tests run: 771, Failures: 3, 
Errors: 1, Skipped: 9
To be compared with prebuild #535: Tests run: 785, Failures: 0, Errors: 3, 
Skipped: 9
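
The comparison between the two runs can be worked through explicitly. The 
following is a minimal sketch, not part of the patch: the class and method 
names (BuildTimeComparison, total, timeSaved) are made up for illustration, 
and the figures are the ones quoted in this comment.

```java
import java.time.Duration;

class BuildTimeComparison {
    /** Builds a Duration from an h:mm:ss.SSS total given as separate fields. */
    static Duration total(long hours, long minutes, long seconds, long millis) {
        return Duration.ofHours(hours)
                .plusMinutes(minutes)
                .plusSeconds(seconds)
                .plusMillis(millis);
    }

    /** Time saved by the parallel run relative to the baseline run. */
    static Duration timeSaved(Duration baseline, Duration parallel) {
        return baseline.minus(parallel);
    }

    public static void main(String[] args) {
        Duration baseline = total(1, 49, 4, 397);   // prebuild #535
        Duration parallel = total(1, 13, 18, 269);  // run with the patch applied
        Duration saved = timeSaved(baseline, parallel);

        // Split the difference into whole minutes and leftover seconds.
        System.out.println("Saved: " + saved.toMinutes() + " min "
                + (saved.getSeconds() % 60) + " s");

        // Tests that ran in the baseline but reported no result in the parallel run.
        System.out.println("Tests without results: " + (785 - 771));
    }
}
```

With these inputs the difference comes to 35 min 46 s, and 14 tests are 
missing from the parallel run's count.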


> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v2.patch

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171871#comment-13171871
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507835/5064.v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -152 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol
  org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
  org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan
  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
  org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol
  org.apache.hadoop.hbase.util.TestHBaseFsck
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/538//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/538//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/538//console

This message is automatically generated.

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Created] (HBASE-5065) wrong IllegalArgumentException thrown when creating an 'HServerAddress' with an un-reachable hostname

2011-12-18 Thread Eran Hirsch (Created) (JIRA)
wrong IllegalArgumentException thrown when creating an 'HServerAddress' with an 
un-reachable hostname
-

 Key: HBASE-5065
 URL: https://issues.apache.org/jira/browse/HBASE-5065
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.90.4
Reporter: Eran Hirsch
Priority: Trivial


When trying to build an 'HServerAddress' object with an unresolvable hostname:

e.g. new HServerAddress("www.IAMUNREACHABLE.com:80")

a call to 'getResolvedAddress' would cause the 'InetSocketAddress' c'tor to 
throw an IllegalArgumentException because it is called with a null 'hostname' 
parameter.
This happens because there is no null-check after the static 
'getBindAddressInternal' method returns a null value when the hostname is 
unresolved.

This is a trivial bug because HServerAddress is expected to throw this kind of 
exception when this error occurs, but it is thrown "for the wrong reason". The 
method 'checkBindAddressCanBeResolved' should be the one throwing the 
exception (with a slightly different reason). As a result, that method call is 
redundant: it always succeeds in the current flow, because the case it checks 
for has already been "checked" by the preceding 'getResolvedAddress' call.

In short:
an IllegalArgumentException is thrown with reason: "hostname can't be null" 
from the InetSocketAddress c'tor
INSTEAD OF
an IllegalArgumentException with reason: "Could not resolve the DNS name of 
[BADHOSTNAME]:[PORT]" from HServerAddress's checkBindAddressCanBeResolved 
method.

Stack trace:
java.lang.IllegalArgumentException: hostname can't be null
    at java.net.InetSocketAddress.<init>(InetSocketAddress.java:139) ~[na:1.7.0_02]
    at org.apache.hadoop.hbase.HServerAddress.getResolvedAddress(HServerAddress.java:108) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:64) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.dataToHServerAddress(RootRegionTracker.java:82) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:73) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:579) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:594) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:173) ~[hbase-0.90.4.jar:0.90.4]
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:147) ~[hbase-0.90.4.jar:0.90.4]
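
The kind of null-check the report asks for could look roughly like the sketch 
below. This is an illustration under stated assumptions, not the HBase code: 
the class ResolvedAddressSketch, the method resolve, and the resolver 
parameter (a stand-in for the internal lookup that returns null on failure) 
are all hypothetical. Only the InetSocketAddress constructor is a real JDK 
API.

```java
import java.net.InetSocketAddress;
import java.util.function.Function;

class ResolvedAddressSketch {
    /**
     * Builds a socket address for host:port. The resolver is a hypothetical
     * stand-in for the lookup step; it returns null when the name cannot be
     * resolved.
     */
    static InetSocketAddress resolve(String host, int port,
                                     Function<String, String> resolver) {
        String bindAddress = resolver.apply(host);
        if (bindAddress == null) {
            // Explicit null-check: report the real cause here instead of
            // letting the InetSocketAddress constructor reject a null hostname.
            throw new IllegalArgumentException(
                    "Could not resolve the DNS name of " + host + ":" + port);
        }
        return new InetSocketAddress(bindAddress, port);
    }

    public static void main(String[] args) {
        // Simulate an unresolvable hostname without touching the network.
        Function<String, String> unresolvable = h -> null;
        try {
            resolve("www.IAMUNREACHABLE.com", 80, unresolvable);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }

        // Without the check, the JDK constructor is what complains.
        try {
            new InetSocketAddress((String) null, 80);
        } catch (IllegalArgumentException e) {
            System.out.println("JDK constructor rejected a null hostname");
        }
    }
}
```

Passing the lookup in as a function keeps the sketch testable offline; the 
real class would call its own resolution method at that point.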







[jira] [Commented] (HBASE-5058) Allow HBaseAdmin to use an existing connection

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171932#comment-13171932
 ] 

Lars Hofhansl commented on HBASE-5058:
--

@stack: are you OK with this patch? 




> Allow HBaseAdmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Commented] (HBASE-5058) Allow HBaseAdmin to use an existing connection

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171941#comment-13171941
 ] 

Lars Hofhansl commented on HBASE-5058:
--

Another option is to unwrap the UndeclaredThrowableException and rethrow the 
IOException (if any) that caused it. getMaster would then declare IOException 
rather than ZooKeeperException in its throws clause. That might be the cleanest 
approach.
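
The unwrapping idea can be sketched as follows. This is a minimal 
illustration, not the actual patch: the class UnwrapSketch and the helper 
unwrapIo are hypothetical names, and the Supplier stands in for whatever call 
raises the UndeclaredThrowableException. Only UndeclaredThrowableException 
and IOException are real JDK types.

```java
import java.io.IOException;
import java.lang.reflect.UndeclaredThrowableException;
import java.util.function.Supplier;

class UnwrapSketch {
    /**
     * Runs the call. If it fails with an UndeclaredThrowableException whose
     * cause is an IOException, rethrows that IOException; any other failure
     * propagates unchanged.
     */
    static <T> T unwrapIo(Supplier<T> call) throws IOException {
        try {
            return call.get();
        } catch (UndeclaredThrowableException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw e; // not an I/O problem: leave it wrapped
        }
    }

    public static void main(String[] args) {
        // Simulated failure: an IOException hidden inside the wrapper.
        Supplier<String> failing = () -> {
            throw new UndeclaredThrowableException(
                    new IOException("simulated connection failure"));
        };
        try {
            unwrapIo(failing);
        } catch (IOException e) {
            System.out.println("Caught IOException: " + e.getMessage());
        }
    }
}
```

A caller written this way only needs IOException in its throws clause, which 
is the simplification the comment above is after.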


> Allow HBaseAdmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v3.patch

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch, 5064.v3.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch, 5064.v3.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch, 5064.v3.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5058) Allow HBaseAdmin to use an existing connection

2011-12-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171959#comment-13171959
 ] 

stack commented on HBASE-5058:
--

+1

> Allow HBaseAdmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171964#comment-13171964
 ] 

stack commented on HBASE-5064:
--

Nice improvement.  Should we up the heap size for maven when running tests in 
parallel?

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch, 5064.v3.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-5058) Allow HBaseAdmin to use an existing connection

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171965#comment-13171965
 ] 

Lars Hofhansl commented on HBASE-5058:
--

+1 for the patch, or the latest comment? :)

(I did explore a patch for the last comment, but it turns out that this will 
either have API implications, since user code would now need to deal with 
IOException, or need a lot of changes to map IOException back to other 
exceptions.)


> Allow HBaseAdmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Commented] (HBASE-4970) Allow better control of resource consumption in HTable (backport HBASE-4805 to 0.90 branch)

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171968#comment-13171968
 ] 

Lars Hofhansl commented on HBASE-4970:
--

Ok... In that case, let's just add the new parameter everywhere. Sorry for 
leading you astray.

I'll commit the initial patch and the 0.92 and trunk patches later today, 
unless somebody objects.


> Allow better control of resource consumption in HTable (backport HBASE-4805 
> to 0.90 branch)
> ---
>
> Key: HBASE-4970
> URL: https://issues.apache.org/jira/browse/HBASE-4970
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Trivial
> Fix For: 0.90.6
>
> Attachments: HBASE-4970_Branch90.patch, 
> HBASE-4970_Branch90_V1_trial.patch, HBASE-4970_Branch90_V2.patch, 
> HBASE-4970_Branch92_V2.patch, HBASE-4970_Trunk_V2.patch
>
>
> In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth 
> of RES slowed down.
> Why does increasing the keepAliveTime of the HBase thread pool slow down the 
> occurrence of our problem [RES value increase]?
> You can go through the source of sun.nio.ch.Util. Every thread holds 3 
> soft references to direct buffers (mustangsrc) for reuse. The code calls 
> these 3 soft references the buffer cache. If all buffers are occupied or 
> none is suitable in size when a new request comes in, a new direct buffer is 
> allocated. After the request is served, the bigger buffer replaces the 
> smaller one in the buffer cache. The replaced buffer is released.
> So I think we can add a parameter to change the keepAliveTime of the HTable 
> thread pool.
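
A minimal sketch of what such a parameter could look like, using only 
java.util.concurrent. The class name, the property key and the use of 
java.util.Properties are assumptions made for illustration; they are not taken 
from the attached patches or from HTable itself.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: builds a worker pool whose keepAliveTime is configurable.
class KeepAlivePoolSketch {
    // Assumed property name, for illustration only.
    static final String KEEP_ALIVE_KEY = "htable.threads.keepalivetime";
    static final long DEFAULT_KEEP_ALIVE_SECONDS = 60L;

    static ThreadPoolExecutor newPool(Properties conf, int maxThreads) {
        // Fall back to the old hard-coded 60 s when the property is not set.
        long keepAliveSeconds = Long.parseLong(
            conf.getProperty(KEEP_ALIVE_KEY, Long.toString(DEFAULT_KEEP_ALIVE_SECONDS)));
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
            new SynchronousQueue<Runnable>());
        // Let idle core threads expire too; requires keepAliveSeconds > 0.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

With a longer keepAliveTime, idle threads (and whatever per-thread buffers they 
hold) are recycled less often, which is the effect described in the report.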





[jira] [Commented] (HBASE-5062) Missing logons if security is enabled

2011-12-18 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171969#comment-13171969
 ] 

stack commented on HBASE-5062:
--

+1 on commit trunk and 0.92

> Missing logons if security is enabled
> -
>
> Key: HBASE-5062
> URL: https://issues.apache.org/jira/browse/HBASE-5062
> Project: HBase
>  Issue Type: Bug
>  Components: rest, security, thrift
>Affects Versions: 0.92.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Attachments: HBASE-5062-v2.patch, HBASE-5062.patch
>
>
> Somehow the attached changes are missing from the security integration. 





[jira] [Commented] (HBASE-5064) use surefire tests parallelization

2011-12-18 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171978#comment-13171978
 ] 

Hadoop QA commented on HBASE-5064:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12507849/5064.v3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -152 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.client.TestInstantSchemaChange
  org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/539//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/539//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/539//console

This message is automatically generated.

> use surefire tests parallelization
> --
>
> Key: HBASE-5064
> URL: https://issues.apache.org/jira/browse/HBASE-5064
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5064.patch, 5064.patch, 5064.v2.patch, 5064.v3.patch
>
>
> To be tried multiple times on hadoop-qa before committing.





[jira] [Commented] (HBASE-4970) Allow better control of resource consumption in HTable (backport HBASE-4805 to 0.90 branch)

2011-12-18 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171986#comment-13171986
 ] 

gaojinchao commented on HBASE-4970:
---

No problem, Thanks for your work!

> Allow better control of resource consumption in HTable (backport HBASE-4805 
> to 0.90 branch)
> ---
>
> Key: HBASE-4970
> URL: https://issues.apache.org/jira/browse/HBASE-4970
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Trivial
> Fix For: 0.90.6
>
> Attachments: HBASE-4970_Branch90.patch, 
> HBASE-4970_Branch90_V1_trial.patch, HBASE-4970_Branch90_V2.patch, 
> HBASE-4970_Branch92_V2.patch, HBASE-4970_Trunk_V2.patch
>
>
> In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth 
> of RES slowed down.
> Why does increasing the keepAliveTime of the HBase thread pool slow down the 
> occurrence of our problem [RES value increase]?
> You can go through the source of sun.nio.ch.Util. Every thread holds 3 
> soft references to direct buffers (mustangsrc) for reuse. The code calls 
> these 3 soft references the buffer cache. If all buffers are occupied or 
> none is suitable in size when a new request comes in, a new direct buffer is 
> allocated. After the request is served, the bigger buffer replaces the 
> smaller one in the buffer cache. The replaced buffer is released.
> So I think we can add a parameter to change the keepAliveTime of the HTable 
> thread pool.





[jira] [Commented] (HBASE-5001) Improve the performance of block cache keys

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171987#comment-13171987
 ] 

Lars Hofhansl commented on HBASE-5001:
--

Did the profiler show any other low-hanging fruit?
Do you have 0.90 comparisons? (Maybe we can figure out what causes the 
difference)

> Improve the performance of block cache keys
> ---
>
> Key: HBASE-5001
> URL: https://issues.apache.org/jira/browse/HBASE-5001
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 5001-0.92.txt, 5001-v1.txt, 5001-v2.txt
>
>
> Doing a pure random read test on data that's 100% block cache, I see that we 
> are spending quite some time in getBlockCacheKey:
> {quote}
> "IPC Server handler 19 on 62023" daemon prio=10 tid=0x7fe0501ff800 nid=0x6c87 runnable [0x7fe0577f6000]
>    java.lang.Thread.State: RUNNABLE
>   at java.util.Arrays.copyOf(Arrays.java:2882)
>   at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>   at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
>   at java.lang.StringBuilder.append(StringBuilder.java:119)
>   at org.apache.hadoop.hbase.io.hfile.HFile.getBlockCacheKey(HFile.java:457)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:249)
>   at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:209)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:521)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:536)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:178)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:111)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekExactly(StoreFileScanner.java:219)
>   at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:80)
>   at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1689)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:2857)
> {quote}
> Since the HFile name size is known and the offset is a long, it should be 
> possible to allocate exactly what we need. Maybe use byte[] as the key and 
> drop the separator too.
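
A rough sketch of the two ideas in the description: allocate exactly what the 
key needs, or avoid building a string at all. The class name, the '_' 
separator and the 20-character bound for a printed long are assumptions for 
illustration; this is not the code from the attached patches.

```java
// Hypothetical cache key: keeps the file name and offset as fields, so no
// string concatenation is needed on the read path.
final class BlockKeySketch {
    private final String hfileName;
    private final long offset;

    BlockKeySketch(String hfileName, long offset) {
        this.hfileName = hfileName;
        this.offset = offset;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof BlockKeySketch)) return false;
        BlockKeySketch other = (BlockKeySketch) o;
        return offset == other.offset && hfileName.equals(other.hfileName);
    }

    @Override
    public int hashCode() {
        // Combine the name hash with both halves of the long offset.
        return hfileName.hashCode() * 127 + (int) (offset ^ (offset >>> 32));
    }

    // Alternative that keeps a String key but sizes the builder up front:
    // a long prints in at most 20 characters (19 digits plus a sign), so the
    // builder never has to grow and copy its array.
    static String stringKey(String hfileName, long offset) {
        StringBuilder sb = new StringBuilder(hfileName.length() + 1 + 20);
        return sb.append(hfileName).append('_').append(offset).toString();
    }
}
```

Either variant avoids the expandCapacity/copyOf frames at the top of the trace; 
the field-based key also skips the number-to-string conversion.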





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172002#comment-13172002
 ] 

Zhihong Yu commented on HBASE-5063:
---

TestTableMapReduce (two of them) on TRUNK run successfully on MacBook.

Recently mapred.TestTableMapReduce.testMultiRegionTable was mysteriously 
reported as a failure by Hadoop QA (not because of 'Too many open files').
We should find out why.

> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we replace a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172041#comment-13172041
 ] 

Zhihong Yu commented on HBASE-4720:
---

The diff was made against 0.90.5.
I got the following when trying to apply it to TRUNK:
{code}
1 out of 2 hunks FAILED -- saving rejects to file 
src/main/java/org/apache/hadoop/hbase/rest/TableResource.java.rej
{code}

I suggest the following steps based on TRUNK:
1. continue with refactoring
2. add unit tests for the new XXResource classes (remember to add test category)
3. let Hadoop QA test the patch

Minor comment: year isn't needed for license

Thanks for your effort Mubarak.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
> Attachments: HBASE-4720.v1.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.
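
To make the requested semantics concrete, here is a small in-memory sketch of 
check-and-put and check-and-delete. It models a single cell per key with a 
ConcurrentHashMap; the class and method names are hypothetical and this is not 
the HBase client or REST API, only an illustration of the atomic 
compare-then-write behaviour the description asks for.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory model of a sentinel table with atomic updates.
class CheckAndPutSketch {
    private final ConcurrentMap<String, String> cells =
        new ConcurrentHashMap<String, String>();

    // Writes newValue only if the current value equals expected.
    // expected == null means "the cell must not exist yet".
    boolean checkAndPut(String key, String expected, String newValue) {
        if (expected == null) {
            return cells.putIfAbsent(key, newValue) == null;
        }
        return cells.replace(key, expected, newValue);
    }

    // Deletes the cell only if the current value equals expected.
    boolean checkAndDelete(String key, String expected) {
        return expected != null && cells.remove(key, expected);
    }

    String get(String key) {
        return cells.get(key);
    }
}
```

Without an atomic operation of this kind, a client has to read, compare and 
write in separate requests, which is where the race conditions mentioned above 
come from.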





[jira] [Commented] (HBASE-5058) Allow HBaseAdmin to use an existing connection

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172042#comment-13172042
 ] 

Zhihong Yu commented on HBASE-5058:
---

Since this JIRA is targeting TRUNK, I think an API change should be allowed if 
we agree it is on the right track.

> Allow HBaseAdmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Commented] (HBASE-5065) wrong IllegalArgumentException thrown when creating an 'HServerAddress' with an un-reachable hostname

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172043#comment-13172043
 ] 

Zhihong Yu commented on HBASE-5065:
---

@Eran:
Nice summary.

If you can provide a patch, I will assign this JIRA to you.

> wrong IllegalArgumentException thrown when creating an 'HServerAddress' with 
> an un-reachable hostname
> -
>
> Key: HBASE-5065
> URL: https://issues.apache.org/jira/browse/HBASE-5065
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.90.4
>Reporter: Eran Hirsch
>Priority: Trivial
>
> When trying to build an 'HServerAddress' object with an unresolvable hostname:
> e.g. new HServerAddress("www.IAMUNREACHABLE.com:80")
> a call to 'getResolvedAddress' would cause the 'InetSocketAddress' c'tor to 
> throw an IllegalArgumentException because it is called with a null 'hostname' 
> parameter.
> This happens because there is no null-check after the static 
> 'getBindAddressInternal' method returns a null value when the hostname is 
> unresolved.
> This is a trivial bug because HServerAddress is expected to throw 
> this kind of exception when this error occurs, but it is thrown "for the 
> wrong reason". The method 'checkBindAddressCanBeResolved' should be the one 
> throwing the exception (and give a slightly different reason). As a result, 
> the call to that method is redundant: it always succeeds in the current 
> flow, because the case it checks is already "checked" for by the preceding 
> 'getResolvedAddress' call.
> In short:
> an IllegalArgumentException is thrown with reason: "hostname can't be null" 
> from the InetSocketAddress c'tor
> INSTEAD OF
> an IllegalArgumentException with reason: "Could not resolve the DNS name of 
> [BADHOSTNAME]:[PORT]" from HServerAddress's checkBindAddressCanBeResolved 
> method.
> Stack trace:
> java.lang.IllegalArgumentException: hostname can't be null
>   at java.net.InetSocketAddress.<init>(InetSocketAddress.java:139) ~[na:1.7.0_02]
>   at org.apache.hadoop.hbase.HServerAddress.getResolvedAddress(HServerAddress.java:108) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:64) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.dataToHServerAddress(RootRegionTracker.java:82) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:73) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:579) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:688) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:594) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:173) ~[hbase-0.90.4.jar:0.90.4]
>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:147) ~[hbase-0.90.4.jar:0.90.4]
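
The missing null check described in this issue could be sketched as follows. 
The class and method are hypothetical stand-ins, not the HServerAddress source: 
resolvedHost plays the role of the value returned by 'getBindAddressInternal', 
which is null when the hostname cannot be resolved.

```java
import java.net.InetSocketAddress;

// Hypothetical helper illustrating where the null check would go.
class ResolveCheckSketch {
    static InetSocketAddress toSocketAddress(String hostAndPort,
                                             String resolvedHost, int port) {
        // Check before calling the InetSocketAddress constructor, so the
        // caller sees the real cause instead of "hostname can't be null".
        if (resolvedHost == null) {
            throw new IllegalArgumentException(
                "Could not resolve the DNS name of " + hostAndPort);
        }
        return new InetSocketAddress(resolvedHost, port);
    }
}
```

The exception type stays the same, so callers that already catch 
IllegalArgumentException are unaffected; only the reported reason changes.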





[jira] [Updated] (HBASE-4970) Add a parameter so that keepAliveTime of Htable thread pool can be changed

2011-12-18 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4970:
--

Summary: Add a parameter so that keepAliveTime of Htable thread pool can be 
changed  (was: Allow better control of resource consumption in HTable (backport 
HBASE-4805 to 0.90 branch))

Switching subject back.

> Add a parameter so that keepAliveTime of Htable thread pool can be changed
> --
>
> Key: HBASE-4970
> URL: https://issues.apache.org/jira/browse/HBASE-4970
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Trivial
> Fix For: 0.90.6
>
> Attachments: HBASE-4970_Branch90.patch, 
> HBASE-4970_Branch90_V1_trial.patch, HBASE-4970_Branch90_V2.patch, 
> HBASE-4970_Branch92_V2.patch, HBASE-4970_Trunk_V2.patch
>
>
> In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth 
> of RES slowed down.
> Why does increasing the keepAliveTime of the HBase thread pool slow down the 
> occurrence of our problem [RES value increase]?
> You can go through the source of sun.nio.ch.Util. Every thread holds 3 
> soft references to direct buffers (mustangsrc) for reuse. The code calls 
> these 3 soft references the buffer cache. If all buffers are occupied or 
> none is suitable in size when a new request comes in, a new direct buffer is 
> allocated. After the request is served, the bigger buffer replaces the 
> smaller one in the buffer cache. The replaced buffer is released.
> So I think we can add a parameter to change the keepAliveTime of the HTable 
> thread pool.





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172045#comment-13172045
 ] 

Lars Hofhansl commented on HBASE-5063:
--

Looks like this.masterAddressManager.getMasterAddress() could return null (see 
first loop), so this could lead to an NPE.

I am wondering we shouldn't just fold the check from the first loop (where we 
get masterServerName) into the 2nd loop and completely remove the first loop.
I.e. if masterServerName is null, continue the loop, sleep for a bit... Means 
that the sleep needs to be pulled out of the try/catch. If masterServerName is 
not null, try to connect.
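
Roughly, the single-loop shape suggested above might look like the sketch 
below. It is only an illustration: connectToMaster, the Supplier standing in 
for masterAddressManager.getMasterAddress(), the Function standing in for the 
connection attempt, and the attempt limit are all hypothetical, and the Java 8 
functional interfaces are used purely to keep the example self-contained.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical single retry loop: null check and connect attempt together.
class MasterConnectLoop {
    static <T> T connectToMaster(Supplier<String> masterAddress,
                                 Function<String, T> connect,
                                 long sleepMillis, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (attempt > 0) {
                // The sleep sits outside the try/catch around connect, so it
                // runs both after a null address and after a failed connect.
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            String address = masterAddress.get();
            if (address == null) {
                continue; // no master known yet: wait and look again
            }
            try {
                return connect.apply(address);
            } catch (RuntimeException e) {
                // Connect failed: re-read the address on the next iteration,
                // since the master may have moved.
            }
        }
        return null;
    }
}
```

Because the address is re-read on every iteration, a failover to a backup 
master is picked up on the next pass instead of retrying the old address.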


> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we replace a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Commented] (HBASE-4970) Add a parameter so that keepAliveTime of Htable thread pool can be changed

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172050#comment-13172050
 ] 

Lars Hofhansl commented on HBASE-4970:
--

About to commit. What do we do with the CHANGES.txt file now? This is a 0.90 
change, so I want to add it to the 0.90 section in trunk and 0.92 as well. They 
still have 0.90.5 marked as unreleased.

In 0.90 there is a 0.90.6 entry also marked as unreleased (in addition to 
0.90.5, which is also marked as unreleased). Should I add a 0.90.6 row in 0.92 
and in trunk?


> Add a parameter so that keepAliveTime of Htable thread pool can be changed
> --
>
> Key: HBASE-4970
> URL: https://issues.apache.org/jira/browse/HBASE-4970
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Trivial
> Fix For: 0.90.6
>
> Attachments: HBASE-4970_Branch90.patch, 
> HBASE-4970_Branch90_V1_trial.patch, HBASE-4970_Branch90_V2.patch, 
> HBASE-4970_Branch92_V2.patch, HBASE-4970_Trunk_V2.patch
>
>
> In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth 
> of RES slowed down.
> Why does increasing the keepAliveTime of the HBase thread pool slow down the 
> occurrence of our problem [RES value increase]?
> You can go through the source of sun.nio.ch.Util. Every thread holds 3 
> soft references to direct buffers (mustangsrc) for reuse. The code calls 
> these 3 soft references the buffer cache. If all buffers are occupied or 
> none is suitable in size when a new request comes in, a new direct buffer is 
> allocated. After the request is served, the bigger buffer replaces the 
> smaller one in the buffer cache. The replaced buffer is released.
> So I think we can add a parameter to change the keepAliveTime of the HTable 
> thread pool.





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172052#comment-13172052
 ] 

Zhihong Yu commented on HBASE-5063:
---

bq. I am wondering we shouldn't just fold
I guess what you meant was 'wondering (if) we should just'

> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we replace a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Commented] (HBASE-4970) Add a parameter so that keepAliveTime of Htable thread pool can be changed

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172053#comment-13172053
 ] 

Zhihong Yu commented on HBASE-4970:
---

I think the likelihood of 0.90.5 RC going through is high.
So we should add 0.90.6 section to CHANGES.txt of 0.92 and TRUNK.

> Add a parameter so that keepAliveTime of Htable thread pool can be changed
> --
>
> Key: HBASE-4970
> URL: https://issues.apache.org/jira/browse/HBASE-4970
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Trivial
> Fix For: 0.90.6
>
> Attachments: HBASE-4970_Branch90.patch, 
> HBASE-4970_Branch90_V1_trial.patch, HBASE-4970_Branch90_V2.patch, 
> HBASE-4970_Branch92_V2.patch, HBASE-4970_Trunk_V2.patch
>
>
> In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth 
> of RES slowed down.
> Why does increasing the keepAliveTime of the HBase thread pool slow down our 
> problem (the increase of the RES value)?
> See the source of sun.nio.ch.Util (mustangsrc). Every thread holds 3 soft 
> references to direct buffers for reuse; the code calls these 3 soft 
> references the buffercache. If all cached buffers are occupied, or none is 
> of a suitable size, and a new request comes in, a new direct buffer is 
> allocated. After the request is served, the bigger buffer replaces the 
> smaller one in the buffercache, and the replaced buffer is released.
> So I think we can add a parameter to change the keepAliveTime of the HTable 
> thread pool.
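
The proposal in the description can be sketched roughly as follows. This is an illustration only, not the committed patch: the class name `KeepAlivePoolSketch`, the property key `example.pool.keepalive.seconds`, and the use of `java.util.Properties` in place of an HBase configuration object are all assumptions made for the example. Only `ThreadPoolExecutor` and its methods are real JDK API.

```java
import java.util.Properties;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch only: shows a thread pool whose keepAliveTime comes from configuration.
final class KeepAlivePoolSketch {
  // Hypothetical property name, used only for this illustration.
  static final String KEEP_ALIVE_KEY = "example.pool.keepalive.seconds";

  static ThreadPoolExecutor newPool(Properties conf, int maxThreads) {
    // Fall back to 60 seconds, the value mentioned in the description.
    long keepAliveSeconds = Long.parseLong(conf.getProperty(KEEP_ALIVE_KEY, "60"));
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>());
    // Let idle core threads expire as well. The JDK requires a keep-alive
    // time greater than zero for this call, otherwise it throws
    // IllegalArgumentException.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }
}
```

With a larger keep-alive value, idle worker threads live longer, so whatever per-thread state they hold (such as the buffer cache described above) is reused instead of being rebuilt by freshly created threads.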





[jira] [Commented] (HBASE-5058) Allow HBaseAmin to use an existing connection

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172054#comment-13172054
 ] 

Lars Hofhansl commented on HBASE-5058:
--

Hmm... Good point. Although I now think my current patch is the safest. I think 
I would just wrap UndeclaredThrowableException (rather than RuntimeException) 
into a MasterNotRunning exception, but that's it.
There's also the question of what happens when the master moves (for example, 
when a second master takes over). Since the HConnectionImplementation caches 
the HMasterInterface, it will probably never switch.

In general this code is a bit messy, which is why I am flip-flopping here... No 
solution seems quite right.
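
The wrapping idea mentioned above can be illustrated with a plain JDK dynamic proxy. This is a sketch under assumptions, not code from the patch: `MasterProbe` and `MasterUnavailableException` are hypothetical stand-ins (the latter for HBase's MasterNotRunningException), and the failing proxy only simulates an RPC stub whose handler throws a checked exception that the interface does not declare.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;

// Sketch only: catch UndeclaredThrowableException specifically, not RuntimeException.
final class UndeclaredWrapSketch {
  /** Hypothetical stand-in for HBase's MasterNotRunningException. */
  static final class MasterUnavailableException extends IOException {
    MasterUnavailableException(Throwable cause) {
      super("master not running", cause);
    }
  }

  /** Hypothetical RPC-style interface; the method declares no checked exception. */
  interface MasterProbe {
    boolean isMasterRunning();
  }

  /** Builds a proxy whose handler always fails with a checked IOException. */
  static MasterProbe failingProxy() {
    InvocationHandler handler = (proxy, method, args) -> {
      throw new IOException("connection refused");
    };
    return (MasterProbe) Proxy.newProxyInstance(
        MasterProbe.class.getClassLoader(),
        new Class<?>[] {MasterProbe.class},
        handler);
  }

  static boolean check(MasterProbe probe) throws MasterUnavailableException {
    try {
      return probe.isMasterRunning();
    } catch (UndeclaredThrowableException e) {
      // The proxy wrapped the undeclared checked exception; unwrap and rethrow
      // it as a declared, more specific exception.
      throw new MasterUnavailableException(e.getUndeclaredThrowable());
    }
  }
}
```

Catching only `UndeclaredThrowableException` leaves genuine programming errors (for example a `NullPointerException`) untouched, which is the narrower behavior the comment argues for.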


> Allow HBaseAmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Commented] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172056#comment-13172056
 ] 

Lars Hofhansl commented on HBASE-5063:
--

Yes :)

> RegionServers fail to report to backup HMaster after primary goes down.
> ---
>
> Key: HBASE-5063
> URL: https://issues.apache.org/jira/browse/HBASE-5063
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Critical
> Attachments: HBASE-5063.patch
>
>
> # Set up a cluster with two HMasters.
> # Observe that HM1 is up and that all RSs are in the RegionServer list on 
> the web page.
> # Kill (not even -9) the active HMaster.
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up, but RegionServers never 
> report on the web page.  Existing connections are fine.  New connections 
> cannot find regionservers.
> Note: 
> * If we bring up a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.





[jira] [Resolved] (HBASE-4970) Add a parameter so that keepAliveTime of Htable thread pool can be changed

2011-12-18 Thread Lars Hofhansl (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-4970.
--

   Resolution: Fixed
Fix Version/s: 0.92.1
   0.94.0
 Hadoop Flags: Reviewed

Committed to 0.90, 0.92, and trunk.
Thanks for the patch and your patience, gaojinchao.


> Add a parameter so that keepAliveTime of Htable thread pool can be changed
> --
>
> Key: HBASE-4970
> URL: https://issues.apache.org/jira/browse/HBASE-4970
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Trivial
> Fix For: 0.94.0, 0.92.1, 0.90.6
>
> Attachments: HBASE-4970_Branch90.patch, 
> HBASE-4970_Branch90_V1_trial.patch, HBASE-4970_Branch90_V2.patch, 
> HBASE-4970_Branch92_V2.patch, HBASE-4970_Trunk_V2.patch
>
>
> In my cluster, I changed keepAliveTime from 60 s to 3600 s, and the growth 
> of RES slowed down.
> Why does increasing the keepAliveTime of the HBase thread pool slow down our 
> problem (the increase of the RES value)?
> See the source of sun.nio.ch.Util (mustangsrc). Every thread holds 3 soft 
> references to direct buffers for reuse; the code calls these 3 soft 
> references the buffercache. If all cached buffers are occupied, or none is 
> of a suitable size, and a new request comes in, a new direct buffer is 
> allocated. After the request is served, the bigger buffer replaces the 
> smaller one in the buffercache, and the replaced buffer is released.
> So I think we can add a parameter to change the keepAliveTime of the HTable 
> thread pool.





[jira] [Commented] (HBASE-5058) Allow HBaseAmin to use an existing connection

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172060#comment-13172060
 ] 

Zhihong Yu commented on HBASE-5058:
---

{code}
+// If didn't get the master and this is a managed connection, give up.
+// Otherwise give subsequent calls a chance to try again.
+this.masterChecked = managed || master != null;
{code}
Judging from the first sentence of the javadoc above, the assignment should be
{code}
+this.masterChecked = !managed || master != null;
{code}
Basically the negation of RHS becomes: managed && master == null

> Allow HBaseAmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Commented] (HBASE-5058) Allow HBaseAmin to use an existing connection

2011-12-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172061#comment-13172061
 ] 

Lars Hofhansl commented on HBASE-5058:
--

Actually it is the other way round... :)

masterChecked is set to true to avoid trying to retrieve the master in the 
future. This is fine for the managed HConnection as it will just be removed and 
another is created when needed.
For an HConnection that is passed from the outside, it has to be possible to 
try again. So if the HConnection is managed we retain the old behavior (i.e. 
only try once, give up after that, even if that failed).
For an unmanaged connection we try again unless we actually found a master. So 
masterChecked is set to true if either the connection is managed (always avoid 
retrying), or we found a master.
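
The rule described above can be written out as a small truth table. This is only an illustration: `MasterCheckSketch` and `computeMasterChecked` are hypothetical names, not code from the patch, and the `masterFound` flag stands in for `master != null`.

```java
// Sketch only: spells out "masterChecked = managed || master != null".
final class MasterCheckSketch {
  /**
   * @param managed     true if the connection is a managed HConnection
   * @param masterFound stands in for "master != null" after the lookup
   * @return true if later calls should skip the master lookup
   */
  static boolean computeMasterChecked(boolean managed, boolean masterFound) {
    // Managed: always stop retrying (the old behavior).
    // Unmanaged: stop retrying only once a master was actually found.
    return managed || masterFound;
  }

  public static void main(String[] args) {
    boolean[] values = {true, false};
    for (boolean managed : values) {
      for (boolean found : values) {
        System.out.println("managed=" + managed + " masterFound=" + found
            + " -> masterChecked=" + computeMasterChecked(managed, found));
      }
    }
  }
}
```

The only combination that leaves masterChecked false is an unmanaged connection that did not find a master, which is exactly the case that should be allowed to try again.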


> Allow HBaseAmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Commented] (HBASE-5058) Allow HBaseAmin to use an existing connection

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172068#comment-13172068
 ] 

Zhihong Yu commented on HBASE-5058:
---

The above description is clearer than the javadoc in the patch (esp. 'give up').

> Allow HBaseAmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Updated] (HBASE-5060) HBase client is blocked forever

2011-12-18 Thread gaojinchao (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-5060:
--

Attachment: HBASE-5060_trunk.patch

Patch for trunk


> HBase client is blocked forever
> ---
>
> Key: HBASE-5060
> URL: https://issues.apache.org/jira/browse/HBASE-5060
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Critical
> Fix For: 0.90.6
>
> Attachments: HBASE-5060_Branch90trial.patch, HBASE-5060_trunk.patch
>
>
> After a temporary network failure on the client side had cleared, I found 
> that my client thread was still blocked.
> The stack and logs below show that the function "tableExists" was using an 
> invalid CatalogTracker.
> Blocked stack:
> "WriteHbaseThread33" prio=10 tid=0x7f76bc27a800 nid=0x2540 in Object.wait() [0x7f76af4f3000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>  at java.lang.Object.wait(Native Method)
>  at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:331)
>  - locked <0x7f7a67817c98> (a java.util.concurrent.atomic.AtomicBoolean)
>  at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:366)
>  at org.apache.hadoop.hbase.catalog.MetaReader.tableExists(MetaReader.java:427)
>  at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:164)
>  at com.huawei.hdi.hbase.HbaseFileOperate.checkHtableState(Unknown Source)
>  at com.huawei.hdi.hbase.HbaseReOper.reCreateHtable(Unknown Source)
>  - locked <0x7f7a4c5dc578> (a com.huawei.hdi.hbase.HbaseReOper)
>  at com.huawei.hdi.hbase.HbaseFileOperate.writeToHbase(Unknown Source)
>  at com.huawei.hdi.hbase.WriteHbaseThread.run(Unknown Source)
> ZooKeeperNodeTracker does not propagate the KeeperException to its caller, 
> so CatalogTracker assumes that ZooKeeperNodeTracker started successfully and 
> continues processing.
> [WriteHbaseThread33]2011-12-16 17:07:33,153[WARN ]  | hconnection-0x334129cf6890051-0x334129cf6890051-0x334129cf6890051 Unable to get data of znode /hbase/root-region-server | org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:557)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>  at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:931)
>  at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>  at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:73)
>  at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:136)
>  at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:111)
>  at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:162)
>  at com.huawei.hdi.hbase.HbaseFileOperate.checkHtableState(Unknown Source)
>  at com.huawei.hdi.hbase.HbaseReOper.reCreateHtable(Unknown Source)
>  at com.huawei.hdi.hbase.HbaseFileOperate.writeToHbase(Unknown Source)
>  at com.huawei.hdi.hbase.WriteHbaseThread.run(Unknown Source)
> [WriteHbaseThread33]2011-12-16 17:07:33,361[ERROR]  | hconnection-0x334129cf6890051-0x334129cf6890051-0x334129cf6890051 Received unexpected KeeperException, re-throwing exception | org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.keeperException(ZooKeeperWatcher.java:385)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>  at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:931)
>  at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>  at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:73)
>  at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:136)
>  at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:111)
>  at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:162)
>  at com.huawei.hdi.hbase.HbaseFileOperate.checkHtableState(Unknown Source)
>  at com.huawei.hd

[jira] [Commented] (HBASE-5060) HBase client is blocked forever

2011-12-18 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172069#comment-13172069
 ] 

gaojinchao commented on HBASE-5060:
---

Test case passed.
My test code:
try {
  HBaseAdmin hbase = new HBaseAdmin(config);
  // Poll tableExists() forever so a blocked client becomes visible.
  while (true) {
    try {
      if (hbase.tableExists(tableName)) {
        System.out.println("[FATAL] The usertable: " + tableName
            + " already exists");
      }
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        continue;
      }
    } catch (IOException e) {
      e.printStackTrace();
      continue;
    }
  }
} catch (Exception e) {
  // Assumed handler: the original snippet ends before the outer try is closed.
  e.printStackTrace();
}
Test steps:
1. Run the test case.
2. Kill two of the three ZK servers.
3. Start the killed servers again.



> HBase client is blocked forever
> ---
>
> Key: HBASE-5060
> URL: https://issues.apache.org/jira/browse/HBASE-5060
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Critical
> Fix For: 0.92.1, 0.90.6
>
> Attachments: HBASE-5060_Branch90trial.patch, HBASE-5060_trunk.patch
>
>

[jira] [Updated] (HBASE-5060) HBase client is blocked forever

2011-12-18 Thread gaojinchao (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-5060:
--

Fix Version/s: 0.92.1
   Status: Patch Available  (was: Open)

> HBase client is blocked forever
> ---
>
> Key: HBASE-5060
> URL: https://issues.apache.org/jira/browse/HBASE-5060
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Critical
> Fix For: 0.92.1, 0.90.6
>
> Attachments: HBASE-5060_Branch90trial.patch, HBASE-5060_trunk.patch
>
>

[jira] [Updated] (HBASE-5058) Allow HBaseAmin to use an existing connection

2011-12-18 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5058:
-

Attachment: 5058-v3.txt

Here's an attempt I can live with.
Notice how (1) the master null check is pulled into the synchronized block, 
(2) master is now set to null before the start of the loop, and (3) 
masterChecked is set to managed.

The effect is that the current behavior is not changed, i.e. for a managed 
connection we try only once. Unmanaged connections get a chance to retry on 
subsequent calls, and since master is set to null, this works even when 
the master has moved since the first attempt.
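
One way to read the three points above is the following sketch. It is an interpretation of the comment, not the contents of 5058-v3.txt: `MasterLookupSketch`, `MasterLocator`, and `maxTries` are hypothetical names, the master handle is a plain `Object`, and `IllegalStateException` stands in for MasterNotRunningException.

```java
// Sketch only: a reading of the comment above, not the attached patch.
final class MasterLookupSketch {
  /** Hypothetical stand-in for whatever locates the active master. */
  interface MasterLocator {
    Object locate(); // returns null when no master can be reached
  }

  private final boolean managed;
  private final MasterLocator locator;
  private final int maxTries;
  private final Object masterLock = new Object();
  private Object master;         // cached master handle, may be stale
  private boolean masterChecked; // true: never look the master up again

  MasterLookupSketch(boolean managed, MasterLocator locator, int maxTries) {
    this.managed = managed;
    this.locator = locator;
    this.maxTries = maxTries;
  }

  Object getMaster() {
    synchronized (masterLock) {
      if (!masterChecked) {
        // (2) Forget any cached master so that a moved master is found again.
        master = null;
        for (int tries = 0; tries < maxTries && master == null; tries++) {
          master = locator.locate();
        }
        // (3) Managed connections look up only once; unmanaged ones may retry.
        masterChecked = managed;
      }
      // (1) The null check happens while the lock is still held.
      if (master == null) {
        // IllegalStateException stands in for MasterNotRunningException.
        throw new IllegalStateException("master not running");
      }
      return master;
    }
  }
}
```

In this reading a managed connection that failed once keeps failing without contacting the locator again, while an unmanaged connection performs a fresh lookup on every call and therefore picks up a master that has moved.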

> Allow HBaseAmin to use an existing connection
> -
>
> Key: HBASE-5058
> URL: https://issues.apache.org/jira/browse/HBASE-5058
> Project: HBase
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 0.94.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: 5058-v2.txt, 5058-v3.txt, 5058.txt
>
>
> What HBASE-4805 does for HTables, this should do for HBaseAdmin.
> Along with this the shared error handling and retrying between HBaseAdmin and 
> HConnectionManager can also be improved. I'll attach a first pass patch soon.





[jira] [Commented] (HBASE-5060) HBase client is blocked forever

2011-12-18 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172074#comment-13172074
 ] 

Zhihong Yu commented on HBASE-5060:
---

+1 if tests pass.

> HBase client is blocked forever
> ---
>
> Key: HBASE-5060
> URL: https://issues.apache.org/jira/browse/HBASE-5060
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Critical
> Fix For: 0.92.0, 0.90.6
>
> Attachments: HBASE-5060_Branch90trial.patch, HBASE-5060_trunk.patch
>
>

[jira] [Updated] (HBASE-5060) HBase client is blocked forever

2011-12-18 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5060:
--

Fix Version/s: (was: 0.92.1)
   0.92.0

Since this is critical, we should include this in 0.92.0

> HBase client is blocked forever
> ---
>
> Key: HBASE-5060
> URL: https://issues.apache.org/jira/browse/HBASE-5060
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Critical
> Fix For: 0.92.0, 0.90.6
>
> Attachments: HBASE-5060_Branch90trial.patch, HBASE-5060_trunk.patch
>
>
> After the client recovered from a temporary network failure, I found that my 
> client thread was blocked. 
> The stack and logs below show that we use an invalid CatalogTracker in the 
> function "tableExists".
> Blocked stack:
> "WriteHbaseThread33" prio=10 tid=0x7f76bc27a800 nid=0x2540 in 
> Object.wait() [0x7f76af4f3000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>  at java.lang.Object.wait(Native Method)
>  at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:331)
>  - locked <0x7f7a67817c98> (a 
> java.util.concurrent.atomic.AtomicBoolean)
>  at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:366)
>  at 
> org.apache.hadoop.hbase.catalog.MetaReader.tableExists(MetaReader.java:427)
>  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:164)
>  at com.huawei.hdi.hbase.HbaseFileOperate.checkHtableState(Unknown 
> Source)
>  at com.huawei.hdi.hbase.HbaseReOper.reCreateHtable(Unknown Source)
>  - locked <0x7f7a4c5dc578> (a com.huawei.hdi.hbase.HbaseReOper)
>  at com.huawei.hdi.hbase.HbaseFileOperate.writeToHbase(Unknown Source)
>  at com.huawei.hdi.hbase.WriteHbaseThread.run(Unknown Source)
> In ZooKeeperNodeTracker, we don't propagate the KeeperException to the caller.
> So at the CatalogTracker level, we assume ZooKeeperNodeTracker started successfully and
> continue processing.
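
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the HBase code: NodeTracker, startSwallowing, startPropagating and the two caller helpers are hypothetical names chosen for illustration, and a plain Exception stands in for KeeperException. It only shows why a caller cannot tell that start-up failed when the exception is logged and not rethrown.

```java
public class SwallowedStartFailure {

    /** Hypothetical stand-in for a ZooKeeper-backed tracker; not the HBase class. */
    static class NodeTracker {
        private final boolean connectionLost;
        private boolean started;

        NodeTracker(boolean connectionLost) {
            this.connectionLost = connectionLost;
        }

        // Simulates the initial znode read, which fails while the connection is lost.
        private void readNode() throws Exception {
            if (connectionLost) {
                throw new Exception("ConnectionLoss for /hbase/root-region-server");
            }
            started = true;
        }

        // Pattern described in the report: the failure is logged but not rethrown.
        void startSwallowing() {
            try {
                readNode();
            } catch (Exception e) {
                System.err.println("WARN " + e.getMessage());
            }
        }

        // Alternative: let the failure reach the caller.
        void startPropagating() throws Exception {
            readNode();
        }

        boolean isStarted() {
            return started;
        }
    }

    // The caller has no signal of failure, so it always assumes success.
    static boolean callerSeesSuccessWhenSwallowed(NodeTracker t) {
        t.startSwallowing();
        return true;
    }

    // The caller can detect the failure and avoid using a tracker that never started.
    static boolean callerSeesSuccessWhenPropagated(NodeTracker t) {
        try {
            t.startPropagating();
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        NodeTracker a = new NodeTracker(true);
        System.out.println("swallowed:  caller sees success = "
                + callerSeesSuccessWhenSwallowed(a) + ", started = " + a.isStarted());

        NodeTracker b = new NodeTracker(true);
        System.out.println("propagated: caller sees success = "
                + callerSeesSuccessWhenPropagated(b) + ", started = " + b.isStarted());
    }
}
```

In the swallowing variant the caller goes on to use a tracker that never started, which is consistent with the thread in the stack above waiting in waitForMeta. Whether the attached patches take the propagating approach is not stated here.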
> [WriteHbaseThread33]2011-12-16 17:07:33,153[WARN ]  | 
> hconnection-0x334129cf6890051-0x334129cf6890051-0x334129cf6890051 Unable to 
> get data of znode /hbase/root-region-server | 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:557)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/root-region-server
>  at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>  at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>  at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:931)
>  at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>  at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:73)
>  at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:136)
>  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:111)
>  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:162)
>  at com.huawei.hdi.hbase.HbaseFileOperate.checkHtableState(Unknown 
> Source)
>  at com.huawei.hdi.hbase.HbaseReOper.reCreateHtable(Unknown Source)
>  at com.huawei.hdi.hbase.HbaseFileOperate.writeToHbase(Unknown Source)
>  at com.huawei.hdi.hbase.WriteHbaseThread.run(Unknown Source)
> [WriteHbaseThread33]2011-12-16 17:07:33,361[ERROR]  | 
> hconnection-0x334129cf6890051-0x334129cf6890051-0x334129cf6890051 Received 
> unexpected KeeperException, re-throwing exception | 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.keeperException(ZooKeeperWatcher.java:385)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/root-region-server
>  at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>  at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>  at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:931)
>  at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>  at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:73)
>  at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:136)
>  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:111)
>  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:162)
>  at com.huawei.hdi.hbase.HbaseFil