date:20121226


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539676#comment-13539676
 ] 

Lars Hofhansl commented on HBASE-7438:
--

This test has quite a few subtle changes in 0.96.

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-7224) Remove references to Writable in the ipc package


 [ 
https://issues.apache.org/jira/browse/HBASE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7224:
-

Attachment: 7224v4.txt

Retrying.  Failing tests pass locally.

 Remove references to Writable in the ipc package
 

 Key: HBASE-7224
 URL: https://issues.apache.org/jira/browse/HBASE-7224
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC, Protobufs
Reporter: Devaraj Das
Assignee: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 7224.txt, 7224v2.txt, 7224v3.txt, 7224v4.txt, 
 7224v4.txt, purge_more_writables.txt


 I see references to Writable in the ipc package, most notably in the 
 Invocation class. This class is not being used that much in the core ipc 
 package but used in the coprocessor protocol implementations (there are some 
 coprocessor protocols that are Writable based still). This jira is to track 
 removing those references and the Invocation class (once HBASE-6895 is 
 resolved).


[jira] [Commented] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size


[ 
https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539679#comment-13539679
 ] 

Jimmy Xiang commented on HBASE-7412:


Making HTableDescriptor take a Conf in its constructor would make the interface 
cleaner.  However, it involves other changes.  We still need to change the code 
that calls #getMaxFileSize and #getMemStoreFlushSize, since the existing code 
wrongly assumes that a default value cannot have come from the table-specific 
setting.  Let me think about how to fix this.

I will fix the javadoc errors.

Yes, the unboxing code is not good.  It is existing code; I will fix it.

 Fix how HTableDescriptor handles default max file size and flush size
 -

 Key: HBASE-7412
 URL: https://issues.apache.org/jira/browse/HBASE-7412
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: 0.94-7412.patch, trunk-7412.patch


 If the region flush size is not set in the table, 
 IncreasingToUpperBoundRegionSplitPolicy will most likely always use the 
 default value: 128MB, even if the flush size is set to a different value in 
 hbase-site.xml.
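
A minimal sketch of the lookup order this report implies, not the actual patch: a table-level flush size wins if it was set, then the site configuration, and only then the built-in 128MB default. The class FlushSizeResolver, the method resolve, and the plain Map standing in for a Hadoop Configuration are assumptions made for illustration; hbase.hregion.memstore.flush.size is the site property that carries the flush size.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not an HBase class: resolves the effective memstore flush size.
final class FlushSizeResolver {
  // Built-in fallback mentioned in the issue description (128MB).
  static final long DEFAULT_FLUSH_SIZE = 128L * 1024 * 1024;
  // Site-level property name for the flush size.
  static final String FLUSH_SIZE_KEY = "hbase.hregion.memstore.flush.size";

  /**
   * @param tableFlushSize value set on the table, or null if the table does not set one
   * @param siteConf stand-in for the site configuration (assumption: a plain Map)
   */
  static long resolve(Long tableFlushSize, Map<String, String> siteConf) {
    if (tableFlushSize != null) {
      return tableFlushSize;            // explicit table setting always wins
    }
    String fromSite = siteConf.get(FLUSH_SIZE_KEY);
    if (fromSite != null) {
      return Long.parseLong(fromSite);  // otherwise honor the site configuration
    }
    return DEFAULT_FLUSH_SIZE;          // last resort: hard-coded default
  }

  public static void main(String[] args) {
    Map<String, String> siteConf = new HashMap<>();
    siteConf.put(FLUSH_SIZE_KEY, String.valueOf(256L * 1024 * 1024));
    System.out.println("table unset: " + resolve(null, siteConf));
    System.out.println("table set:   " + resolve(64L * 1024 * 1024, siteConf));
  }
}
```

The point of the sketch is that "not set on the table" is represented explicitly (null here) instead of being inferred from the value being equal to the default.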


[jira] [Updated] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


 [ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7438:
-

Status: Patch Available  (was: Open)

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Updated] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


 [ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7438:
-

Attachment: 7438-0.96.txt

0.96 patch

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Commented] (HBASE-7423) HFileArchiver should not use the configuration from the Filesystem


[ 
https://issues.apache.org/jira/browse/HBASE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539697#comment-13539697
 ] 

stack commented on HBASE-7423:
--

+1 on commit

 HFileArchiver should not use the configuration from the Filesystem
 --

 Key: HBASE-7423
 URL: https://issues.apache.org/jira/browse/HBASE-7423
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0, 0.94.4
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-7423_v1.patch, hbase-7423_v2.patch, 
 hbase-7423_v2.patch


 HFileArchiver gets the configuration from the FileSystem in 
 {code}
   public static void archiveRegion(FileSystem fs, HRegionInfo info)
       throws IOException {
     Path rootDir = FSUtils.getRootDir(fs.getConf());
 {code}
 Pig's test cases construct a MiniDFSCluster and pass it to HBaseTestingUtil, 
 which causes table deletion to fail because the archiver refers to the 
 FileSystem's configuration rather than HBase's. 
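
To illustrate the failure mode, here is a self-contained sketch that uses stand-in types rather than real HBase or Hadoop classes. StubFileSystem, rootDirFromFs, rootDirFromConf, and the path /hbase-test are all hypothetical; plain Maps play the role of configurations. It only shows why a root dir looked up through the file system's own configuration can differ from the one HBase was configured with.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not HBase code.
final class RootDirLookup {
  static final String ROOT_DIR_KEY = "hbase.rootdir";

  /** Stand-in for a file system object that carries the configuration it was created with. */
  static final class StubFileSystem {
    private final Map<String, String> conf;
    StubFileSystem(Map<String, String> conf) { this.conf = conf; }
    Map<String, String> getConf() { return conf; }
  }

  /** Problematic pattern: trusts whatever configuration the file system was built with. */
  static String rootDirFromFs(StubFileSystem fs) {
    return fs.getConf().get(ROOT_DIR_KEY);
  }

  /** Alternative: the caller passes the HBase configuration explicitly. */
  static String rootDirFromConf(Map<String, String> hbaseConf) {
    return hbaseConf.get(ROOT_DIR_KEY);
  }

  public static void main(String[] args) {
    // Configuration created by the DFS mini cluster: it knows nothing about HBase.
    Map<String, String> dfsConf = new HashMap<>();
    // HBase's own configuration adds the root dir on top of it.
    Map<String, String> hbaseConf = new HashMap<>(dfsConf);
    hbaseConf.put(ROOT_DIR_KEY, "/hbase-test");

    StubFileSystem fs = new StubFileSystem(dfsConf);
    System.out.println("from fs conf:    " + rootDirFromFs(fs));
    System.out.println("from hbase conf: " + rootDirFromConf(hbaseConf));
  }
}
```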


[jira] [Commented] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size


[ 
https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539700#comment-13539700
 ] 

Jimmy Xiang commented on HBASE-7412:


HTableDescriptor is used on both the client side and the server side, and the 
two could have different configurations.  Does that mean it should not depend 
on a configuration object?  Otherwise, we could get inconsistent values.

 Fix how HTableDescriptor handles default max file size and flush size
 -

 Key: HBASE-7412
 URL: https://issues.apache.org/jira/browse/HBASE-7412
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: 0.94-7412.patch, trunk-7412.patch


 If the region flush size is not set in the table, 
 IncreasingToUpperBoundRegionSplitPolicy will most likely always use the 
 default value: 128MB, even if the flush size is set to a different value in 
 hbase-site.xml.


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops

2012-12-26 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539703#comment-13539703
]

Hadoop QA commented on HBASE-7438:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12562406/7438-0.96.txt
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 3 new
or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2
warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestHCM

{color:red}-1 core zombie tests{color}. There are 2 zombie test(s):

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/3700//console

This message is automatically generated.

TestSplitTransactionOnCluster has too many infinite loops
-

Key: HBASE-7438
URL: https://issues.apache.org/jira/browse/HBASE-7438
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.96.0, 0.94.4

Attachments: 7438-0.94.txt, 7438-0.96.txt

There are many cases in this test where we loop until a condition happens.
If that condition never occurs we'll wait forever, and the test will time out
instead of failing.


[jira] [Commented] (HBASE-7224) Remove references to Writable in the ipc package

2012-12-26 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539701#comment-13539701
]

Hadoop QA commented on HBASE-7224:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12562405/7224v4.txt
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 16 new
or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2
warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.TestZooKeeper
org.apache.hadoop.hbase.client.TestHCM

{color:red}-1 core zombie tests{color}. There are 4 zombie test(s):

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/3701//console

This message is automatically generated.

Remove references to Writable in the ipc package

Key: HBASE-7224
URL: https://issues.apache.org/jira/browse/HBASE-7224
Project: HBase
Issue Type: Sub-task
Components: IPC/RPC, Protobufs
Reporter: Devaraj Das
Assignee: stack
Priority: Blocker
Fix For: 0.96.0

Attachments: 7224.txt, 7224v2.txt, 7224v3.txt, 7224v4.txt,
7224v4.txt, purge_more_writables.txt

I see references to Writable in the ipc package, most notably in the
Invocation class. This class is not being used that much in the core ipc
package but used in the coprocessor protocol implementations (there are some
coprocessor protocols that are Writable based still). This jira is to track
removing those references and the Invocation class (once HBASE-6895 is
resolved).

[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539707#comment-13539707
 ] 

Lars Hofhansl commented on HBASE-7438:
--

TestSplitTransactionOnCluster ran successfully in this run.
Waiting for another +1, since this is a more contentious change (the new 
timeouts could themselves cause the test to fail).


 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Updated] (HBASE-7224) Remove references to Writable in the ipc package


 [ 
https://issues.apache.org/jira/browse/HBASE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7224:
-

Attachment: 7224v4.txt

Retry again.  Different failures this last time.

 Remove references to Writable in the ipc package
 

 Key: HBASE-7224
 URL: https://issues.apache.org/jira/browse/HBASE-7224
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC, Protobufs
Reporter: Devaraj Das
Assignee: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 7224.txt, 7224v2.txt, 7224v3.txt, 7224v4.txt, 
 7224v4.txt, 7224v4.txt, purge_more_writables.txt


 I see references to Writable in the ipc package, most notably in the 
 Invocation class. This class is not being used that much in the core ipc 
 package but used in the coprocessor protocol implementations (there are some 
 coprocessor protocols that are Writable based still). This jira is to track 
 removing those references and the Invocation class (once HBASE-6895 is 
 resolved).


[jira] [Resolved] (HBASE-7427) Check line lenghts in the test-patch script


 [ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar resolved HBASE-7427.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Release Note: 
Committed this. Let's see whether it is useful. Feel free to revert 
otherwise.
Thanks, Stack, for the review. 

 Check line lenghts in the test-patch script
 ---

 Key: HBASE-7427
 URL: https://issues.apache.org/jira/browse/HBASE-7427
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: hbase-7427_v1.patch


 Checkstyle is disabled in test-patch, and it is not very easy to make it 
 work. We can just add some check for the line lengths in the meantime. 


[jira] [Commented] (HBASE-7427) Check line lenghts in the test-patch script


[ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539713#comment-13539713
 ] 

Enis Soztutar commented on HBASE-7427:
--

Committed this. Let's see whether it is useful. Feel free to revert 
otherwise.
Thanks, Stack, for the review. 

 Check line lenghts in the test-patch script
 ---

 Key: HBASE-7427
 URL: https://issues.apache.org/jira/browse/HBASE-7427
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: hbase-7427_v1.patch


 Checkstyle is disabled in test-patch, and it is not very easy to make it 
 work. We can just add some check for the line lengths in the meantime. 


[jira] [Updated] (HBASE-7427) Check line lenghts in the test-patch script


 [ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-7427:
-

Release Note:   (was: Committed this. Let's see whether it is useful. 
Feel free to revert otherwise.
Thanks, Stack, for the review. )

 Check line lenghts in the test-patch script
 ---

 Key: HBASE-7427
 URL: https://issues.apache.org/jira/browse/HBASE-7427
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: hbase-7427_v1.patch


 Checkstyle is disabled in test-patch, and it is not very easy to make it 
 work. We can just add some check for the line lengths in the meantime. 


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539714#comment-13539714
 ] 

Himanshu Vashishtha commented on HBASE-7438:


Hey [~lhofhansl], I wonder what the root cause is of the conditions for which 
this patch imposes a second-level timeout. 

Re: "If that condition never occurs we'll wait forever":
Is it wrong assumptions, or is some bug lurking somewhere that causes this to happen?

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539716#comment-13539716
 ] 

Lars Hofhansl commented on HBASE-7438:
--

That's what I want to find out. :)
There are various places where the code will wait forever for
* a region server to shut down
* a znode to be created
* daughter regions to come online
* regions to go in or out of transition
* tables to come online
* splits or split rollbacks to happen
* etc

These timeouts are not to make the test faster, but to guard against it never 
terminating.
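
A minimal sketch of the kind of guard described here, not the code from the attached patches: each open-ended wait becomes a poll with a deadline, and the caller learns whether the condition was met. BoundedWait and waitFor are hypothetical names, and the lambda syntax assumes a newer Java than the one these branches build with.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not taken from the HBase test code.
final class BoundedWait {
  /**
   * Polls the condition every intervalMs until it holds or timeoutMs has elapsed.
   * Returns true if the condition was met, false on timeout or interruption.
   */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;                        // give up instead of looping forever
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();  // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    // Condition that becomes true after roughly 50 ms.
    boolean met = waitFor(1000, 10, () -> System.currentTimeMillis() - start >= 50);
    System.out.println("condition met: " + met);
    // Condition that never becomes true: the wait ends at the deadline.
    boolean never = waitFor(100, 10, () -> false);
    System.out.println("never-true condition met: " + never);
  }
}
```

In a test, the return value would feed an assertion with a message that names what was being waited for (a znode, a daughter region, a split), so that a failure points at the specific wait that timed out.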


 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Commented] (HBASE-7321) Simple Flush Snapshot


[ 
https://issues.apache.org/jira/browse/HBASE-7321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539721#comment-13539721
 ] 

Jonathan Hsieh commented on HBASE-7321:
---

Some of the recent underlying changes have made the tests in this patch 
flaky...

 Simple Flush Snapshot
 -

 Key: HBASE-7321
 URL: https://issues.apache.org/jira/browse/HBASE-7321
 Project: HBase
  Issue Type: Sub-task
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-7321.v2.patch, pre-hbase-7321.v2.patch


 This snapshot style just issues a region flush and then snapshots the 
 region.  
 This is a simple implementation that gives the equivalent of copytable 
 consistency.  By most definitions of consistency, if a client writes A and 
 then writes B to different region servers, a snapshot should contain neither 
 write, only A, or both A and B; this implementation also allows the only-B case.


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539720#comment-13539720
 ] 

stack commented on HBASE-7438:
--

+1 on commit.  It just changes the test to have bounds within which it must 
complete.   If the test continues to fail, it needs deeper surgery... Much 
better to fail than hang.

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539724#comment-13539724
 ] 

Lars Hofhansl commented on HBASE-7438:
--

And yes, there's possibly a bug lurking somewhere :(
This should help pinpoint it. If we time out waiting for something, at 
least we'll know where.

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Updated] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


 [ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7438:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96. Thanks for the comments and reviews. Will continue 
to watch this.

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Created] (HBASE-7439) HFileLink should not use the configuration from the Filesystem

Matteo Bertozzi created HBASE-7439:
--

 Summary: HFileLink should not use the configuration from the 
Filesystem
 Key: HBASE-7439
 URL: https://issues.apache.org/jira/browse/HBASE-7439
 Project: HBase
  Issue Type: Sub-task
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker


This is related to HBASE-7423 and HBASE-7422.
Since fs.getConf() may return an unexpected configuration, we should avoid 
using it to get the root dir.


[jira] [Updated] (HBASE-7439) HFileLink should not use the configuration from the Filesystem


 [ 
https://issues.apache.org/jira/browse/HBASE-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7439:
---

Attachment: HBASE-7439-v0.patch

 HFileLink should not use the configuration from the Filesystem
 --

 Key: HBASE-7439
 URL: https://issues.apache.org/jira/browse/HBASE-7439
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7439-v0.patch


 This is related to HBASE-7423 and HBASE-7422.
 Since fs.getConf() may return an unexpected configuration, we should avoid 
 using it to get the root dir.


[jira] [Updated] (HBASE-7439) HFileLink should not use the configuration from the Filesystem


 [ 
https://issues.apache.org/jira/browse/HBASE-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7439:
---

Status: Patch Available  (was: Open)

 HFileLink should not use the configuration from the Filesystem
 --

 Key: HBASE-7439
 URL: https://issues.apache.org/jira/browse/HBASE-7439
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7439-v0.patch


 This is related to HBASE-7423 and HBASE-7422.
 Since fs.getConf() may return an unexpected configuration, we should avoid 
 using it to get the root dir.


[jira] [Commented] (HBASE-7439) HFileLink should not use the configuration from the Filesystem


[ 
https://issues.apache.org/jira/browse/HBASE-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539733#comment-13539733
 ] 

Ted Yu commented on HBASE-7439:
---

Why is the HDFS block distribution method removed?

 HFileLink should not use the configuration from the Filesystem
 --

 Key: HBASE-7439
 URL: https://issues.apache.org/jira/browse/HBASE-7439
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7439-v0.patch


 This is related to HBASE-7423 and HBASE-7422.
 Since fs.getConf() may return an unexpected configuration, we should avoid 
 using it to get the root dir.


[jira] [Commented] (HBASE-7439) HFileLink should not use the configuration from the Filesystem


[ 
https://issues.apache.org/jira/browse/HBASE-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539734#comment-13539734
 ] 

Matteo Bertozzi commented on HBASE-7439:


It is internal and not used... I can add a conf argument to it if you want.
HRegion uses StoreFile.getHDFSBlockDistribution().

 HFileLink should not use the configuration from the Filesystem
 --

 Key: HBASE-7439
 URL: https://issues.apache.org/jira/browse/HBASE-7439
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7439-v0.patch


 This is related to HBASE-7423 and HBASE-7422.
 Since fs.getConf() may not return the expected configuration, we should avoid 
 using it to get the root dir.


[jira] [Commented] (HBASE-7427) Check line lenghts in the test-patch script


[ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539737#comment-13539737
 ] 

Hudson commented on HBASE-7427:
---

Integrated in HBase-TRUNK #3657 (See 
[https://builds.apache.org/job/HBase-TRUNK/3657/])
HBASE-7427. Check line lenghts in the test-patch script (Revision 1426040)

 Result = FAILURE
enis : 
Files : 
* /hbase/trunk/dev-support/test-patch.properties
* /hbase/trunk/dev-support/test-patch.sh


 Check line lenghts in the test-patch script
 ---

 Key: HBASE-7427
 URL: https://issues.apache.org/jira/browse/HBASE-7427
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: hbase-7427_v1.patch


 Checkstyle is disabled in test-patch, and it is not very easy to make it 
 work. We can just add a check for the line lengths in the meantime. 


[jira] [Updated] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size


 [ 
https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7412:
---

Status: Open  (was: Patch Available)

 Fix how HTableDescriptor handles default max file size and flush size
 -

 Key: HBASE-7412
 URL: https://issues.apache.org/jira/browse/HBASE-7412
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: 0.94-7412.patch, trunk-7412.patch


 If the region flush size is not set in the table, 
 IncreasingToUpperBoundRegionSplitPolicy will most likely always use the 
 default value: 128MB, even if the flush size is set to a different value in 
 hbase-site.xml.


[jira] [Updated] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size


 [ 
https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7412:
---

Attachment: trunk-7412_v2.patch

 Fix how HTableDescriptor handles default max file size and flush size
 -

 Key: HBASE-7412
 URL: https://issues.apache.org/jira/browse/HBASE-7412
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: 0.94-7412.patch, trunk-7412.patch, trunk-7412_v2.patch


 If the region flush size is not set in the table, 
 IncreasingToUpperBoundRegionSplitPolicy will most likely always use the 
 default value: 128MB, even if the flush size is set to a different value in 
 hbase-site.xml.


[jira] [Updated] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size

[
https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jimmy Xiang updated HBASE-7412:
---

Status: Patch Available (was: Open)

Attached patch v2. With this patch, if a value (max file size or memstore
flush size) is not set, it returns -1 instead of the default value, so the
client can decide what to do. If we returned the default value, it would be
hard to tell whether the value was really set to the default or not set at all.

If the value is set to the default value on the table but configured to a
different value in hbase-site.xml, the existing code uses the value from
hbase-site.xml. With the patch, the value set for the table is used (in this
case, the default value).
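
The lookup order described above can be sketched roughly as follows. This is a
minimal illustration, not the code from the patch: the class name, the method
names, and the use of a plain map in place of the table descriptor are all
assumptions made for the example, and the 128MB constant is simply the default
quoted in the issue description.

```java
import java.util.Collections;
import java.util.Map;

final class FlushSizeLookup {
  // Sentinel meaning "no value was set on the table" (assumed name).
  static final long UNSET = -1L;
  // Built-in default quoted in the issue description: 128MB.
  static final long BUILT_IN_DEFAULT = 128L * 1024 * 1024;

  /** Hypothetical stand-in for the table descriptor: returns UNSET when absent. */
  static long tableFlushSize(Map<String, String> tableValues) {
    String v = tableValues.get("MEMSTORE_FLUSHSIZE");
    return v == null ? UNSET : Long.parseLong(v);
  }

  /** Table value wins if set; otherwise the site value; otherwise the built-in default. */
  static long effectiveFlushSize(Map<String, String> tableValues, Long siteValue) {
    long fromTable = tableFlushSize(tableValues);
    if (fromTable != UNSET) {
      return fromTable;
    }
    return siteValue != null ? siteValue : BUILT_IN_DEFAULT;
  }

  public static void main(String[] args) {
    Map<String, String> unset = Collections.emptyMap();
    Map<String, String> explicitDefault =
        Collections.singletonMap("MEMSTORE_FLUSHSIZE", Long.toString(BUILT_IN_DEFAULT));
    long site = 256L * 1024 * 1024;
    // Not set on the table: the site configuration is used.
    System.out.println(effectiveFlushSize(unset, site));
    // Explicitly set to the default on the table: the table value is used.
    System.out.println(effectiveFlushSize(explicitDefault, site));
  }
}
```

The point of the sentinel is that "explicitly set to 128MB" and "never set" are
distinguishable, which is not possible when the getter itself substitutes the
default.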

Fix how HTableDescriptor handles default max file size and flush size
-

Key: HBASE-7412
URL: https://issues.apache.org/jira/browse/HBASE-7412
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
Fix For: 0.96.0, 0.94.4

Attachments: 0.94-7412.patch, trunk-7412.patch, trunk-7412_v2.patch

If the region flush size is not set in the table,
IncreasingToUpperBoundRegionSplitPolicy will most likely always use the
default value: 128MB, even if the flush size is set to a different value in
hbase-site.xml.

[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539740#comment-13539740
 ] 

Hudson commented on HBASE-7438:
---

Integrated in HBase-0.94 #671 (See 
[https://builds.apache.org/job/HBase-0.94/671/])
HBASE-7438 TestSplitTransactionOnCluster has too many infinite loops 
(Revision 1426067)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java


 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.
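
 A bounded wait of the kind this issue asks for could look roughly like the
 sketch below. It is an illustration only, not the attached patch: the class
 and method names are made up for the example, and it uses Java 8 lambda
 syntax for brevity.

```java
import java.util.function.BooleanSupplier;

final class BoundedWait {
  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition became true, false on timeout or interrupt.
   */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up instead of looping on.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    // A condition that becomes true after roughly 50 ms.
    boolean met = waitFor(2000, 10, () -> System.currentTimeMillis() - start >= 50);
    // A condition that never becomes true: the wait returns instead of hanging.
    boolean never = waitFor(100, 10, () -> false);
    System.out.println(met + " " + never);
  }
}
```

 In a test, the caller would assert on the return value, so a condition that
 never occurs shows up as a failure with a message rather than as a timeout
 of the whole test run.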


[jira] [Commented] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size


[ 
https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539741#comment-13539741
 ] 

Enis Soztutar commented on HBASE-7412:
--

bq. That's right. This patch is a valid bug fix. Also, HBASE-7236 is not for 
0.94.
I see. Makes sense to fix separately. 



 Fix how HTableDescriptor handles default max file size and flush size
 -

 Key: HBASE-7412
 URL: https://issues.apache.org/jira/browse/HBASE-7412
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: 0.94-7412.patch, trunk-7412.patch, trunk-7412_v2.patch


 If the region flush size is not set in the table, 
 IncreasingToUpperBoundRegionSplitPolicy will most likely always use the 
 default value: 128MB, even if the flush size is set to a different value in 
 hbase-site.xml.


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539758#comment-13539758
 ] 

Hudson commented on HBASE-7438:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #314 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/314/])
HBASE-7438 TestSplitTransactionOnCluster has too many infinite loops 
(Revision 1426066)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java


 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Commented] (HBASE-7427) Check line lenghts in the test-patch script


[ 
https://issues.apache.org/jira/browse/HBASE-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539759#comment-13539759
 ] 

Hudson commented on HBASE-7427:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #314 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/314/])
HBASE-7427. Check line lenghts in the test-patch script (Revision 1426040)

 Result = FAILURE
enis : 
Files : 
* /hbase/trunk/dev-support/test-patch.properties
* /hbase/trunk/dev-support/test-patch.sh


 Check line lenghts in the test-patch script
 ---

 Key: HBASE-7427
 URL: https://issues.apache.org/jira/browse/HBASE-7427
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: hbase-7427_v1.patch


 Checkstyle is disabled in test-patch, and it is not very easy to make it 
 work. We can just add a check for the line lengths in the meantime. 


[jira] [Created] (HBASE-7440) ReplicationZookeeper#addPeer is racy

Himanshu Vashishtha created HBASE-7440:
--

 Summary: ReplicationZookeeper#addPeer is racy
 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.4


While adding a peer, ReplicationZK creates the znodes in three separate 
transactions. It creates:
a) the peers znode,
b) the peerId-specific znode, and
c) the peerState znode.

There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If, 
while a peer is being added, control flows to getPeer() before step c) has 
been processed, the peer may end up not being added. This happens while 
running TestMasterReplication#testCyclicReplication().
{code}
2012-12-26 07:36:35,187 INFO  [RegionServer:0;p0120.X,38423,1356536179470-EventThread] zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state already exists and this is not a retry
2012-12-26 07:36:35,188 ERROR [RegionServer:0;p0120.X,38423,1356536179470-EventThread] regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a new peer
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /2/replication/peers/1/peer-state
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
    at org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
    at org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
    at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2012-12-26 07:36:35,188 DEBUG [RegionServer:0;p0120.X,55742,1356536171947-EventThread] zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...

{code}



[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539766#comment-13539766
 ] 

Lars Hofhansl commented on HBASE-7438:
--

Interestingly TestSplitTransactionOnCluster does not appear to have run in this 
trunk test run. Is it hanging yet somewhere else?

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539768#comment-13539768
 ] 

Lars Hofhansl commented on HBASE-7438:
--

In fact I cannot find a single trunk run where this test ran. Even in a 
successful run I do not see this test...?

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Commented] (HBASE-7439) HFileLink should not use the configuration from the Filesystem


[ 
https://issues.apache.org/jira/browse/HBASE-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539770#comment-13539770
 ] 

Himanshu Vashishtha commented on HBASE-7439:


+1

Yes, we should get rid of the method as it's not used; it looks more like a util 
method too.

 HFileLink should not use the configuration from the Filesystem
 --

 Key: HBASE-7439
 URL: https://issues.apache.org/jira/browse/HBASE-7439
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7439-v0.patch


 This is related to HBASE-7423 and HBASE-7422.
 Since fs.getConf() may not return the expected configuration, we should avoid 
 using it to get the root dir.


[jira] [Updated] (HBASE-7352) clone operation from HBaseAdmin can hang forever.


 [ 
https://issues.apache.org/jira/browse/HBASE-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7352:
---

Attachment: HBASE-7352-v1.patch

What about extracting the wait loop from enableTable() and using it for the 
clone as well?

{code}
throw new IOException("Unable to enable table " +
  Bytes.toString(tableName));
{code}

After N retries, the enable table code says "Unable to enable table...", but 
this looks like the wrong message: we don't know whether the enable has failed 
or is just slow.

Should we change the message to "table not yet enabled" and change the 
exception type to RetriesExhaustedException?
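
A shared wait loop along these lines might look like the sketch below. It is
only an illustration of the idea, not the attached patch: TableWait, waitUntil
and the nested RetriesExhausted class are hypothetical names (the nested class
stands in for a retries-exhausted exception type), and the message wording is
an assumption.

```java
import java.io.IOException;
import java.util.function.BooleanSupplier;

final class TableWait {
  /** Hypothetical stand-in for a retries-exhausted exception type. */
  static final class RetriesExhausted extends IOException {
    RetriesExhausted(String message) {
      super(message);
    }
  }

  /**
   * Checks the condition up to maxRetries times, pausing between checks.
   * Returns the number of the check that succeeded; throws if none did.
   */
  static int waitUntil(String what, int maxRetries, long pauseMs, BooleanSupplier done)
      throws IOException {
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
      if (done.getAsBoolean()) {
        return attempt;
      }
      try {
        Thread.sleep(pauseMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("Interrupted while waiting: " + what, e);
      }
    }
    // We only know the operation has not finished yet, not that it failed.
    throw new RetriesExhausted(what + " not yet complete after " + maxRetries + " checks");
  }

  public static void main(String[] args) throws IOException {
    int[] polls = {0};
    // Both enable and clone could pass their own "is it done" check here.
    int attempts = waitUntil("enable table t1", 10, 5, () -> ++polls[0] >= 3);
    System.out.println("done after " + attempts + " checks");
  }
}
```

The exception message deliberately says "not yet complete" so that a slow
operation is not reported as a failed one.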

 clone operation from HBaseAdmin can hang forever.
 -

 Key: HBASE-7352
 URL: https://issues.apache.org/jira/browse/HBASE-7352
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Reporter: Jonathan Hsieh
Assignee: Matteo Bertozzi
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7352-v0.patch, HBASE-7352-v1.patch


 Sometimes the clone operation from the hbase shell can hang.  The table has 
 been created (it shows up in the web ui), but does not have any entries in 
 META.
 There don't seem to be any clone, snapshot, enable or disable found in the 
 master's jstack.
 Here's a trace from the HBaseAdmin:
 {code}
 main prio=10 tid=0x7f782800d000 nid=0x25c waiting on condition [0x7f782f9bf000]
    java.lang.Thread.State: TIMED_WAITING (sleeping)
  at java.lang.Thread.sleep(Native Method)
  at org.apache.hadoop.hbase.client.HBaseAdmin.cloneSnapshot(HBaseAdmin.java:2413)
  at org.apache.hadoop.hbase.client.HBaseAdmin.cloneSnapshot(HBaseAdmin.java:2393)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:465)
  at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:323)
  at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:69)
  at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:201)
  at org.jruby.ast.CallTwoArgNode.interpret(CallTwoArgNode.java:59)
  at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
 ... (more jruby stack) ... 
 {code}  


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539775#comment-13539775
 ] 

Hudson commented on HBASE-7438:
---

Integrated in HBase-TRUNK #3658 (See 
[https://builds.apache.org/job/HBase-TRUNK/3658/])
HBASE-7438 TestSplitTransactionOnCluster has too many infinite loops 
(Revision 1426066)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java


 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy


 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7440:
---

Attachment: HBASE-7440-v0.patch

0.94.x patch; it adds a guard while creating the peer-state znode.
TestMasterReplication passes.
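
The kind of guard meant here can be illustrated with the sketch below. It is
not the patch itself and does not use the ZooKeeper client API: an in-memory
map stands in for the znode tree, and the class name, the method names and the
"ENABLED" payload are assumptions made for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PeerStateGuard {
  /** In-memory stand-in for a znode tree; not the ZooKeeper client API. */
  private final ConcurrentMap<String, String> znodes = new ConcurrentHashMap<>();

  /** Strict create: fails if the node already exists, as in the trace in this issue. */
  void create(String path, String data) {
    if (znodes.putIfAbsent(path, data) != null) {
      throw new IllegalStateException("NodeExists for " + path);
    }
  }

  /** Guarded create: returns true if this call created the node, false if it was there. */
  boolean createIfAbsent(String path, String data) {
    return znodes.putIfAbsent(path, data) == null;
  }

  String get(String path) {
    return znodes.get(path);
  }

  public static void main(String[] args) {
    PeerStateGuard zk = new PeerStateGuard();
    String path = "/2/replication/peers/1/peer-state";
    // One side of the race creates the peer-state node first...
    zk.create(path, "ENABLED");
    // ...and the other side tolerates the existing node instead of failing.
    boolean created = zk.createIfAbsent(path, "ENABLED");
    System.out.println(created + " " + zk.get(path));
  }
}
```

With the guarded variant, whichever caller arrives second simply observes the
existing node, so the order in which the two code paths run no longer matters.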

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7440-v0.patch


 While adding a peer, ReplicationZK creates the znodes in three separate 
 transactions. It creates:
 a) the peers znode,
 b) the peerId-specific znode, and
 c) the peerState znode.
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If, 
 while a peer is being added, control flows to getPeer() before step c) has 
 been processed, the peer may end up not being added. This happens while 
 running TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  [RegionServer:0;p0120.X,38423,1356536179470-EventThread] zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR [RegionServer:0;p0120.X,38423,1356536179470-EventThread] regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /2/replication/peers/1/peer-state
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
    at org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
    at org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
    at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG [RegionServer:0;p0120.X,55742,1356536171947-EventThread] zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}


[jira] [Commented] (HBASE-7439) HFileLink should not use the configuration from the Filesystem


[ 
https://issues.apache.org/jira/browse/HBASE-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539783#comment-13539783
 ] 

Ted Yu commented on HBASE-7439:
---

+1 on patch.

 HFileLink should not use the configuration from the Filesystem
 --

 Key: HBASE-7439
 URL: https://issues.apache.org/jira/browse/HBASE-7439
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7439-v0.patch


 This is related to HBASE-7423 and HBASE-7422.
 Since fs.getConf() may not return the expected configuration, we should avoid 
 using it to get the root dir.


[jira] [Updated] (HBASE-5776) HTableMultiplexer


 [ 
https://issues.apache.org/jira/browse/HBASE-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-5776:
--

Attachment: 5776-trunk-V7.patch

Patch v7 addresses Ram's comment.

TestHTableMultiplexer passes.

 HTableMultiplexer 
 --

 Key: HBASE-5776
 URL: https://issues.apache.org/jira/browse/HBASE-5776
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: binlijin
 Fix For: 0.96.0

 Attachments: 5776-trunk-V3.patch, 5776-trunk-V4.patch, 
 5776-trunk-V5.patch, 5776-trunk-V6.patch, 5776-trunk-V7.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.1.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.1.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.2.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.2.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.3.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.4.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.5.patch, HBASE-5776-trunk.patch, 
 HBASE-5776-trunk-V2.patch


 There is a known issue in HBase client that single slow/dead region server 
 could slow down the multiput operations across all the region servers. So the 
 HBase client will be as slow as the slowest region server in the cluster. 
  
 To solve this problem, HTableMultiplexer will separate the multiput 
 submitting threads with the flush threads, which means the multiput operation 
 will be a nonblocking operation. 
 The submitting thread will shard all the puts into different queues based on 
 its destination region server and return immediately. The flush threads will 
 flush these puts from each queue to its destination region server. 
 Currently the HTableMultiplexer only supports the put operation.
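
 The queueing scheme described above can be sketched as follows. This is a
 simplified illustration, not the HTableMultiplexer code: puts are plain
 strings, the caller is assumed to already know the destination server name,
 and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

final class PutMultiplexerSketch {
  // One bounded queue per destination region server.
  private final Map<String, LinkedBlockingQueue<String>> queues = new ConcurrentHashMap<>();
  private final int perServerCapacity;

  PutMultiplexerSketch(int perServerCapacity) {
    this.perServerCapacity = perServerCapacity;
  }

  /** Non-blocking submit: returns false instead of waiting when that server's queue is full. */
  boolean put(String server, String put) {
    return queues
        .computeIfAbsent(server, s -> new LinkedBlockingQueue<>(perServerCapacity))
        .offer(put);
  }

  /** What a flush thread would do for one server: drain its queue into a batch. */
  List<String> drain(String server) {
    List<String> batch = new ArrayList<>();
    LinkedBlockingQueue<String> queue = queues.get(server);
    if (queue != null) {
      queue.drainTo(batch);
    }
    return batch;
  }

  public static void main(String[] args) {
    PutMultiplexerSketch mux = new PutMultiplexerSketch(2);
    mux.put("rs1", "row-a");
    mux.put("rs1", "row-b");
    // rs1 is "slow" and its queue is full, so this put is rejected immediately...
    System.out.println(mux.put("rs1", "row-c"));
    // ...while puts destined for another server are unaffected.
    System.out.println(mux.put("rs2", "row-x"));
    System.out.println(mux.drain("rs1"));
  }
}
```

 Because each server has its own queue, a full queue for one slow server only
 rejects puts destined for that server; the submitting thread never blocks.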


[jira] [Commented] (HBASE-7403) Online Merge

2012-12-26 Thread chunhui shen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539789#comment-13539789
 ] 

chunhui shen commented on HBASE-7403:
-

Yes, if any of the conditions is met, we set available to false and cancel the 
merge.

So 'else if' branches are enough...

 Online Merge
 

 Key: HBASE-7403
 URL: https://issues.apache.org/jira/browse/HBASE-7403
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.3
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.5

 Attachments: 7403-trunkv5.patch, hbase-7403-94v1.patch, 
 hbase-7403-trunkv1.patch, hbase-7403-trunkv5.patch, hbase-7403-trunkv5.patch, 
 merge region.pdf


 Features of this online merge:
 1. Online: there is no need to disable the table.
 2. Few changes to the current code; it could be applied to trunk, 0.94, 0.92 
 or 0.90.
 3. Merge requests are easy to issue: no need to input a long region name, 
 the encoded name is enough.
 4. No restrictions during the operation: you don't need to care about events 
 such as server death, balancing, splits, or disabling/enabling the table, or 
 about whether you sent a wrong merge request; these cases are already handled 
 for you.
 5. Only a short offline time for the two merging regions.
 We need merge in the following cases:
 1. A region hole or region overlap that can't be fixed by hbck.
 2. Regions that become empty because of TTL and a poor rowkey design.
 3. Regions that are always empty or very small because of presplitting at 
 table creation.
 4. Too many empty or small regions, which reduce system performance (e.g. 
 mslab).
 The current merge tools only work offline and cannot redo the merge if an 
 exception is thrown in the process, which leaves dirty data behind.
 For an online system, we need an online merge.
 The logic this patch implements for online merge is as follows.
 For example, to merge regionA and regionB into regionC:
 1. Offline the two regions A and B.
 2. Merge the two regions in HDFS (create regionC's directory, move regionA's 
 and regionB's files to regionC's directory, delete regionA's and regionB's 
 directories).
 3. Add the merged regionC to .META.
 4. Assign the merged regionC.
 By design, once the merge work in HDFS has been done, the merge can be redone 
 until it succeeds if it throws an exception, aborts, or the server restarts, 
 but it cannot be rolled back.
 It depends on the following:
 Use ZooKeeper to record the transaction journal state, which makes redo easier
 Use ZooKeeper to send/receive merge requests
 The merge transaction is executed on the master
 Merge requests can be issued through the API or the shell tool
 For details of the merge process, please see the attachment and the patch.
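
 The redo-from-journal idea can be illustrated with the sketch below. It is
 not taken from the patch: the step names mirror the four steps listed above,
 a plain field stands in for the journal state that the description keeps in
 ZooKeeper, and the failure is simulated.

```java
import java.util.ArrayList;
import java.util.List;

final class MergeJournalSketch {
  enum Step { OFFLINE_REGIONS, MERGE_IN_HDFS, ADD_TO_META, ASSIGN_MERGED, DONE }

  /** Stand-in for the journal state; records the next step that still has to run. */
  private Step next = Step.OFFLINE_REGIONS;
  /** Steps that have completed, in order, across all attempts. */
  final List<Step> executed = new ArrayList<>();

  Step nextStep() {
    return next;
  }

  /** Runs the remaining steps; failAt simulates a crash at that step (null for none). */
  void run(Step failAt) {
    while (next != Step.DONE) {
      if (next == failAt) {
        throw new IllegalStateException("simulated failure at " + next);
      }
      executed.add(next);
      // Advance the journal only after the step has completed.
      next = Step.values()[next.ordinal() + 1];
    }
  }

  public static void main(String[] args) {
    MergeJournalSketch merge = new MergeJournalSketch();
    try {
      merge.run(Step.ADD_TO_META);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage() + ", journal at " + merge.nextStep());
    }
    // Redo: resumes at the recorded step instead of starting over or rolling back.
    merge.run(null);
    System.out.println(merge.executed);
  }
}
```

 Since the journal only moves forward, a restart after a failure repeats the
 interrupted step and continues, which matches the "redo until successful, no
 rollback" behaviour described above.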


[jira] [Commented] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops


[ 
https://issues.apache.org/jira/browse/HBASE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539790#comment-13539790
 ] 

Ted Yu commented on HBASE-7438:
---

From https://builds.apache.org/job/HBase-TRUNK/3658/console:
{code}
Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 255.873 sec
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 108.586 sec
{code}
Looks like the results for the two tests interleave.

 TestSplitTransactionOnCluster has too many infinite loops
 -

 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4

 Attachments: 7438-0.94.txt, 7438-0.96.txt


 There are many cases in this test where we loop until a condition happens. 
 If that condition never occurs we'll wait forever, and the test will time out 
 instead of failing.


[jira] [Updated] (HBASE-7258) Hbase needs to create baseZNode recursivly

2012-12-26 Thread Liu Shaohui (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Shaohui updated HBASE-7258:
---

Attachment: HBASE-7258-0.94.patch

HBASE 7258 patch for hbase 0.94

 Hbase needs to create baseZNode recursivly
 --

 Key: HBASE-7258
 URL: https://issues.apache.org/jira/browse/HBASE-7258
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.94.2
Reporter: Liu Shaohui
Priority: Minor
  Labels: zookeeper, zookeeper.znode.parent
 Attachments: HBASE-7258-0.94.patch, HBASE-7258.diff, HBASE-7258.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 In deployment environments, multiple small HBase clusters may share the same 
 ZooKeeper cluster. So, for HBase cluster1, its parent znode is 
 /hbase/cluster1. But in HBase version 0.94.1, HBase uses 
 ZKUtil.createAndFailSilent(this, baseZNode) to create the parent path, and it 
 throws a NoNode exception if the znode /hbase does not exist.
 We want to change it to ZKUtil.createWithParents(this, baseZNode); to support 
 creating baseZNode recursively. 
 The NoNode exception is:
 java.lang.RuntimeException: Failed construction of Master: class 
 org.apache.hadoop.hbase.master.HMaster
 at 
 org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1792)
 at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:146)
 at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:77)
 at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1806)
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /hbase/cluster1
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:420)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:402)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:905)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:166)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:159)
 at org.apache.hadoop.hbase.master.HMaster.init(HMaster.java:282)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at org.apache.hadoop.hbase.master.H
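
The idea behind recursive creation can be shown without a ZooKeeper connection. The sketch below is an assumption-laden illustration, not the ZKUtil implementation: `ZNodePaths.ancestorsAndSelf` is a hypothetical helper that lists the paths a recursive create would have to ensure, parents first.

```java
import java.util.ArrayList;
import java.util.List;

public class ZNodePaths {
    /**
     * Returns every prefix path of the given znode, parents first, e.g.
     * /hbase/cluster1 yields [/hbase, /hbase/cluster1].
     * Hypothetical helper for illustration; not part of ZKUtil.
     */
    public static List<String> ancestorsAndSelf(String znode) {
        if (znode == null || !znode.startsWith("/")) {
            throw new IllegalArgumentException("znode path must start with '/'");
        }
        List<String> paths = new ArrayList<>();
        // Each '/' after the first character ends one ancestor path.
        int idx = znode.indexOf('/', 1);
        while (idx != -1) {
            paths.add(znode.substring(0, idx));
            idx = znode.indexOf('/', idx + 1);
        }
        // The root "/" always exists, so it is never listed.
        if (znode.length() > 1) {
            paths.add(znode);
        }
        return paths;
    }
}
```

A recursive create would then walk this list in order and create each path that does not exist yet, ignoring "node already exists" results, so that /hbase is in place before /hbase/cluster1 is created.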

--

[jira] [Commented] (HBASE-7258) Hbase needs to create baseZNode recursivly

2012-12-26 Thread Liu Shaohui (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539792#comment-13539792
 ] 

Liu Shaohui commented on HBASE-7258:


I have uploaded the patch for 0.94. Thx! [~ram_krish]

 Hbase needs to create baseZNode recursivly
 --

 Key: HBASE-7258
 URL: https://issues.apache.org/jira/browse/HBASE-7258
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.94.2
Reporter: Liu Shaohui
Priority: Minor
  Labels: zookeeper, zookeeper.znode.parent
 Attachments: HBASE-7258-0.94.patch, HBASE-7258.diff, HBASE-7258.patch

   Original Estimate: 1h
  Remaining Estimate: 1h


--

[jira] [Commented] (HBASE-5776) HTableMultiplexer


[ 
https://issues.apache.org/jira/browse/HBASE-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539793#comment-13539793
 ] 

Ted Yu commented on HBASE-5776:
---

Looks like Hadoop QA did not run the test suite against the patch:
https://builds.apache.org/job/PreCommit-HBASE-Build/3707/parameters/

Patch v7 has no effect on existing tests; it is very close to patch 
v6, and there was no new modification to existing classes.

 HTableMultiplexer 
 --

 Key: HBASE-5776
 URL: https://issues.apache.org/jira/browse/HBASE-5776
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: binlijin
 Fix For: 0.96.0

 Attachments: 5776-trunk-V3.patch, 5776-trunk-V4.patch, 
 5776-trunk-V5.patch, 5776-trunk-V6.patch, 5776-trunk-V7.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.1.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.1.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.2.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.2.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.3.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.4.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.5.patch, HBASE-5776-trunk.patch, 
 HBASE-5776-trunk-V2.patch


 There is a known issue in the HBase client: a single slow/dead region server 
 can slow down multiput operations across all the region servers. So the 
 HBase client will be as slow as the slowest region server in the cluster. 
  
 To solve this problem, HTableMultiplexer separates the multiput 
 submitting threads from the flush threads, which makes the multiput operation 
 nonblocking. 
 The submitting thread shards all the puts into different queues based on 
 their destination region server and returns immediately. The flush threads 
 flush these puts from each queue to its destination region server. 
 Currently HTableMultiplexer only supports the put operation.
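
The queue-per-server design described above can be sketched roughly as follows. This is not the HTableMultiplexer code from the patch: the class name, the use of plain strings for servers and puts, and the bounded queue capacity are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;

public class PutMultiplexerSketch {
    // One bounded queue per destination region server (server name is a plain string here).
    private final ConcurrentMap<String, LinkedBlockingQueue<String>> queues =
        new ConcurrentHashMap<>();
    private final int capacityPerServer;

    public PutMultiplexerSketch(int capacityPerServer) {
        this.capacityPerServer = capacityPerServer;
    }

    /** Non-blocking submit: returns false immediately if that server's queue is full. */
    public boolean put(String server, String put) {
        return queues
            .computeIfAbsent(server, s -> new LinkedBlockingQueue<>(capacityPerServer))
            .offer(put);
    }

    /** Called by a flush thread: removes and returns everything queued for one server. */
    public List<String> drain(String server) {
        List<String> batch = new ArrayList<>();
        LinkedBlockingQueue<String> queue = queues.get(server);
        if (queue != null) {
            queue.drainTo(batch);
        }
        return batch;
    }
}
```

Because `put` uses `offer` rather than a blocking insert, a slow or dead server only fills its own queue; submissions destined for other servers are unaffected, which is the property the description is after.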

--

[jira] [Updated] (HBASE-7188) Move classes into hbase-client


 [ 
https://issues.apache.org/jira/browse/HBASE-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7188:
-

Attachment: HBASE-7188-0.patch

* Mostly just moves.
* Some Security enums had to be split out.

 Move classes into hbase-client
 --

 Key: HBASE-7188
 URL: https://issues.apache.org/jira/browse/HBASE-7188
 Project: HBase
  Issue Type: Sub-task
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7188-0.patch




--

[jira] [Updated] (HBASE-7188) Move classes into hbase-client


 [ 
https://issues.apache.org/jira/browse/HBASE-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7188:
-

Affects Version/s: 0.96.0
   Status: Patch Available  (was: Open)

 Move classes into hbase-client
 --

 Key: HBASE-7188
 URL: https://issues.apache.org/jira/browse/HBASE-7188
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7188-0.patch




--

[jira] [Updated] (HBASE-7188) Move classes into hbase-client


 [ 
https://issues.apache.org/jira/browse/HBASE-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7188:
-

Component/s: IPC/RPC
 Client

 Move classes into hbase-client
 --

 Key: HBASE-7188
 URL: https://issues.apache.org/jira/browse/HBASE-7188
 Project: HBase
  Issue Type: Sub-task
  Components: Client, IPC/RPC
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7188-0.patch




--

[jira] [Commented] (HBASE-7403) Online Merge

2012-12-26 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539817#comment-13539817
 ] 

ramkrishna.s.vasudevan commented on HBASE-7403:
---

Oops, yes.  We do set available to 'false'.  Then it's fine. 

 Online Merge
 

 Key: HBASE-7403
 URL: https://issues.apache.org/jira/browse/HBASE-7403
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.3
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.5

 Attachments: 7403-trunkv5.patch, hbase-7403-94v1.patch, 
 hbase-7403-trunkv1.patch, hbase-7403-trunkv5.patch, hbase-7403-trunkv5.patch, 
 merge region.pdf


 Features of this online merge:
 1. Online: there is no need to disable the table.
 2. Few changes to the current code; it could be applied to trunk, 0.94, 0.92, 
 or 0.90.
 3. Merge requests are easy to issue: no need to enter a long region name, 
 the encoded name is enough.
 4. No restrictions during the operation: you do not need to worry about 
 events such as Server Dead, Balance, Split, or Disabling/Enabling a table, 
 or about sending a wrong merge request; these cases are already handled for 
 you.
 5. Only a short offline time for the two merging regions.
 We need merge in the following cases:
 1. A region hole or region overlap that can't be fixed by hbck.
 2. A region becomes empty because of TTL and an unreasonable rowkey design.
 3. A region is always empty or very small because of presplitting at table 
 creation.
 4. Too many empty or small regions reduce system performance (e.g. 
 mslab).
 The current merge tools only work offline and cannot redo the merge if an 
 exception is thrown partway through, which leaves dirty data.
 For an online system, we need an online merge.
 The implementation logic of this patch for Online Merge is as follows.
 For example, to merge regionA and regionB into regionC:
 1. Offline the two regions A and B.
 2. Merge the two regions in HDFS (create regionC's directory, move 
 regionA's and regionB's files to regionC's directory, delete regionA's and 
 regionB's directories).
 3. Add the merged regionC to .META.
 4. Assign the merged regionC.
 By design, once the merge work in HDFS has been done, the merge can be redone 
 until it succeeds if it throws an exception, aborts, or the server restarts, 
 but it cannot be rolled back. 
 It depends on the following:
 Use zookeeper to record the transaction journal state, which makes redo easier
 Use zookeeper to send/receive merge requests
 The merge transaction is executed on the master
 Merge requests can be issued through the API or the shell tool
 For details of the merge process, please see the attachment and patch
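
The redo-from-journal idea can be sketched as below. This is a simplified illustration only, not the code in the attached patch: the class, the step names, and the in-memory counter standing in for the state recorded in ZooKeeper are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeJournalSketch {
    /** The four steps from the description, in execution order. */
    public enum Step { OFFLINE_REGIONS, MERGE_IN_HDFS, ADD_TO_META, ASSIGN_MERGED }

    // Stands in for the journal state recorded in ZooKeeper: number of completed steps.
    private int completed = 0;
    // Records which steps actually ran, so a caller can see nothing is repeated.
    private final List<Step> executed = new ArrayList<>();

    /**
     * Runs the remaining steps. failAt simulates an exception or restart just
     * before that step (null means no failure). Returns true once all steps are done.
     */
    public boolean run(Step failAt) {
        Step[] steps = Step.values();
        while (completed < steps.length) {
            Step next = steps[completed];
            if (next == failAt) {
                return false; // the journal still points at the unfinished step
            }
            executed.add(next);
            completed++; // journal the step only after it succeeded
        }
        return true;
    }

    public List<Step> executedSteps() {
        return executed;
    }
}
```

Calling `run` again after a failure resumes at the first step that was not journaled, which mirrors the "redo until successful, never roll back" behaviour described above.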

--

[jira] [Updated] (HBASE-7188) Move classes into hbase-client


 [ 
https://issues.apache.org/jira/browse/HBASE-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7188:
-

Attachment: HBASE-7188-1.patch

Here's a patch that fixes the tests that were moved over.  I had to change the 
hbase-default.xml packaging into the hbase-common module.

Hadoop QA seems to be MIA though.

 Move classes into hbase-client
 --

 Key: HBASE-7188
 URL: https://issues.apache.org/jira/browse/HBASE-7188
 Project: HBase
  Issue Type: Sub-task
  Components: Client, IPC/RPC
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7188-0.patch, HBASE-7188-1.patch




--

[jira] [Commented] (HBASE-7440) ReplicationZookeeper#addPeer is racy


[ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539824#comment-13539824
 ] 

Ted Yu commented on HBASE-7440:
---

{code}
+// There is a race b/w PeerWatcher and ReplicationZookeeper#add method to create the peer-state znode. 
{code}
Wrap long line above.
{code}
+  } catch (KeeperException.NodeExistsException nee) {
+LOG.warn("StateNode for " + peerStateNode
{code}
Should we verify that the node carries the data PeerState.ENABLED?
For the change in ReplicationZookeeper.java, a similar comment applies to the 
KeeperException.NodeExistsException case.
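
To make the suggestion concrete, here is a minimal sketch of "create if absent, then check the data". It uses an in-memory map as a stand-in for ZooKeeper, and every name in it is hypothetical; it is not the code in the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PeerStateSketch {
    // In-memory stand-in for ZooKeeper: znode path -> data.
    private final ConcurrentMap<String, String> znodes = new ConcurrentHashMap<>();

    /** Creates the node with the given data unless it already exists; returns the data now stored. */
    public String createIfAbsent(String path, String data) {
        String existing = znodes.putIfAbsent(path, data);
        // A non-null result corresponds to the NodeExists case: someone else won the race.
        return existing == null ? data : existing;
    }

    /**
     * Tolerates the "already exists" race, but still reports whether the node
     * really carries ENABLED instead of assuming it does.
     */
    public boolean ensureEnabled(String path) {
        return "ENABLED".equals(createIfAbsent(path, "ENABLED"));
    }
}
```

With this shape, two racing callers both succeed when the node holds ENABLED, while a node created earlier with different data is detected rather than silently accepted.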

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7440-v0.patch


 While adding a peer, ReplicationZK creates the znodes in three 
 transactions. It creates:
 a) peers znode
 b) peerId specific znode, and
 c) peerState znode
 There is a PeerWatcher which invokes getPeer() (after steps b) and c)). If, 
 while adding a peer, control flows to getPeer() before step c) 
 has been processed, the peer may end up not being 
 added. This happens while running 
 TestMasterReplication#testCyclicReplication().
 {code}
 2012-12-26 07:36:35,187 INFO  
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 zookeeper.RecoverableZooKeeper(447): Node /2/replication/peers/1/peer-state 
 already exists and this is not a retry
 2012-12-26 07:36:35,188 ERROR 
 [RegionServer:0;p0120.X,38423,1356536179470-EventThread] 
 regionserver.ReplicationSourceManager$PeersWatcher(527): Error while adding a 
 new peer
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /2/replication/peers/1/peer-state
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:428)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:410)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1044)
   at 
 org.apache.hadoop.hbase.replication.ReplicationPeer.startStateTracker(ReplicationPeer.java:82)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:344)
   at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:307)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$PeersWatcher.nodeChildrenChanged(ReplicationSourceManager.java:519)
   at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-26 07:36:35,188 DEBUG 
 [RegionServer:0;p0120.X,55742,1356536171947-EventThread] 
 zookeeper.ZKUtil(1545): regionserver:55742-0x13bd7db39580004 Retrieved 36 
 byte(s) of data from znode /1/hbaseid; data=9ce66123-d3e8-4ae9-a249-afe03...
 {code}

--

[jira] [Commented] (HBASE-7321) Simple Flush Snapshot


[ 
https://issues.apache.org/jira/browse/HBASE-7321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539830#comment-13539830
 ] 

Jonathan Hsieh commented on HBASE-7321:
---

Seems the unit tests are still flaky on the snapshot-work-1223 version. :(

 Simple Flush Snapshot
 -

 Key: HBASE-7321
 URL: https://issues.apache.org/jira/browse/HBASE-7321
 Project: HBase
  Issue Type: Sub-task
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-7321.v2.patch, pre-hbase-7321.v2.patch


 This snapshot style just issues a region flush and then snapshots the 
 region.  
 This is a simple implementation that gives the equivalent of copytable 
 consistency.  By most definitions of consistency, if a client writes A 
 and then writes B to different region servers, a snapshot should contain 
 neither write, only A, or both A and B; this style also allows the only-B case.
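
The only-B case can be reproduced with a toy model. The sketch below is purely illustrative and assumes that each region is captured at its own point in time; the class, method, and parameters are hypothetical and unrelated to the actual snapshot code.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushSnapshotSketch {
    /**
     * Returns the writes visible in a snapshot in which every region is
     * captured at its own time. Write i went to region writeRegion[i] at
     * writeTimes[i]; region r was flushed and snapshotted at regionSnapshotTime[r].
     */
    public static List<String> snapshot(long[] writeTimes, String[] writeNames,
                                        int[] writeRegion, long[] regionSnapshotTime) {
        List<String> visible = new ArrayList<>();
        for (int i = 0; i < writeNames.length; i++) {
            // A write is captured only if it landed before its region's snapshot.
            if (writeTimes[i] <= regionSnapshotTime[writeRegion[i]]) {
                visible.add(writeNames[i]);
            }
        }
        return visible;
    }
}
```

For example, if A is written to region 0 at time 2 and B to region 1 at time 3, while region 0 is snapshotted at time 1 and region 1 at time 4, the snapshot contains B but not A, even though the client wrote A first.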

--

[jira] [Commented] (HBASE-5776) HTableMultiplexer

2012-12-26 Thread binlijin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539832#comment-13539832
 ] 

binlijin commented on HBASE-5776:
-

@Ted, thanks, the patch is fine. +1 on the patch.

 HTableMultiplexer 
 --

 Key: HBASE-5776
 URL: https://issues.apache.org/jira/browse/HBASE-5776
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: binlijin
 Fix For: 0.96.0

 Attachments: 5776-trunk-V3.patch, 5776-trunk-V4.patch, 
 5776-trunk-V5.patch, 5776-trunk-V6.patch, 5776-trunk-V7.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.1.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.1.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.2.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.2.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.3.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.4.patch, 
 ASF.LICENSE.NOT.GRANTED--D2775.5.patch, HBASE-5776-trunk.patch, 
 HBASE-5776-trunk-V2.patch



--

[jira] [Updated] (HBASE-7440) ReplicationZookeeper#addPeer is racy


 [ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7440:
---

Attachment: HBASE-7440-v1.patch

Wrapping the comment line. 

Since it is creating the peer-state znode, which has the default value ENABLED, 
adding a check looks redundant. I can add that if you think otherwise.

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7440-v0.patch, HBASE-7440-v1.patch



--

[jira] [Commented] (HBASE-7440) ReplicationZookeeper#addPeer is racy


[ 
https://issues.apache.org/jira/browse/HBASE-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539847#comment-13539847
 ] 

Lars Hofhansl commented on HBASE-7440:
--

Could we use ZKUtil.createNodeIfNotExistsAndWatch for this?

 ReplicationZookeeper#addPeer is racy
 

 Key: HBASE-7440
 URL: https://issues.apache.org/jira/browse/HBASE-7440
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7440-v0.patch, HBASE-7440-v1.patch



--

[jira] [Commented] (HBASE-7258) Hbase needs to create baseZNode recursivly

2012-12-26 Thread Liu Shaohui (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539517#comment-13539517
 ] 

Liu Shaohui commented on HBASE-7258:


Are there any other suggestions about this issue? 
[~stack][~ramkrishna.vasude...@huawei.com]

 Hbase needs to create baseZNode recursivly
 --

 Key: HBASE-7258
 URL: https://issues.apache.org/jira/browse/HBASE-7258
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.94.2
Reporter: Liu Shaohui
Priority: Minor
  Labels: zookeeper, zookeeper.znode.parent
 Attachments: HBASE-7258.diff, HBASE-7258.patch

   Original Estimate: 1h
  Remaining Estimate: 1h


--

[jira] [Commented] (HBASE-7258) Hbase needs to create baseZNode recursivly

2012-12-26 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539519#comment-13539519
 ] 

ramkrishna.s.vasudevan commented on HBASE-7258:
---

Please provide a patch for 0.94 as well.  I can commit it today.

 Hbase needs to create baseZNode recursivly
 --

 Key: HBASE-7258
 URL: https://issues.apache.org/jira/browse/HBASE-7258
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.94.2
Reporter: Liu Shaohui
Priority: Minor
  Labels: zookeeper, zookeeper.znode.parent
 Attachments: HBASE-7258.diff, HBASE-7258.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 In a deployment environment, multiple small HBase clusters may share the same 
 ZooKeeper cluster. So, for HBase cluster1, the parent znode is 
 /hbase/cluster1. But in HBase version 0.94.1, HBase uses 
 ZKUtil.createAndFailSilent(this, baseZNode) to create the parent path, and 
 this throws a NoNode exception if the znode /hbase does not exist.
 We want to change it to ZKUtil.createWithParents(this, baseZNode); to support 
 creating baseZNode recursively.
 The NoNode exception is:
 java.lang.RuntimeException: Failed construction of Master: class 
 org.apache.hadoop.hbase.master.HMaster
 at 
 org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1792)
 at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:146)
 at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:77)
 at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1806)
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
 KeeperErrorCode = NoNode for /hbase/cluster1
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
 at 
 org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:420)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:402)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:905)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:166)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:159)
 at org.apache.hadoop.hbase.master.HMaster.init(HMaster.java:282)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at org.apache.hadoop.hbase.master.H
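
The difference between a plain create and a recursive create can be sketched without a ZooKeeper ensemble. The class below is only an illustration: it uses an in-memory set as a stand-in for the znode tree, and its method names (create, createWithParents, ancestorsAndSelf) are hypothetical helpers, not the ZKUtil implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; an in-memory stand-in, not ZooKeeper or ZKUtil. */
final class ZnodeCreateSketch {
  /** Stand-in for the znode tree; only the root exists at first. */
  final Set<String> znodes = new HashSet<String>();

  ZnodeCreateSketch() {
    znodes.add("/");
  }

  /** Mimics a plain create: fails when the parent znode is missing. */
  void create(String path) {
    if (!znodes.contains(parentOf(path))) {
      throw new IllegalStateException("NoNode for " + path);
    }
    znodes.add(path);
  }

  /** Creates every missing ancestor first, then the node itself. */
  void createWithParents(String path) {
    for (String p : ancestorsAndSelf(path)) {
      if (!znodes.contains(p)) {
        create(p);
      }
    }
  }

  /** Returns the parent path, or "/" for a top-level znode. */
  static String parentOf(String path) {
    int idx = path.lastIndexOf('/');
    return idx <= 0 ? "/" : path.substring(0, idx);
  }

  /** "/hbase/cluster1" yields "/hbase" followed by "/hbase/cluster1". */
  static List<String> ancestorsAndSelf(String path) {
    List<String> out = new ArrayList<String>();
    int idx = path.indexOf('/', 1);
    while (idx != -1) {
      out.add(path.substring(0, idx));
      idx = path.indexOf('/', idx + 1);
    }
    out.add(path);
    return out;
  }
}
```

With an empty tree, create("/hbase/cluster1") fails because /hbase is missing, while createWithParents("/hbase/cluster1") creates /hbase first and then /hbase/cluster1.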


[jira] [Commented] (HBASE-7403) Online Merge

2012-12-26 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539578#comment-13539578
 ] 

ramkrishna.s.vasudevan commented on HBASE-7403:
---

I have gone through the code at a high level.
One question is regarding the prepare() method.
{code}
if (regionA.isSplit() || regionA.isOffline()) {
+  available = false;
+  LOG.debug("Region " + regionA.getRegionNameAsString()
+  + " is split or offline");
+} else if (regionB.isSplit() || regionB.isOffline()) {
+  available = false;
+  LOG.debug("Region " + regionB.getRegionNameAsString()
+  + " is split or offline");
...
{code}
Should these 'else if's all be 'if's?  Even if it is regionB that is not 
available, we should not be able to merge, right?
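
To make the question concrete, here is a small sketch comparing the two forms. It assumes a hypothetical Region holder in place of the real region info class and covers only the two checks visible in the quoted fragment; whatever the elided part of prepare() does is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; Region is a hypothetical stand-in. */
final class MergePrecheckSketch {
  static final class Region {
    final String name;
    final boolean splitOrOffline;

    Region(String name, boolean splitOrOffline) {
      this.name = name;
      this.splitOrOffline = splitOrOffline;
    }
  }

  /** else-if chain: stops at the first unavailable region. */
  static List<String> chained(Region a, Region b) {
    List<String> reasons = new ArrayList<String>();
    if (a.splitOrOffline) {
      reasons.add("Region " + a.name + " is split or offline");
    } else if (b.splitOrOffline) {
      reasons.add("Region " + b.name + " is split or offline");
    }
    return reasons;
  }

  /** Independent ifs: reports every unavailable region. */
  static List<String> independent(Region a, Region b) {
    List<String> reasons = new ArrayList<String>();
    if (a.splitOrOffline) {
      reasons.add("Region " + a.name + " is split or offline");
    }
    if (b.splitOrOffline) {
      reasons.add("Region " + b.name + " is split or offline");
    }
    return reasons;
  }

  /** The merge may proceed only when no reason was recorded. */
  static boolean available(List<String> reasons) {
    return reasons.isEmpty();
  }
}
```

In this sketch both forms refuse the merge when only regionB is unavailable. They differ only when both regions are unavailable: the chain records regionA alone, while the independent checks record both.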

 Online Merge
 

 Key: HBASE-7403
 URL: https://issues.apache.org/jira/browse/HBASE-7403
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.3
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.5

 Attachments: 7403-trunkv5.patch, hbase-7403-94v1.patch, 
 hbase-7403-trunkv1.patch, hbase-7403-trunkv5.patch, hbase-7403-trunkv5.patch, 
 merge region.pdf


 Features of this online merge:
 1. Online: no need to disable the table.
 2. Few changes to the current code; it could be applied to trunk, 0.94, 0.92, 
 or 0.90.
 3. Easy to issue a merge request: no need to input a long region name, the 
 encoded name is enough.
 4. No restrictions during the operation: you don't need to care about events 
 like Server Dead, Balance, Split, or Disabling/Enabling a table, and you don't 
 need to care whether you sent a wrong merge request; these cases are already 
 handled for you.
 5. Only a short offline time for the two merging regions.
 We need merge in the following cases:
 1. A region hole or region overlap that can't be fixed by hbck.
 2. Regions become empty because of TTL and an unreasonable rowkey design.
 3. Regions are always empty or very small because of presplitting when the 
 table was created.
 4. Too many empty or small regions reduce system performance (e.g. mslab).
 Current merge tools only work offline and cannot redo the merge if an 
 exception is thrown in the process of merging, which leaves dirty data.
 For an online system, we need an online merge.
 The implementation logic of this patch for Online Merge is as follows.
 For example, to merge regionA and regionB into regionC:
 1. Offline the two regions A and B.
 2. Merge the two regions in HDFS (create regionC's directory, move regionA's 
 and regionB's files to regionC's directory, delete regionA's and regionB's 
 directories).
 3. Add the merged regionC to .META.
 4. Assign the merged regionC.
 By design, once we do the merge work in HDFS, we can redo it until it 
 succeeds if it throws an exception, aborts, or the server restarts, but it 
 cannot be rolled back.
 It depends on the following:
 Use ZooKeeper to record the transaction journal state, which makes redo easier
 Use ZooKeeper to send/receive merge requests
 The merge transaction is executed on the master
 Support issuing merge requests through the API or the shell tool
 For details of the merge process, please see the attachment and patch.
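
The redo behaviour described above can be sketched as a journal that records which step comes next, so that a restart resumes from that step instead of starting over. This is only an illustration: the step names are hypothetical, and an in-memory field stands in for the journal state that the patch keeps in ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the patch's transaction class. */
final class MergeJournalSketch {
  /** Hypothetical journal states, one per step in the description. */
  enum Step { OFFLINE_REGIONS, MERGE_IN_HDFS, ADD_TO_META, ASSIGN_MERGED, DONE }

  /** In-memory stand-in for the journal state recorded in ZooKeeper. */
  private Step journal = Step.OFFLINE_REGIONS;

  /** Steps that actually ran, in order. */
  final List<Step> executed = new ArrayList<Step>();

  /**
   * Runs from the journaled step onward. failAt simulates a crash just
   * before that step runs; pass null for a run without failures.
   */
  void run(Step failAt) {
    while (journal != Step.DONE) {
      if (journal == failAt) {
        throw new IllegalStateException("simulated failure at " + journal);
      }
      executed.add(journal);
      // Advance the journal only after the step has completed.
      journal = Step.values()[journal.ordinal() + 1];
    }
  }

  Step journaled() {
    return journal;
  }
}
```

If run(Step.ADD_TO_META) fails, the journal still points at ADD_TO_META, and a later run(null) finishes the remaining two steps without repeating the HDFS work.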


[jira] [Commented] (HBASE-7186) Split Classes for Client/Server module split.


[ 
https://issues.apache.org/jira/browse/HBASE-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539579#comment-13539579
 ] 

Elliott Clark commented on HBASE-7186:
--

[~stack] Yep.  I'll get an addendum up right away for that.

 Split Classes for Client/Server module split.
 -

 Key: HBASE-7186
 URL: https://issues.apache.org/jira/browse/HBASE-7186
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Protobufs
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7186-0.patch, HBASE-7186-1.patch, 
 HBASE-7186-2.patch, HBASE-7186-3.patch, HBASE-7186-4.patch, HBASE-7186-5.patch


 Prepare classes for the coming hbase-client/hbase-server split.


[jira] [Updated] (HBASE-7418) HFileLink flaky tests


 [ 
https://issues.apache.org/jira/browse/HBASE-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-7418:
--

   Resolution: Fixed
Fix Version/s: hbase-6055
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I committed to the hbase-6055 branch.  I can't really tell if this makes a 
difference correctness-wise, but removing pauses to make them run faster is 
worth it IMO.

 HFileLink flaky tests
 -

 Key: HBASE-7418
 URL: https://issues.apache.org/jira/browse/HBASE-7418
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: hbase-6055

 Attachments: HBASE-7418-v0.patch


 TestStoreFile and TestHFileLinkCleaner seem flaky.


[jira] [Updated] (HBASE-7339) Splitting a hfilelink causes region servers to go down.

[
https://issues.apache.org/jira/browse/HBASE-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jonathan Hsieh updated HBASE-7339:
--

Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Committed to hbase-6055. After several runs, the failing tests prove to be
flaky on trunk, and they were flaky before this patch went in.

Splitting a hfilelink causes region servers to go down.
---

Key: HBASE-7339
URL: https://issues.apache.org/jira/browse/HBASE-7339
Project: HBase
Issue Type: Sub-task
Components: snapshots
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
Fix For: hbase-6055

Attachments: hbase-7339.patch, hbase-7339.v2.patch,
pre-hbase-7339.patch, pre-hbase-7339.v2.patch

Steps:
- Have a single region table t with 15 hfiles in it.
- Snapshot it. (was done using online snapshot from HBASE-7321)
- Clone a snapshot to table t'.
- t' has its region do a post-open task that attempts to compact the region.
The policy does not compact all files. (default seems to be 10)
- after compaction we have hfile links and real hfiles mixed in the region
- t' starts splitting
- creating split references, opening daughters fails
- hfile links are split, creating hfile link daughter refs.
{{hfile\-region\-table.parentregion}}
- these split hfile links are interpreted as hfile links with table
{{table.parentregion}} -
{{hfile\-region\-table.parentregion}} (groupings interpreted
incorrectly)
- Since this is after the splitting PONR, this aborts the server. It then
spreads to the next server.
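
The grouping problem in the steps above can be reproduced with a toy pattern. Everything here is an assumption made for illustration: the pattern, the hex-only hfile and region names, and the rule that a table name may contain '.' are hypothetical, and this is not HFileLink's actual parsing code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; patterns are hypothetical. */
final class LinkNameSketch {
  /** Assumed link name shape: hfile-region-table, where table may contain '.'. */
  static final Pattern LINK =
      Pattern.compile("([0-9a-f]+)-([0-9a-f]+)-([a-zA-Z_0-9.-]+)");

  /** Assumed shape of a split reference to a link: hfile-region-table.parentregion. */
  static final Pattern LINK_REF =
      Pattern.compile("([0-9a-f]+)-([0-9a-f]+)-([a-zA-Z_0-9-]+)\\.([0-9a-f]+)");

  /** Table group when the name is read as a plain link, else null. */
  static String tableAsLink(String name) {
    Matcher m = LINK.matcher(name);
    return m.matches() ? m.group(3) : null;
  }

  /** True when the name has the shape of a reference to a link. */
  static boolean looksLikeReference(String name) {
    return LINK_REF.matcher(name).matches();
  }
}
```

Read as a plain link, the name abc123-def456-t1.789abc yields the table t1.789abc, which is the wrong grouping described above. Testing for the reference shape before the link shape keeps the parent region suffix out of the table name, under the assumptions stated in the lead-in.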

[jira] [Reopened] (HBASE-7186) Split Classes for Client/Server module split.


 [ 
https://issues.apache.org/jira/browse/HBASE-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark reopened HBASE-7186:
--


re-opening for addendum.

 Split Classes for Client/Server module split.
 -

 Key: HBASE-7186
 URL: https://issues.apache.org/jira/browse/HBASE-7186
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Protobufs
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7186-0.patch, HBASE-7186-1.patch, 
 HBASE-7186-2.patch, HBASE-7186-3.patch, HBASE-7186-4.patch, 
 HBASE-7186-5.patch, HBASE-7186-ADD-0.patch


 Prepare classes for the coming hbase-client/hbase-server split.


[jira] [Updated] (HBASE-7186) Split Classes for Client/Server module split.


 [ 
https://issues.apache.org/jira/browse/HBASE-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7186:
-

Attachment: HBASE-7186-ADD-0.patch

 Split Classes for Client/Server module split.
 -

 Key: HBASE-7186
 URL: https://issues.apache.org/jira/browse/HBASE-7186
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Protobufs
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7186-0.patch, HBASE-7186-1.patch, 
 HBASE-7186-2.patch, HBASE-7186-3.patch, HBASE-7186-4.patch, 
 HBASE-7186-5.patch, HBASE-7186-ADD-0.patch


 Prepare classes for the coming hbase-client/hbase-server split.


[jira] [Updated] (HBASE-7186) Split Classes for Client/Server module split.


 [ 
https://issues.apache.org/jira/browse/HBASE-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-7186:
-

Affects Version/s: 0.96.0
   Status: Patch Available  (was: Reopened)

 Split Classes for Client/Server module split.
 -

 Key: HBASE-7186
 URL: https://issues.apache.org/jira/browse/HBASE-7186
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Protobufs
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7186-0.patch, HBASE-7186-1.patch, 
 HBASE-7186-2.patch, HBASE-7186-3.patch, HBASE-7186-4.patch, 
 HBASE-7186-5.patch, HBASE-7186-ADD-0.patch


 Prepare classes for the coming hbase-client/hbase-server split.


[jira] [Updated] (HBASE-7186) Split Classes for Client/Server module split.


 [ 
https://issues.apache.org/jira/browse/HBASE-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7186:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Resolving again.  No worries Elliott.  I thought it was needed.  The fact that 
it just left some unused imports is fine.  I'll take care of it over in the ipc 
issue I have up.

 Split Classes for Client/Server module split.
 -

 Key: HBASE-7186
 URL: https://issues.apache.org/jira/browse/HBASE-7186
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Protobufs
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7186-0.patch, HBASE-7186-1.patch, 
 HBASE-7186-2.patch, HBASE-7186-3.patch, HBASE-7186-4.patch, 
 HBASE-7186-5.patch, HBASE-7186-ADD-0.patch


 Prepare classes for the coming hbase-client/hbase-server split.


[jira] [Commented] (HBASE-7186) Split Classes for Client/Server module split.


[ 
https://issues.apache.org/jira/browse/HBASE-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539603#comment-13539603
 ] 

Elliott Clark commented on HBASE-7186:
--

Awesome thanks.

 Split Classes for Client/Server module split.
 -

 Key: HBASE-7186
 URL: https://issues.apache.org/jira/browse/HBASE-7186
 Project: HBase
  Issue Type: Sub-task
  Components: Client, Protobufs
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: HBASE-7186-0.patch, HBASE-7186-1.patch, 
 HBASE-7186-2.patch, HBASE-7186-3.patch, HBASE-7186-4.patch, 
 HBASE-7186-5.patch, HBASE-7186-ADD-0.patch


 Prepare classes for the coming hbase-client/hbase-server split.


[jira] [Commented] (HBASE-7212) Globally Barriered Procedure mechanism

[
https://issues.apache.org/jira/browse/HBASE-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539605#comment-13539605
]

Jonathan Hsieh commented on HBASE-7212:
---

waitForLatch got moved out of the ErrorHandling patch over to here -- but this
is used in follow-on patches with code that resides in different packages.

I agree about createProcedure; I've made it package protected. (there was some
concern about Mockito limitations)

Thanks for the pointer to guava's MapMaker and the example in CoprocessorHost.
I've moved over to that api and removed WeakValueMapping.

Globally Barriered Procedure mechanism
--

Key: HBASE-7212
URL: https://issues.apache.org/jira/browse/HBASE-7212
Project: HBase
Issue Type: Sub-task
Components: snapshots
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Fix For: hbase-6055

Attachments: 121127-global-barrier-proc.pdf, hbase-7212.patch,
hbase-7212.v5.patch, pre-hbase-7212.patch, pre-hbase-7212.v5.patch

This is a simplified version of what was proposed in HBASE-6573. Instead of
claiming to be a 2pc or 3pc implementation (which implies logging at each
actor, and recovery operations) this just provides a best-effort global
barrier mechanism called a Procedure.
Users need only implement methods to acquireBarrier, to act when
insideBarrier, and to releaseBarrier that use the ExternalException
cooperative error checking mechanism.
Globally consistent snapshots require the ability to quiesce writes to a set
of region servers before the snapshot operation is executed. Also, if any
node fails, it needs to be able to notify them so that they abort.
The first cut of other online snapshots doesn't need the full barrier but may
still use this for its error propagation mechanisms.
This version removes the extra layer incurred in the previous implementation
due to the use of generics, separates the coordinator and members, and
reduces the amount of inheritance used in favor of composition.
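
The three hooks named above can be sketched as a small interface plus a coordinator loop. This is a single-process illustration under stated assumptions: the Member interface, RecordingMember, and run are hypothetical names, failures are plain exceptions rather than the ExternalException mechanism, and no ZooKeeper or RPC coordination is modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the Procedure classes from the patch. */
final class BarrierSketch {
  /** Hypothetical member-side hooks, named after the three phases. */
  interface Member {
    void acquireBarrier() throws Exception;
    void insideBarrier() throws Exception;
    void releaseBarrier();
  }

  /** Member that records each phase, optionally failing inside the barrier. */
  static final class RecordingMember implements Member {
    private final String name;
    private final List<String> log;
    private final boolean failInside;

    RecordingMember(String name, List<String> log, boolean failInside) {
      this.name = name;
      this.log = log;
      this.failInside = failInside;
    }

    public void acquireBarrier() {
      log.add(name + ":acquire");
    }

    public void insideBarrier() throws Exception {
      if (failInside) {
        throw new Exception(name + " failed inside barrier");
      }
      log.add(name + ":inside");
    }

    public void releaseBarrier() {
      log.add(name + ":release");
    }
  }

  /** Best effort: true only if every member passed both phases. */
  static boolean run(List<Member> members) {
    List<Member> acquired = new ArrayList<Member>();
    boolean ok = true;
    try {
      // Phase 1: every member reaches the barrier before anyone acts.
      for (Member m : members) {
        m.acquireBarrier();
        acquired.add(m);
      }
      // Phase 2: all members act while inside the barrier.
      for (Member m : members) {
        m.insideBarrier();
      }
    } catch (Exception e) {
      ok = false; // a failure on any member aborts the whole procedure
    } finally {
      // Members that acquired the barrier are always released.
      for (Member m : acquired) {
        m.releaseBarrier();
      }
    }
    return ok;
  }
}
```

The coordinator holds no per-member log and attempts no recovery, which matches the best-effort framing: a failure simply aborts the run and releases whoever had acquired the barrier.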

[jira] [Commented] (HBASE-7294) Check for snapshot file cleaners on start


[ 
https://issues.apache.org/jira/browse/HBASE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539614#comment-13539614
 ] 

Jonathan Hsieh commented on HBASE-7294:
---

We really push folks to use hbase.snapshot.enabled

nit:

Should we update this to describe hbase.snapshot.enabled?  (seems simpler)  
Also mention that you have data in the .snapshots dir (it might be strange if 
it works with the same config, but later does not).  At least if we mention 
the dir, users will have a pointer and an option to remove snapshots.
{code}
+  /**
+   * Throws an exception if snapshot operations (take a snapshot, restore, clone) are not supported.
+   * Called at the beginning of snapshot() and restoreSnapshot() methods.
+   * @throws UnsupportedOperationException if snapshot are not supported
+   */
+  public void checkSnapshotSupport() throws UnsupportedOperationException {
+    if (!this.isSnapshotSupported) {
+      throw new UnsupportedOperationException(
+        "To use snapshots, the HBase Master must have the proper archive cleaners enabled. " +
+        "You must add to the hbase-site.xml: " +
+        "'" + HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS + "' with '" +
+        HFileLinkCleaner.class.getName() + "', '" + SnapshotHFileCleaner.class.getName() +
+        "' support. And add '" + HConstants.HBASE_MASTER_LOGCLEANER_PLUGINS + "' with '" +
+        SnapshotLogCleaner.class.getName() + "' support.");
+    }
+  }
{code}

We should behave the same but emit a warning if folks are using the plugins and 
not the hbase.snapshot.enabled property.

We should have a test that verifies snapshots work if configured one way or the 
other (and possibly a negative test exercising the case where it does not).
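
A rough sketch of the behaviour suggested here: accept either configuration style, and flag the plugin-only style so that a warning can be logged. A plain Map stands in for the HBase Configuration, the two plugin key strings are assumed values for the constants quoted above, and the cleaner names are shortened to simple class names for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; not the master's actual check. */
final class SnapshotSupportSketch {
  static final String ENABLED_KEY = "hbase.snapshot.enabled";
  // Assumed key names for the two plugin lists.
  static final String HFILE_PLUGINS_KEY = "hbase.master.hfilecleaner.plugins";
  static final String LOG_PLUGINS_KEY = "hbase.master.logcleaner.plugins";

  enum Support { ENABLED, ENABLED_WITH_WARNING, UNSUPPORTED }

  static Support check(Map<String, String> conf) {
    if ("true".equalsIgnoreCase(conf.get(ENABLED_KEY))) {
      return Support.ENABLED;
    }
    boolean hfileCleaners = plugins(conf, HFILE_PLUGINS_KEY)
        .containsAll(Arrays.asList("HFileLinkCleaner", "SnapshotHFileCleaner"));
    boolean logCleaner = plugins(conf, LOG_PLUGINS_KEY).contains("SnapshotLogCleaner");
    // Plugins alone still work, but the caller should log a warning.
    return (hfileCleaners && logCleaner)
        ? Support.ENABLED_WITH_WARNING
        : Support.UNSUPPORTED;
  }

  /** Splits a comma-separated plugin list into trimmed names. */
  private static Set<String> plugins(Map<String, String> conf, String key) {
    Set<String> out = new HashSet<String>();
    String value = conf.get(key);
    if (value != null) {
      for (String s : value.split(",")) {
        if (!s.trim().isEmpty()) {
          out.add(s.trim());
        }
      }
    }
    return out;
  }
}
```

The three return values map directly onto the tests asked for above: one per configuration style, plus the negative case.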




 Check for snapshot file cleaners on start
 -

 Key: HBASE-7294
 URL: https://issues.apache.org/jira/browse/HBASE-7294
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Matteo Bertozzi
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7294-v1.patch, HBASE-7294-v2.patch, 
 HBASE-7294-v3.patch, HBASE-7294-v4.patch, HBASE-7294-v5.patch


 Snapshots currently use the SnapshotHFileCleaner and SnapshotHLogCleaner to 
 ensure that any hfiles or hlogs (respectively) that are currently part of a 
 snapshot are not removed from their respective archive directories (.archive 
 and .oldlogs).
 From Matteo Bertozzi:
 {quote}
 currently the snapshot cleaner is not in hbase-default.xml
 and there's no warning/exception on snapshot/restore operation, if not 
 enabled.
 even if we add the cleaner to the hbase-default.xml how do we ensure that the 
 user doesn't remove it?
 Do we want to hardcode the cleaner at master startup?
 Do we want to add a check in snapshot/restore that throws an exception if the 
 cleaner is not enabled?
 {quote}


[jira] [Comment Edited] (HBASE-7294) Check for snapshot file cleaners on start


[ 
https://issues.apache.org/jira/browse/HBASE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539614#comment-13539614
 ] 

Jonathan Hsieh edited comment on HBASE-7294 at 12/26/12 5:54 PM:
-

We really should push folks to use hbase.snapshot.enabled

nit:

Should we update this to describe hbase.snapshot.enabled?  (seems simpler)  
Also mention that you have data in the .snapshots dir (it might be strange if 
it works with the same config, but later does not).  At least if we mention 
the dir, users will have a pointer and an option to remove snapshots.
{code}
+  /**
+   * Throws an exception if snapshot operations (take a snapshot, restore, clone) are not supported.
+   * Called at the beginning of snapshot() and restoreSnapshot() methods.
+   * @throws UnsupportedOperationException if snapshot are not supported
+   */
+  public void checkSnapshotSupport() throws UnsupportedOperationException {
+    if (!this.isSnapshotSupported) {
+      throw new UnsupportedOperationException(
+        "To use snapshots, the HBase Master must have the proper archive cleaners enabled. " +
+        "You must add to the hbase-site.xml: " +
+        "'" + HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS + "' with '" +
+        HFileLinkCleaner.class.getName() + "', '" + SnapshotHFileCleaner.class.getName() +
+        "' support. And add '" + HConstants.HBASE_MASTER_LOGCLEANER_PLUGINS + "' with '" +
+        SnapshotLogCleaner.class.getName() + "' support.");
+    }
+  }
{code}

We should behave the same but emit a warning if folks are using the plugins and 
not the hbase.snapshot.enabled property.

We should have a test that verifies snapshots work if configured one way or the 
other (and possibly a negative test exercising the case where it does not).




  was (Author: jmhsieh):
We really push folks to use hbase.snapshot.enabled

nit:

Should we update this to describe hbase.snapshot.enabled?  (seems simpler)  
Also mention that you have data in the .snapshots dir (it might be strange if 
it works with the same config, but later does not).  At least if we mention 
the dir, users will have a pointer and an option to remove snapshots.
{code}
+  /**
+   * Throws an exception if snapshot operations (take a snapshot, restore, clone) are not supported.
+   * Called at the beginning of snapshot() and restoreSnapshot() methods.
+   * @throws UnsupportedOperationException if snapshot are not supported
+   */
+  public void checkSnapshotSupport() throws UnsupportedOperationException {
+    if (!this.isSnapshotSupported) {
+      throw new UnsupportedOperationException(
+        "To use snapshots, the HBase Master must have the proper archive cleaners enabled. " +
+        "You must add to the hbase-site.xml: " +
+        "'" + HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS + "' with '" +
+        HFileLinkCleaner.class.getName() + "', '" + SnapshotHFileCleaner.class.getName() +
+        "' support. And add '" + HConstants.HBASE_MASTER_LOGCLEANER_PLUGINS + "' with '" +
+        SnapshotLogCleaner.class.getName() + "' support.");
+    }
+  }
{code}

We should behave the same but emit a warning if folks are using the plugins and 
not the hbase.snapshot.enabled property.

We should have a test that verifies snapshots work if configured one way or the 
other (and possibly a negative test exercising the case where it does not).



  
 Check for snapshot file cleaners on start
 -

 Key: HBASE-7294
 URL: https://issues.apache.org/jira/browse/HBASE-7294
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jesse Yates
Assignee: Matteo Bertozzi
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7294-v1.patch, HBASE-7294-v2.patch, 
 HBASE-7294-v3.patch, HBASE-7294-v4.patch, HBASE-7294-v5.patch


 Snapshots currently use the SnapshotHFileCleaner and SnapshotHLogCleaner to 
 ensure that any hfiles or hlogs (respectively) that are currently part of a 
 snapshot are not removed from their respective archive directories (.archive 
 and .oldlogs).
 From Matteo Bertozzi:
 {quote}
 currently the snapshot cleaner is not in hbase-default.xml
 and there's no warning/exception on snapshot/restore operation, if not 
 enabled.
 even if we add the cleaner to the hbase-default.xml how do we ensure that the 
 user doesn't remove it?
 Do we want to hardcode the cleaner at master startup?
 Do we want to add a check in snapshot/restore that throws an exception if the 
 cleaner is not enabled?
 {quote}


[jira] [Commented] (HBASE-7352) clone operation from HBaseAdmin can hang forever.


[ 
https://issues.apache.org/jira/browse/HBASE-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539620#comment-13539620
 ] 

Jonathan Hsieh commented on HBASE-7352:
---

Thanks for the links.  

Shouldn't we throw an exception of some sort if we fail after the max number of 
tries right?
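
A bounded wait of the kind suggested here might look like the following sketch. The class and method names are hypothetical, the choice of TimeoutException is an assumption, and this is not the HBaseAdmin code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

/** Illustrative sketch only; not HBaseAdmin.cloneSnapshot. */
final class BoundedWaitSketch {
  /**
   * Polls the condition up to maxTries times, pausing between polls.
   * Returns the attempt number on which the condition held and throws
   * once the tries are exhausted, so the caller never waits forever.
   */
  static int waitUntil(Callable<Boolean> done, int maxTries, long pauseMs)
      throws Exception {
    for (int attempt = 1; attempt <= maxTries; attempt++) {
      if (done.call()) {
        return attempt;
      }
      Thread.sleep(pauseMs);
    }
    throw new TimeoutException("condition not met after " + maxTries + " tries");
  }
}
```

Throwing at the end turns a silent hang into a failure the caller can report, at the cost of having to pick a sensible maxTries and pause.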

 clone operation from HBaseAdmin can hang forever.
 -

 Key: HBASE-7352
 URL: https://issues.apache.org/jira/browse/HBASE-7352
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Reporter: Jonathan Hsieh
Assignee: Matteo Bertozzi
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7352-v0.patch


 Sometimes the clone operation from the hbase shell can hang.  The table has 
 been created (it shows up in the web ui), but does not have any entries in 
 META.
 There don't seem to be any clone, snapshot, enable or disable found in the 
 master's jstack.
 Here's a trace from the HBaseAdmin:
 {code}
 main prio=10 tid=0x7f782800d000 nid=0x25c waiting on condition 
 [0x7f782f9bf000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
 at java.lang.Thread.sleep(Native Method)
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.cloneSnapshot(HBaseAdmin.java:2413)
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.cloneSnapshot(HBaseAdmin.java:2393)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:465)
 at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:323)
 at 
 org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:69)
 at 
 org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:201)
 at org.jruby.ast.CallTwoArgNode.interpret(CallTwoArgNode.java:59)
 at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
 ... (more jruby stack) ... 
 {code}  


[jira] [Comment Edited] (HBASE-7352) clone operation from HBaseAdmin can hang forever.


[ 
https://issues.apache.org/jira/browse/HBASE-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539620#comment-13539620
 ] 

Jonathan Hsieh edited comment on HBASE-7352 at 12/26/12 6:01 PM:
-

Thanks for the links.  

Shouldn't we throw an exception of some sort if we fail after the max number of 
tries?

  was (Author: jmhsieh):
Thanks for the links.  

Shouldn't we throw an exception of some sort if we fail after the max number of 
tries right?
  
 clone operation from HBaseAdmin can hang forever.
 -

 Key: HBASE-7352
 URL: https://issues.apache.org/jira/browse/HBASE-7352
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Reporter: Jonathan Hsieh
Assignee: Matteo Bertozzi
 Fix For: hbase-6055, 0.96.0

 Attachments: HBASE-7352-v0.patch


 Sometimes the clone operation from the hbase shell can hang.  The table has 
 been created (it shows up in the web ui), but does not have any entries in 
 META.
 There don't seem to be any clone, snapshot, enable or disable found in the 
 master's jstack.
 Here's a trace from the HBaseAdmin:
 {code}
 main prio=10 tid=0x7f782800d000 nid=0x25c waiting on condition 
 [0x7f782f9bf000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
 at java.lang.Thread.sleep(Native Method)
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.cloneSnapshot(HBaseAdmin.java:2413)
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.cloneSnapshot(HBaseAdmin.java:2393)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:465)
 at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:323)
 at 
 org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:69)
 at 
 org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:201)
 at org.jruby.ast.CallTwoArgNode.interpret(CallTwoArgNode.java:59)
 at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:104)
 ... (more jruby stack) ... 
 {code}  


[jira] [Reopened] (HBASE-7433) org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: slave3.hadoop:60020


 [ 
https://issues.apache.org/jira/browse/HBASE-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-7433:
--


  org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 
 1 action: servers with issues: slave3.hadoop:60020
 ---

 Key: HBASE-7433
 URL: https://issues.apache.org/jira/browse/HBASE-7433
 Project: HBase
  Issue Type: Bug
  Components: Balancer, regionserver
Affects Versions: 0.94.1
 Environment: linux ,hbase0.94.1 ,hadoop1.0.3,zookeeper3.4.3
Reporter: gavin peng
Priority: Blocker
  Labels: balance, regionserver
   Original Estimate: 193h
  Remaining Estimate: 193h

 When I use the Kettle client to insert data into an HBase table with put 
 operations, there is no problem when the table has only one region. But there 
 is too much data, so I wanted load balancing and created many regions when I 
 created the table. This time a problem occurred; the exception is as follows:
  Problem inserting row into HBase: Failed 1 action: servers with issues: 
 slave2.hadoop:60020, 
  at 
 org.pentaho.di.trans.steps.hbaseoutput.HBaseOutput.processRow(HBaseOutput.java:316)
  at org.pentaho.di.trans.step.RunThread.run(RunThread.java:50)
  at java.lang.Thread.run(Unknown Source)
 Caused by: 
 org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
 action: servers with issues: slave2.hadoop:60020, 
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1601)
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1377)
 at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:916)
 at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:772)
 at org.apache.hadoop.hbase.client.HTable.put(HTable.java:747)
 at 
 org.pentaho.hbase.shim.common.CommonHBaseConnection.executeTargetTablePut(CommonHBaseConnection.java:732)
 at 
 org.pentaho.di.trans.steps.hbaseoutput.HBaseOutput.processRow(HBaseOutput.java:307)
 we know the client find the rowkey belong to the region of the table,but 
 can't connection the regionserver.but the regionserver is ok,but not running 
 gc
 so i can't understand the problem,please help me ,thanks


[jira] [Resolved] (HBASE-7433) org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: slave3.hadoop:60020


 [ 
https://issues.apache.org/jira/browse/HBASE-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-7433.
--

Resolution: Invalid



[jira] [Commented] (HBASE-7433) org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: servers with issues: slave3.hadoop:60020


[ 
https://issues.apache.org/jira/browse/HBASE-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539622#comment-13539622
 ] 

Lars Hofhansl commented on HBASE-7433:
--

Marked as invalid. Thanks for checking Gavin.



[jira] [Commented] (HBASE-7437) Improve CompactSelection


[ 
https://issues.apache.org/jira/browse/HBASE-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539631#comment-13539631
 ] 

stack commented on HBASE-7437:
--

[~ted_yu] You cannot expect Hiroshi to have his patch hit a moving target.  The 
work in HBASE-7055 is not committed and is therefore out of scope for Hiroshi's 
work.

[~ikeda] Thank you for doing these code read-throughs.  You are finding great 
stuff. I have put review comments up on rb.

 Improve CompactSelection
 

 Key: HBASE-7437
 URL: https://issues.apache.org/jira/browse/HBASE-7437
 Project: HBase
  Issue Type: Improvement
  Components: Compaction
Reporter: Hiroshi Ikeda
Assignee: Hiroshi Ikeda
Priority: Minor
 Attachments: HBASE-7437.patch


 1. Using AtomicLong makes CompactSelection simpler and improves its performance.
 2. There are unused fields and methods.
 3. The fields should be private.
 4. The assertion in the method finishRequest seems wrong:
 {code}
   public void finishRequest() {
     if (isOffPeakCompaction) {
       long newValueToLog = -1;
       synchronized(compactionCountLock) {
         assert !isOffPeakCompaction : "Double-counting off-peak count for compaction";
 {code}
 The above assertion seems to be false almost every time it is evaluated, since 
 it asserts !isOffPeakCompaction inside a branch that is taken only when 
 isOffPeakCompaction is true.
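
As a rough illustration of item 1, the following is a minimal sketch of a lock-free counter with an upper limit, the kind of bookkeeping that a synchronized block could be replaced with. It is not taken from the attached patch: the class OffPeakCounter, its methods, and the limit parameter are hypothetical names chosen for the example. Only java.util.concurrent.atomic.AtomicLong is a real API.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the off-peak bookkeeping; not the real CompactSelection.
final class OffPeakCounter {
  private final AtomicLong count = new AtomicLong();
  private final long limit;

  OffPeakCounter(long limit) {
    this.limit = limit;
  }

  // Try to reserve one off-peak slot; returns false once the limit is reached.
  boolean tryAcquire() {
    while (true) {
      long current = count.get();
      if (current >= limit) {
        return false;
      }
      if (count.compareAndSet(current, current + 1)) {
        return true;
      }
      // Another thread changed the count in between; retry with the fresh value.
    }
  }

  // Give back a previously reserved slot and return the new count.
  long release() {
    return count.decrementAndGet();
  }

  public static void main(String[] args) {
    OffPeakCounter counter = new OffPeakCounter(1);
    System.out.println(counter.tryAcquire());
    System.out.println(counter.tryAcquire());
    System.out.println(counter.release());
  }
}
```

The compare-and-set loop keeps the check against the limit and the increment atomic without holding a lock, which is the property the synchronized block was providing.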


[jira] [Updated] (HBASE-7423) HFileArchiver should not use the configuration from the Filesystem


 [ 
https://issues.apache.org/jira/browse/HBASE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-7423:
-

Status: Patch Available  (was: Open)

 HFileArchiver should not use the configuration from the Filesystem
 --

 Key: HBASE-7423
 URL: https://issues.apache.org/jira/browse/HBASE-7423
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0, 0.94.4
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-7423_v1.patch, hbase-7423_v2.patch, 
 hbase-7423_v2.patch


 HFileArchiver gets the configuration from the FileSystem in archiveRegion:
 {code}
   public static void archiveRegion(FileSystem fs, HRegionInfo info)
       throws IOException {
     Path rootDir = FSUtils.getRootDir(fs.getConf());
 {code}
 In Pig's test cases, they construct a MiniDFSCluster and pass it to 
 HBaseTestingUtil, which causes table deletion to fail because HFileArchiver 
 refers to the FileSystem's configuration rather than HBase's. 
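
The failure mode can be illustrated without Hadoop on the classpath. In the sketch below, plain Maps stand in for Configuration objects, and RootDirLookup, rootDirOf, and the fallback path are hypothetical names and values for this example only (this is not FSUtils.getRootDir). The key hbase.rootdir is the usual HBase root directory property. The point is only that a configuration created by the file system side may not carry HBase settings, so the lookup silently falls back to a different directory.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; Maps stand in for Hadoop Configuration objects.
final class RootDirLookup {
  static final String ROOT_DIR_KEY = "hbase.rootdir";
  // Placeholder fallback for the example, not the real HBase default.
  static final String FALLBACK_ROOT_DIR = "/fallback/hbase";

  // Resolve the root directory from whichever configuration is supplied.
  static String rootDirOf(Map<String, String> conf) {
    return conf.getOrDefault(ROOT_DIR_KEY, FALLBACK_ROOT_DIR);
  }

  public static void main(String[] args) {
    // Configuration as the DFS mini cluster created it: no HBase keys.
    Map<String, String> fsConf = new HashMap<>();
    // HBase's own configuration, which knows where the tables live.
    Map<String, String> hbaseConf = new HashMap<>(fsConf);
    hbaseConf.put(ROOT_DIR_KEY, "hdfs://localhost/hbase-test");

    System.out.println(rootDirOf(fsConf));    // resolves to the fallback
    System.out.println(rootDirOf(hbaseConf)); // resolves to the test root dir
  }
}
```

Passing the HBase configuration explicitly, instead of deriving it from the FileSystem, avoids the mismatch.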


[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

[
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539640#comment-13539640
]

stack commented on HBASE-7055:
--

[~sershe] The community rules here --
http://hbase.apache.org/book.html#decisions -- say two +1s... they don't say
two +1s and whatever stack says (smile). I was going to suggest that you get
at least a +1 from one of the compaction component owners, but it seems no
one has volunteered for this role (I suppose we made the role as we went
through this issue).

Yeah, in general, I'm trying to work on 0.96 issues. I think it is kind of
critical that the project has a 0.96 soon. It has been too long since our last
major version.

My high-level concern with this issue, still, is that it is a complex feature
-- not only in operation but in configuration and in interpretation of its
workings -- and complexity is an attribute we could do with less of, not more.
The justification for including this feature seems weak as currently written.
The release note currently starts "A tier based compaction algorithm has been
developed", as though the fact that the feature was written is enough to
justify its inclusion, and the design document starts "The goal of the
compaction selection algorithm is to schedule compactions efficiently", but
there are no 'fit criteria' included in the document to prove the mechanism is
working 'efficiently'.

Let me review the patch up on rb.

port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice
(not configurable by cf or dynamically)
-

Key: HBASE-7055
URL: https://issues.apache.org/jira/browse/HBASE-7055
Project: HBase
Issue Type: Task
Components: Compaction
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.96.0

Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch,
HBASE-6371-v3-refactor-only-squashed.patch,
HBASE-6371-v4-refactor-only-squashed.patch,
HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch,
HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch,
HBASE-7055-v4.patch

There's divergence in the code :(
See HBASE-6371 for details.

[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)


[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539641#comment-13539641
 ] 

stack commented on HBASE-7055:
--

Oh, one thing, thanks for working on this [~sershe]



[jira] [Assigned] (HBASE-7188) Move classes into hbase-client


 [ 
https://issues.apache.org/jira/browse/HBASE-7188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark reassigned HBASE-7188:


Assignee: Elliott Clark

 Move classes into hbase-client
 --

 Key: HBASE-7188
 URL: https://issues.apache.org/jira/browse/HBASE-7188
 Project: HBase
  Issue Type: Sub-task
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.96.0





[jira] [Updated] (HBASE-7428) There are multiple HConstants for configuring Zookeeper timeout

2012-12-26 Thread Nick Dimiduk (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-7428:


Attachment: 7428-consolidate-ZK_SESSION_TIMEOUT.0.diff

This patch consolidates on {{ZK_SESSION_TIMEOUT}}, replaces string literals and 
default value literals with references to {{HConstants}}.

After applying this patch, the only remaining occurrences of 
{{zookeeper.session.timeout}} are the constant definition and the default 
configuration entry:

{code}
$ grep -riIn 'zookeeper\.session\.timeout' */src/*
hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java:173:  public static final String ZK_SESSION_TIMEOUT = "zookeeper.session.timeout";
hbase-server/src/main/resources/hbase-default.xml:619:    <name>zookeeper.session.timeout</name>
{code}
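
For readers unfamiliar with the pattern, here is a minimal sketch of what consolidating on a single constant looks like from the calling side. It does not reproduce the attached diff: ZkTimeoutExample and sessionTimeout are hypothetical names, java.util.Properties stands in for the HBase Configuration class, and the default value shown is an assumption for the example.

```java
import java.util.Properties;

// Hypothetical illustration of reading a setting through one shared constant.
final class ZkTimeoutExample {
  // The single key that remains after consolidation.
  static final String ZK_SESSION_TIMEOUT = "zookeeper.session.timeout";
  // Assumed default for this sketch, in milliseconds.
  static final int DEFAULT_ZK_SESSION_TIMEOUT = 180 * 1000;

  // Callers reference the constants instead of repeating string and number literals.
  static int sessionTimeout(Properties conf) {
    String raw = conf.getProperty(ZK_SESSION_TIMEOUT);
    return raw == null ? DEFAULT_ZK_SESSION_TIMEOUT : Integer.parseInt(raw.trim());
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(sessionTimeout(conf)); // falls back to the default
    conf.setProperty(ZK_SESSION_TIMEOUT, "60000");
    System.out.println(sessionTimeout(conf)); // uses the configured value
  }
}
```

With one constant for the key and one for the default, a grep for the literal finds only the definition, which is what the output above shows.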

 There are multiple HConstants for configuring Zookeeper timeout
 ---

 Key: HBASE-7428
 URL: https://issues.apache.org/jira/browse/HBASE-7428
 Project: HBase
  Issue Type: Improvement
Reporter: Nick Dimiduk
Priority: Minor
  Labels: noob
 Attachments: 7428-consolidate-ZK_SESSION_TIMEOUT.0.diff


 From [~te...@apache.org] to dev@:
 {quote}
 There are two constants with the same value:
 HConstants.ZOOKEEPER_SESSION_TIMEOUT and HConstants.ZK_SESSION_TIMEOUT
 HConstants.ZOOKEEPER_SESSION_TIMEOUT is only used in tests.
 HConstants.ZK_SESSION_TIMEOUT is used by ZKUtil
 Shall we remove HConstants.ZOOKEEPER_SESSION_TIMEOUT and let tests use
 HConstants.ZK_SESSION_TIMEOUT ?
 {quote}


[jira] [Updated] (HBASE-7428) There are multiple HConstants for configuring Zookeeper timeout

2012-12-26 Thread Nick Dimiduk (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-7428:


Fix Version/s: 0.96.0
 Assignee: Nick Dimiduk
   Status: Patch Available  (was: Open)

 There are multiple HConstants for configuring Zookeeper timeout
 ---

 Key: HBASE-7428
 URL: https://issues.apache.org/jira/browse/HBASE-7428
 Project: HBase
  Issue Type: Improvement
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
  Labels: noob
 Fix For: 0.96.0

 Attachments: 7428-consolidate-ZK_SESSION_TIMEOUT.0.diff


 From [~te...@apache.org] to dev@:
 {quote}
 There are two constants with the same value:
 HConstants.ZOOKEEPER_SESSION_TIMEOUT and HConstants.ZK_SESSION_TIMEOUT
 HConstants.ZOOKEEPER_SESSION_TIMEOUT is only used in tests.
 HConstants.ZK_SESSION_TIMEOUT is used by ZKUtil
 Shall we remove HConstants.ZOOKEEPER_SESSION_TIMEOUT and let tests use
 HConstants.ZK_SESSION_TIMEOUT ?
 {quote}


[jira] [Commented] (HBASE-7412) Fix how HTableDescriptor handles default max file size and flush size


[ 
https://issues.apache.org/jira/browse/HBASE-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539648#comment-13539648
 ] 

Elliott Clark commented on HBASE-7412:
--

* Should we just make HTableDescriptor take in a Conf in the constructor?  
Then everything that relies on region.getTableDescriptor won't need any code 
changes.
* It looks like this will cause a few javadoc errors (Hadoop QA missed them, 
but I think it would catch them now).  
* I don't think that the forced unboxing of Long is needed for the two 
return Long.valueOf statements.

 Fix how HTableDescriptor handles default max file size and flush size
 -

 Key: HBASE-7412
 URL: https://issues.apache.org/jira/browse/HBASE-7412
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: 0.94-7412.patch, trunk-7412.patch


 If the region flush size is not set in the table, 
 IncreasingToUpperBoundRegionSplitPolicy will most likely always use the 
 default value: 128MB, even if the flush size is set to a different value in 
 hbase-site.xml.
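
The intended lookup order can be sketched as follows. This is a hedged illustration, not the attached patch: FlushSizeResolver, resolve, and the UNSET marker are hypothetical, and a Map stands in for the site configuration. The 128MB default comes from the description above, and hbase.hregion.memstore.flush.size is the usual flush size property.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of table value -> site configuration -> built-in default.
final class FlushSizeResolver {
  static final String FLUSH_SIZE_KEY = "hbase.hregion.memstore.flush.size";
  static final long DEFAULT_FLUSH_SIZE = 128L * 1024 * 1024; // 128MB, per the report
  static final long UNSET = -1L; // assumed marker for "not set on the table"

  static long resolve(long tableFlushSize, Map<String, Long> siteConf) {
    if (tableFlushSize != UNSET) {
      return tableFlushSize; // an explicit per-table setting wins
    }
    Long configured = siteConf.get(FLUSH_SIZE_KEY);
    // Only fall back to the built-in default if the site configuration is silent too.
    return configured != null ? configured : DEFAULT_FLUSH_SIZE;
  }

  public static void main(String[] args) {
    Map<String, Long> siteConf = new HashMap<>();
    siteConf.put(FLUSH_SIZE_KEY, 256L * 1024 * 1024);
    System.out.println(resolve(UNSET, siteConf));
    System.out.println(resolve(64L * 1024 * 1024, siteConf));
    System.out.println(resolve(UNSET, new HashMap<>()));
  }
}
```

The bug described above corresponds to skipping the middle step, so that an unset table value goes straight to the built-in default.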


[jira] [Commented] (HBASE-7428) There are multiple HConstants for configuring Zookeeper timeout


[ 
https://issues.apache.org/jira/browse/HBASE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539649#comment-13539649
 ] 

Ted Yu commented on HBASE-7428:
---

Patch looks good. 



[jira] [Commented] (HBASE-7301) Force ipv4 for unit tests


[ 
https://issues.apache.org/jira/browse/HBASE-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539650#comment-13539650
 ] 

Enis Soztutar commented on HBASE-7301:
--

@lars, which tests still fail for you? Do they time out? 

 Force ipv4 for unit tests
 -

 Key: HBASE-7301
 URL: https://issues.apache.org/jira/browse/HBASE-7301
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Jean-Daniel Cryans
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7301.094.txt, HBASE-7301.patch


 These two tests are failing when I run them locally:
 Failed tests:
   testMultiSlaveReplication(org.apache.hadoop.hbase.replication.TestMultiSlaveReplication): Waited too much time for put replication
   testCyclicReplication(org.apache.hadoop.hbase.replication.TestMasterReplication): Waited too much time for put replication
   testSimplePutDelete(org.apache.hadoop.hbase.replication.TestMasterReplication): Waited too much time for put replication
 The TestMasterReplication is NPE'ing
 Mighty JD said he'd take a looksee.


[jira] [Commented] (HBASE-7428) There are multiple HConstants for configuring Zookeeper timeout


[ 
https://issues.apache.org/jira/browse/HBASE-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539656#comment-13539656
 ] 

Jonathan Hsieh commented on HBASE-7428:
---

+1 lgtm.  let the robot compile it and then commit.



[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)


[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539660#comment-13539660
 ] 

stack commented on HBASE-7055:
--

I did a quick skim of the patch.  There's a few questions up on rb.



[jira] [Commented] (HBASE-7301) Force ipv4 for unit tests


[ 
https://issues.apache.org/jira/browse/HBASE-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539662#comment-13539662
 ] 

Lars Hofhansl commented on HBASE-7301:
--

I think this was a dud. Sorry for the noise.



[jira] [Created] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops

Lars Hofhansl created HBASE-7438:


 Summary: TestSplitTransactionOnCluster has too many infinite loops
 Key: HBASE-7438
 URL: https://issues.apache.org/jira/browse/HBASE-7438
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.4


There are many cases in this test where we loop until a condition holds. If 
that condition never occurs we'll wait forever, and the test will time out 
instead of failing.
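
One common way to bound such loops is a small polling helper with a deadline. The sketch below is an illustration under stated assumptions, not the attached patch: BoundedWait and waitFor are hypothetical names, and it uses java.util.function.BooleanSupplier, which requires Java 8 or later.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: poll a condition, but give up after a deadline.
final class BoundedWait {
  // Returns true if the condition held before the timeout elapsed, false otherwise.
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out; the caller can fail the test with a clear message
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    boolean ok = waitFor(1000, 10, () -> System.currentTimeMillis() - start >= 50);
    if (!ok) {
      throw new AssertionError("condition not met within timeout");
    }
    System.out.println("condition met");
  }
}
```

A test would then assert on the return value, so a condition that never occurs produces an immediate, descriptive failure rather than a suite-level timeout.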


[jira] [Updated] (HBASE-7438) TestSplitTransactionOnCluster has too many infinite loops