[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418738#comment-13418738
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

Thanks for your explanation.

Have you seen the test failure that I described above @ 19/Jul/12 03:34 ?

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch, 
 HBASE-6389_trunk.patch


 Continuing from HBASE-6375.
 It seems I was mistaken in my assumption that raising the value of
 hbase.master.wait.on.regionservers.mintostart to a sufficient number (from
 the default of 1) can help prevent assignment of all regions to one (or a
 small number of) region server(s).
 While this was the case in 0.90.x and 0.92.x, the behavior changed from
 0.94.0 onwards to address HBASE-4993.
 From 0.94.0 onwards, the Master proceeds as soon as the timeout has
 lapsed, even if hbase.master.wait.on.regionservers.mintostart has not
 been reached.
 Reading the current conditions of waitForRegionServers() makes this clear:
 {code:title=ServerManager.java (trunk rev:1360470)}
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
   throws InterruptedException {
 ..
     while (
       !this.master.isStopped() &&
         slept < timeout &&
         count < maxToStart &&
         (lastCountChange+interval > now || count < minToStart)
       ){
 ..
 {code}
 So with the current conditions, the wait ends as soon as the timeout is
 reached, even if fewer RSes than required have checked in with the Master,
 and the Master proceeds with region assignment among those RSes alone.
 As mentioned in
 -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
 and I concur, this could have a disastrous effect in a large cluster,
 especially now that MSLAB is turned on.
 To enforce the required quorum as specified by
 hbase.master.wait.on.regionservers.mintostart irrespective of the timeout,
 these conditions need to be modified as follows:
 {code:title=ServerManager.java}
 ..
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time AND
    *   the 'hbase.master.wait.on.regionservers.timeout' is reached
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
 ..
 ..
     int minToStart = this.master.getConfiguration().
       getInt("hbase.master.wait.on.regionservers.mintostart", 1);
     int maxToStart = this.master.getConfiguration().
       getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
     if (maxToStart < minToStart) {
       maxToStart = minToStart;
     }
 ..
 ..
     while (
       !this.master.isStopped() &&
         count < maxToStart &&
         (lastCountChange+interval > now || timeout > slept || count < minToStart)
       ){
 ..
 {code}
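
 To make the difference between the two loop conditions concrete, here is a
 minimal standalone sketch. It is not the ServerManager code: the class name
 WaitPredicateSketch, the two method names, and all the numbers in main() are
 illustrative assumptions; only the boolean expressions mirror the conditions
 quoted above.

```java
public class WaitPredicateSketch {

    /** Condition quoted from trunk rev 1360470: the loop keeps waiting while this is true. */
    public static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout,
            int count, int minToStart, int maxToStart,
            long lastCountChange, long interval, long now) {
        return !stopped
            && slept < timeout
            && count < maxToStart
            && (lastCountChange + interval > now || count < minToStart);
    }

    /** Condition proposed in this issue: the timeout alone can no longer end the wait. */
    public static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
            int count, int minToStart, int maxToStart,
            long lastCountChange, long interval, long now) {
        return !stopped
            && count < maxToStart
            && (lastCountChange + interval > now || timeout > slept || count < minToStart);
    }

    public static void main(String[] args) {
        // Hypothetical scenario: the timeout has lapsed, no new region server
        // has checked in recently, and only 1 of the 3 required has reported.
        long timeout = 4500, interval = 1500;
        long slept = 5000, lastCountChange = 6000, now = 10000;
        int count = 1, minToStart = 3, maxToStart = Integer.MAX_VALUE;

        System.out.println("current keeps waiting:  " + keepWaitingCurrent(
            false, slept, timeout, count, minToStart, maxToStart,
            lastCountChange, interval, now));
        System.out.println("proposed keeps waiting: " + keepWaitingProposed(
            false, slept, timeout, count, minToStart, maxToStart,
            lastCountChange, interval, now));
    }
}
```

 In this scenario the current condition stops waiting with a single region
 server checked in, while the proposed one keeps waiting until count reaches
 minToStart.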

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6389:
--

Attachment: org.apache.hadoop.hbase.TestZooKeeper-output.txt

Here was the test output from yesterday.





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418752#comment-13418752
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

Looking at https://builds.apache.org/job/PreCommit-HBASE-Build/2406/console,
a test was still hanging, although I wasn't able to find which one.





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418866#comment-13418866
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

I ran test suite with latest patch on trunk and got:
{code}
Running org.apache.hadoop.hbase.client.TestHCM
Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 37.265 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.client.TestAdmin
Tests run: 40, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 322.872 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 134.193 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Tests run: 20, Failures: 5, Errors: 2, Skipped: 0, Time elapsed: 669.588 sec <<< FAILURE!
{code}
There was one hanging test:
{code}
at org.apache.hadoop.hbase.replication.TestReplication.setUp(TestReplication.java:183)
{code}

BTW what do R sub i, C and F sub i represent in the formula above ?


[jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6389:
--

Attachment: testReplication.jstack

jstack for the hanging TestReplication





[jira] [Comment Edited] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418866#comment-13418866
 ] 

Zhihong Ted Yu edited comment on HBASE-6389 at 7/20/12 1:37 AM:


I ran test suite with latest patch on trunk and got:
{code}
Running org.apache.hadoop.hbase.client.TestHCM
Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 37.265 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.client.TestAdmin
Tests run: 40, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 322.872 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 134.193 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Tests run: 20, Failures: 5, Errors: 2, Skipped: 0, Time elapsed: 669.588 sec <<< FAILURE!
{code}
There was one hanging test:
{code}
at org.apache.hadoop.hbase.replication.TestReplication.setUp(TestReplication.java:183)
{code}

BTW what do R~i~, C and F~i~ represent in the formula above ?


[jira] [Comment Edited] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418866#comment-13418866
 ] 

Zhihong Ted Yu edited comment on HBASE-6389 at 7/20/12 1:41 AM:


I ran test suite with latest patch on trunk and got:
{code}
Running org.apache.hadoop.hbase.client.TestHCM
Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 37.265 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.client.TestAdmin
Tests run: 40, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 322.872 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 134.193 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Tests run: 20, Failures: 5, Errors: 2, Skipped: 0, Time elapsed: 669.588 sec <<< FAILURE!
{code}
There was one hanging test:
{code}
at org.apache.hadoop.hbase.replication.TestReplication.setUp(TestReplication.java:183)
{code}

BTW what do *R*~i~, C and *F*~i~ represent in the formula above ?


[jira] [Comment Edited] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418866#comment-13418866
 ] 

Zhihong Ted Yu edited comment on HBASE-6389 at 7/20/12 2:53 AM:


I ran test suite with latest patch on trunk and got:
{code}
Failed tests:
  testRunThriftServer[12](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): expected:<1> but was:<0>
  testRunThriftServer[14](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): expected:<1> but was:<0>
  testRunThriftServer[15](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): expected:<1> but was:<0>
  testRunThriftServer[16](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): expected:<1> but was:<0>
  testRunThriftServer[17](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): expected:<1> but was:<0>

Tests in error:
  testRegionCaching(org.apache.hadoop.hbase.client.TestHCM): org.apache.hadoop.hbase.UnknownRegionException: bd992463917ba68fe5389c5bf9e94a3a
  testCloseRegionThatFetchesTheHRIFromMeta(org.apache.hadoop.hbase.client.TestAdmin): -1
  testTableExists(org.apache.hadoop.hbase.catalog.TestMetaReaderEditor): org.apache.hadoop.hbase.TableNotEnabledException: testTableExists
  testRunThriftServer[11](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): test timed out after 6 milliseconds
  testRunThriftServer[13](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine): test timed out after 6 milliseconds
{code}
There was one hanging test:
{code}
at org.apache.hadoop.hbase.replication.TestReplication.setUp(TestReplication.java:183)
{code}

BTW what do *R*~i~, C and *F*~i~ represent in the formula above ?

  was (Author: zhi...@ebaysf.com):
I ran test suite with latest patch on trunk and got:
{code}
Running org.apache.hadoop.hbase.client.TestHCM
Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 37.265 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.client.TestAdmin
Tests run: 40, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 322.872 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 134.193 sec <<< FAILURE!
--
Running org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Tests run: 20, Failures: 5, Errors: 2, Skipped: 0, Time elapsed: 669.588 sec <<< FAILURE!
{code}
There was one hanging test:
{code}
at org.apache.hadoop.hbase.replication.TestReplication.setUp(TestReplication.java:183)
{code}

BTW what do *R*~i~, C and *F*~i~ represent in the formula above ?
  
 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch, 
 HBASE-6389_trunk.patch, org.apache.hadoop.hbase.TestZooKeeper-output.txt, 
 testReplication.jstack


 Continuing from HBASE-6375.
 It seems I was mistaken in my assumption that changing the value of 
 hbase.master.wait.on.regionservers.mintostart to a sufficient number (from 
 default of 1) can help prevent assignment of all regions to one (or a small 
 number of) region server(s).
 While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
 0.94.0 onwards to address HBASE-4993.
 From 0.94.0 onwards, Master will proceed immediately after the timeout has 
 lapsed, even if hbase.master.wait.on.regionservers.mintostart has not 
 been reached.
 Reading the current conditions of waitForRegionServers() clarifies this:
 {code:title=ServerManager.java (trunk rev:1360470)}
 ..
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
   throws InterruptedException {
 ..
     while (
       !this.master.isStopped() &&

[jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-19 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6389:
--

Status: Open  (was: Patch Available)

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch, 
 HBASE-6389_trunk.patch, org.apache.hadoop.hbase.TestZooKeeper-output.txt, 
 testReplication.jstack


 Continuing from HBASE-6375.
 It seems I was mistaken in my assumption that changing the value of 
 hbase.master.wait.on.regionservers.mintostart to a sufficient number (from 
 default of 1) can help prevent assignment of all regions to one (or a small 
 number of) region server(s).
 While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
 0.94.0 onwards to address HBASE-4993.
 From 0.94.0 onwards, Master will proceed immediately after the timeout has 
 lapsed, even if hbase.master.wait.on.regionservers.mintostart has not 
 been reached.
 Reading the current conditions of waitForRegionServers() clarifies this:
 {code:title=ServerManager.java (trunk rev:1360470)}
 ..
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
   throws InterruptedException {
 ..
     while (
       !this.master.isStopped() &&
         slept < timeout &&
         count < maxToStart &&
         (lastCountChange+interval > now || count < minToStart)
       ){
 ..
 {code}
 So with the current conditions, the wait will end as soon as the timeout is 
 reached, even if fewer RSes than required have checked in with the Master, and 
 the master will proceed with region assignment among these RSes alone.
 As mentioned in 
 -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
  and I concur, this could have a disastrous effect on a large cluster, especially 
 now that MSLAB is turned on.
 To enforce the required quorum as specified by 
 hbase.master.wait.on.regionservers.mintostart irrespective of the timeout, 
 these conditions need to be modified as follows:
 {code:title=ServerManager.java}
 ..
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time AND
    *   the 'hbase.master.wait.on.regionservers.timeout' is reached
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
 ..
 ..
     int minToStart = this.master.getConfiguration().
     getInt("hbase.master.wait.on.regionservers.mintostart", 1);
     int maxToStart = this.master.getConfiguration().
     getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
     if (maxToStart < minToStart) {
       maxToStart = minToStart;
     }
 ..
 ..
     while (
       !this.master.isStopped() &&
         count < maxToStart &&
         (lastCountChange+interval > now || timeout > slept || count < minToStart)
       ){
 ..
 {code}
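
 To make the difference between the two loop conditions easier to check, the following sketch extracts them into pure predicates. The class and method names (WaitConditionSketch, keepWaitingCurrent, keepWaitingProposed) and the sample values are hypothetical and only mirror the boolean logic quoted above; this is not the ServerManager code itself.

```java
// Hypothetical, standalone illustration of the two loop conditions quoted above.
final class WaitConditionSketch {

  /** Condition as quoted from trunk: the loop exits as soon as the timeout lapses. */
  static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  /** Proposed condition: the timeout alone can no longer end the wait. */
  static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && count < maxToStart
        && (lastCountChange + interval > now || timeout > slept || count < minToStart);
  }

  public static void main(String[] args) {
    // Assumed scenario: 1 region server checked in, 5 required, timeout already lapsed.
    long timeout = 4500, slept = 5000, interval = 1500;
    long lastCountChange = 5000, now = 10000;
    int count = 1, minToStart = 5, maxToStart = Integer.MAX_VALUE;

    System.out.println("current keeps waiting:  " + keepWaitingCurrent(
        false, slept, timeout, count, minToStart, maxToStart, lastCountChange, interval, now));
    System.out.println("proposed keeps waiting: " + keepWaitingProposed(
        false, slept, timeout, count, minToStart, maxToStart, lastCountChange, interval, now));
  }
}
```

 In this assumed scenario the current condition stops waiting with a single region server, while the proposed one keeps waiting until mintostart is met.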

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3725) HBase increments from old value after delete and write to disk

2012-07-19 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418904#comment-13418904
 ] 

Zhihong Ted Yu commented on HBASE-3725:
---

Looking at existing code:
{code}
  private List<KeyValue> getLastIncrement(final Get get) throws IOException {
    InternalScan iscan = new InternalScan(get);
{code}
iscan was assigned at the beginning. Looks like the assignment in else block is 
redundant.

TestHRegion#testIncrementWithFlushAndDelete passed without that assignment.

 HBase increments from old value after delete and write to disk
 --

 Key: HBASE-3725
 URL: https://issues.apache.org/jira/browse/HBASE-3725
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.90.1
Reporter: Nathaniel Cook
Assignee: Jonathan Gray
 Attachments: HBASE-3725-0.92-V1.patch, HBASE-3725-0.92-V2.patch, 
 HBASE-3725-0.92-V3.patch, HBASE-3725-0.92-V4.patch, HBASE-3725-0.92-V5.patch, 
 HBASE-3725-Test-v1.patch, HBASE-3725-v3.patch, HBASE-3725.patch


 Deleted row values are sometimes used for starting points on new increments.
 To reproduce:
 Create a row r. Set column x to some default value.
 Force hbase to write that value to the file system (such as restarting the 
 cluster).
 Delete the row.
 Call table.incrementColumnValue with some_value
 Get the row.
 The returned value in the column was incremented from the old value before 
 the row was deleted instead of being initialized to some_value.
 Code to reproduce:
 {code}
 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.HColumnDescriptor;
 import org.apache.hadoop.hbase.HTableDescriptor;
 import org.apache.hadoop.hbase.client.Delete;
 import org.apache.hadoop.hbase.client.Get;
 import org.apache.hadoop.hbase.client.HBaseAdmin;
 import org.apache.hadoop.hbase.client.HTableInterface;
 import org.apache.hadoop.hbase.client.HTablePool;
 import org.apache.hadoop.hbase.client.Increment;
 import org.apache.hadoop.hbase.client.Result;
 import org.apache.hadoop.hbase.util.Bytes;
 public class HBaseTestIncrement
 {
   static String tableName  = "testIncrement";
   static byte[] infoCF = Bytes.toBytes("info");
   static byte[] rowKey = Bytes.toBytes("test-rowKey");
   static byte[] newInc = Bytes.toBytes("new");
   static byte[] oldInc = Bytes.toBytes("old");
   /**
* This code reproduces a bug with increment column values in hbase
* Usage: First run part one by passing '1' as the first arg
*Then restart the hbase cluster so it writes everything to disk
*Run part two by passing '2' as the first arg
*
* This will result in the old deleted data being found and used for
* the increment calls
*
* @param args
* @throws IOException
*/
   public static void main(String[] args) throws IOException
   {
   if("1".equals(args[0]))
   partOne();
   if("2".equals(args[0]))
   partTwo();
   if ("both".equals(args[0]))
   {
   partOne();
   partTwo();
   }
   }
   /**
* Creates a table and increments a column value 10 times by 10 each time.
* Results in a value of 100 for the column
*
* @throws IOException
*/
   static void partOne()throws IOException
   {
   Configuration conf = HBaseConfiguration.create();
   HBaseAdmin admin = new HBaseAdmin(conf);
   HTableDescriptor tableDesc = new HTableDescriptor(tableName);
   tableDesc.addFamily(new HColumnDescriptor(infoCF));
   if(admin.tableExists(tableName))
   {
   admin.disableTable(tableName);
   admin.deleteTable(tableName);
   }
   admin.createTable(tableDesc);
   HTablePool pool = new HTablePool(conf, Integer.MAX_VALUE);
   HTableInterface table = pool.getTable(Bytes.toBytes(tableName));
   //Increment uninitialized column
   for (int j = 0; j < 10; j++)
   {
   table.incrementColumnValue(rowKey, infoCF, oldInc, (long)10);
   Increment inc = new Increment(rowKey);
   inc.addColumn(infoCF, newInc, (long)10);
   table.increment(inc);
   }
   Get get = new Get(rowKey);
   Result r = table.get(get);
   System.out.println("initial values: new " + 
 Bytes.toLong(r.getValue(infoCF, newInc)) + " old " + 
 

[jira] [Resolved] (HBASE-6345) Utilize fault injection in testing using AspectJ

2012-07-19 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu resolved HBASE-6345.
---

Resolution: Won't Fix

There was not enough incentive to pursue fault injection using AspectJ.

 Utilize fault injection in testing using AspectJ
 

 Key: HBASE-6345
 URL: https://issues.apache.org/jira/browse/HBASE-6345
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu

 HDFS uses fault injection to test pipeline failure in addition to mock, spy. 
 HBase uses mock, spy. But there are cases where mock, spy aren't convenient.
 An example from DFSClientAspects.aj:
 {code}
   pointcut pipelineInitNonAppend(DataStreamer datastreamer):
     callCreateBlockOutputStream(datastreamer)
     && cflow(execution(* nextBlockOutputStream(..)))
     && within(DataStreamer);
   after(DataStreamer datastreamer) returning : pipelineInitNonAppend(datastreamer) {
     LOG.info("FI: after pipelineInitNonAppend: hasError="
         + datastreamer.hasError + " errorIndex=" + datastreamer.errorIndex);
     if (datastreamer.hasError) {
       DataTransferTest dtTest = DataTransferTestUtil.getDataTransferTest();
       if (dtTest != null)
         dtTest.fiPipelineInitErrorNonAppend.run(datastreamer.errorIndex);
     }
   }
 {code}





[jira] [Assigned] (HBASE-4255) Expose CatalogJanitor controls

2012-07-18 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu reassigned HBASE-4255:
-

Assignee: Devaraj Das

 Expose CatalogJanitor controls
 --

 Key: HBASE-4255
 URL: https://issues.apache.org/jira/browse/HBASE-4255
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: 4255-4.2.patch


 When doing surgery or other operational tasks, it's nice to be able to have 
 the .META. table quickly cleaned of split parents. The CatalogJanitor already 
 has controls baked in (currently used in unit tests), I think we should 
 expose this the same way we do with the balancer, that is:
  - start
  - stop
  - request a run
 A client would need to go through HBaseAdmin, and shell commands need to be 
 created.
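
 As a rough illustration of the three controls listed above (start, stop, request a run), here is a minimal sketch. Everything in it is hypothetical: the class JanitorControlSketch, its method names, and the IntSupplier standing in for one cleanup pass are assumptions made for illustration, not the CatalogJanitor or HBaseAdmin API.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntSupplier;

// Hypothetical control surface; names and return conventions are assumptions.
final class JanitorControlSketch {
  private final AtomicBoolean enabled = new AtomicBoolean(true);
  private final IntSupplier scan; // stands in for one cleanup pass, returns entries cleaned

  JanitorControlSketch(IntSupplier scan) {
    this.scan = scan;
  }

  /** start/stop: flips the switch and returns the previous state. */
  boolean setEnabled(boolean on) {
    return enabled.getAndSet(on);
  }

  boolean isEnabled() {
    return enabled.get();
  }

  /** request a run: returns the number of entries cleaned, or -1 if stopped. */
  int requestRun() {
    return enabled.get() ? scan.getAsInt() : -1;
  }

  public static void main(String[] args) {
    JanitorControlSketch janitor = new JanitorControlSketch(() -> 3);
    System.out.println("cleaned: " + janitor.requestRun());
    System.out.println("was enabled: " + janitor.setEnabled(false));
    System.out.println("cleaned while stopped: " + janitor.requestRun());
  }
}
```

 Returning the previous state from the switch lets a caller restore it after maintenance; that choice is a design assumption here, not something stated in the issue.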





[jira] [Updated] (HBASE-4255) Expose CatalogJanitor controls

2012-07-18 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-4255:
--

Attachment: 4255-4.2.patch

Patch from review board.

 Expose CatalogJanitor controls
 --

 Key: HBASE-4255
 URL: https://issues.apache.org/jira/browse/HBASE-4255
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: 4255-4.2.patch


 When doing surgery or other operational tasks, it's nice to be able to have 
 the .META. table quickly cleaned of split parents. The CatalogJanitor already 
 has controls baked in (currently used in unit tests), I think we should 
 expose this the same way we do with the balancer, that is:
  - start
  - stop
  - request a run
 A client would need to go through HBaseAdmin, and shell commands need to be 
 created.





[jira] [Updated] (HBASE-4255) Expose CatalogJanitor controls

2012-07-18 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-4255:
--

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

 Expose CatalogJanitor controls
 --

 Key: HBASE-4255
 URL: https://issues.apache.org/jira/browse/HBASE-4255
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: 4255-4.2.patch


 When doing surgery or other operational tasks, it's nice to be able to have 
 the .META. table quickly cleaned of split parents. The CatalogJanitor already 
 has controls baked in (currently used in unit tests), I think we should 
 expose this the same way we do with the balancer, that is:
  - start
  - stop
  - request a run
 A client would need to go through HBaseAdmin, and shell commands need to be 
 created.





[jira] [Commented] (HBASE-4255) Expose CatalogJanitor controls

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417057#comment-13417057
 ] 

Zhihong Ted Yu commented on HBASE-4255:
---

@J-D:
Please take a look at Devaraj's patch.

 Expose CatalogJanitor controls
 --

 Key: HBASE-4255
 URL: https://issues.apache.org/jira/browse/HBASE-4255
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: 4255-4.2.patch


 When doing surgery or other operational tasks, it's nice to be able to have 
 the .META. table quickly cleaned of split parents. The CatalogJanitor already 
 has controls baked in (currently used in unit tests), I think we should 
 expose this the same way we do with the balancer, that is:
  - start
  - stop
  - request a run
 A client would need to go through HBaseAdmin, and shell commands need to be 
 created.





[jira] [Commented] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417157#comment-13417157
 ] 

Zhihong Ted Yu commented on HBASE-4470:
---

Indentation seems to be off in testVerifyMetaRegionLocationWithException():
{code}
+ Mockito.when(implementation.get((byte [])Mockito.any(), (Get)Mockito.any())).
{code}

 ServerNotRunningException coming out of assignRootAndMeta kills the Master
 --

 Key: HBASE-4470
 URL: https://issues.apache.org/jira/browse/HBASE-4470
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Gregory Chanan
Priority: Critical
 Fix For: 0.90.7

 Attachments: HBASE-4470-90.patch


 I'm surprised we still have issues like that and I didn't get a hit while 
 googling so forgive me if there's already a jira about it.
 When the master starts, it verifies the locations of root and meta before 
 assigning them. If the server is started but not running, you'll get this:
 {quote}
 2011-09-23 04:47:44,859 WARN 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
 RemoteException connecting to RS
 org.apache.hadoop.ipc.RemoteException: 
 org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running 
 yet
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
 at $Proxy6.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
 at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969)
 at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388)
 at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287)
 at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484)
 at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441)
 at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)
 {quote}
 I hit that 3-4 times this week while debugging something else. The worst is 
 that when you restart the master it sees that as a failover, but none of the 
 regions are assigned so it takes an eternity to get back fully online.





[jira] [Commented] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417230#comment-13417230
 ] 

Zhihong Ted Yu commented on HBASE-6400:
---

Integrated to trunk.

Thanks for the patch, Enis.

Thanks for the review, Stack.

 Add getMasterAdmin() and getMasterMonitor() to HConnection
 --

 Key: HBASE-6400
 URL: https://issues.apache.org/jira/browse/HBASE-6400
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: 6400-v2.patch, HBASE-6400_v1.patch


 HConnection used to have getMaster() which returns HMasterInterface, but 
 after HBASE-6039 it has been removed. I think we need to expose 
 HConnection.getMasterAdmin() and getMasterMonitor() a la 
 HConnection.getAdmin(), and getClient(). 
 HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason 
 to leak keep alive classes to upper layers.





[jira] [Commented] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417250#comment-13417250
 ] 

Zhihong Ted Yu commented on HBASE-6419:
---

I ran the two tests above and they passed with patch.

Integrated to trunk.

Thanks for the patch, Paul.

Thanks for the review, Stack.

 PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 
 of HBASE-6220)
 ---

 Key: HBASE-6419
 URL: https://issues.apache.org/jira/browse/HBASE-6419
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Paul Cavallaro
 Attachments: ServerMetrics_HBASE_6220_Flush_Metrics.patch








[jira] [Commented] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417279#comment-13417279
 ] 

Zhihong Ted Yu commented on HBASE-6405:
---

Addendum 2 looks good.

 Create Hadoop compatibilty modules and Metrics2 implementation of replication 
 metrics
 -

 Key: HBASE-6405
 URL: https://issues.apache.org/jira/browse/HBASE-6405
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Ted Yu
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: 6405.txt, HBASE-6405-ADD.patch, 
 hbase-6405-addendum-2.patch








[jira] [Commented] (HBASE-6421) [pom] add jettison and fix netty specification

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417324#comment-13417324
 ] 

Zhihong Ted Yu commented on HBASE-6421:
---

Patch didn't compile against hadoop 2.0:
{code}
==
==
Checking against hadoop 2.0 build
==
==
{code}

 [pom] add jettison and fix netty specification
 --

 Key: HBASE-6421
 URL: https://issues.apache.org/jira/browse/HBASE-6421
 Project: HBase
  Issue Type: Bug
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: hbase-6421-v0.patch


 Currently, jettison isn't required for testing hbase-server, but 
 TestSchemaConfigured requires it, causing the compile phase (at least on my 
 MBP) to fail. Further, in cleaning up the poms, netty should be declared in 
 the parent hbase/pom.xml and then inherited in the subclass.





[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417366#comment-13417366
 ] 

Zhihong Ted Yu commented on HBASE-5547:
---

Here is the link about InterruptedException handling:
http://www.ibm.com/developerworks/java/library/j-jtp05236/index.html

Take a look at Listing 3 under 'Don't swallow interrupts'
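
For readers without the article at hand, the general idea behind not swallowing interrupts can be sketched as follows. This is a minimal illustration written for this note, not a copy of the article's listing; the class InterruptSketch and the method sleepQuietly are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing how to preserve the interrupt status.
final class InterruptSketch {

  /** Sleeps for the given time; returns false if interrupted, keeping the interrupt flag set. */
  static boolean sleepQuietly(long millis) {
    try {
      TimeUnit.MILLISECONDS.sleep(millis);
      return true;
    } catch (InterruptedException e) {
      // Catching the exception clears the flag, so set it again
      // to let callers further up the stack see the interruption.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt(); // simulate a pending interrupt
    boolean completed = sleepQuietly(50);
    System.out.println("completed=" + completed
        + ", interrupted=" + Thread.currentThread().isInterrupted());
  }
}
```

The key line is the call to Thread.currentThread().interrupt() in the catch block; without it the caller has no way to learn that an interrupt was requested.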

 Don't delete HFiles when in backup mode
 -

 Key: HBASE-5547
 URL: https://issues.apache.org/jira/browse/HBASE-5547
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates
 Fix For: 0.94.2

 Attachments: 5547-v12.txt, hbase-5447-v8.patch, hbase-5447-v8.patch, 
 hbase-5547-v9.patch, java_HBASE-5547_v13.patch, java_HBASE-5547_v4.patch, 
 java_HBASE-5547_v5.patch, java_HBASE-5547_v6.patch, java_HBASE-5547_v7.patch


 This came up in a discussion I had with Stack.
 It would be nice if HBase could be notified that a backup is in progress (via 
 a znode for example) and in that case either:
 1. rename HFiles to be deleted to file.bck
 2. rename the HFiles into a special directory
 3. rename them to a general trash directory (which would not need to be tied 
 to backup mode).
 That way it should be able to get a consistent backup based on HFiles (HDFS 
 snapshots or hard links would be better options here, but we do not have 
 those).
 #1 makes cleanup a bit harder.
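
 Option 2 (moving files into a special directory instead of deleting them) could look roughly like the sketch below. It uses java.nio.file on a local file system purely for illustration; the class ArchiveOnDeleteSketch, the method removeOrArchive, and the backupInProgress flag are hypothetical, and a real implementation would go through the HDFS FileSystem API and some notification mechanism such as the znode mentioned above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical local-file-system illustration of "archive instead of delete".
final class ArchiveOnDeleteSketch {

  /**
   * Deletes the file when no backup is running; otherwise moves it into
   * archiveDir and returns its new location. Returns null after a plain delete.
   */
  static Path removeOrArchive(Path file, Path archiveDir, boolean backupInProgress)
      throws IOException {
    if (!backupInProgress) {
      Files.delete(file);
      return null;
    }
    Files.createDirectories(archiveDir);
    Path target = archiveDir.resolve(file.getFileName());
    return Files.move(file, target);
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("hfiles");
    Path hfile = Files.createFile(dir.resolve("abc123"));
    Path archived = removeOrArchive(hfile, dir.resolve(".archive"), true);
    System.out.println("archived to " + archived);
  }
}
```

 Keeping the archive directory separate from the store directory means cleanup only has to empty one location once the backup finishes, which is the advantage over option 1 noted above.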





[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417379#comment-13417379
 ] 

Zhihong Ted Yu commented on HBASE-6396:
---

Clarification: the goal of the patch is not to make a build compiled against 
hadoop 1 work against hadoop 2.
The goal is to make a build against the 0.23 profile work with hadoop 2.

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
  Labels: hadoop-2.0
 Fix For: 0.96.0

 Attachments: 6396-v2.txt


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Updated] (HBASE-6220) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics

2012-07-18 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6220:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
 -

 Key: HBASE-6220
 URL: https://issues.apache.org/jira/browse/HBASE-6220
 Project: HBase
  Issue Type: Bug
  Components: metrics
Affects Versions: 0.96.0
Reporter: David S. Wang
Assignee: Paul Cavallaro
Priority: Minor
  Labels: noob
 Attachments: ServerMetrics_HBASE_6220.patch, 
 ServerMetrics_HBASE_6220_Flush_Metrics.patch


 PersistentMetricsTimeVaryingRate gets used for metrics that are not 
 time-based, leading to confusing names such as avg_time for compaction 
 size, etc.  You have to read the code in order to understand that this is 
 actually referring to bytes, not seconds.





[jira] [Commented] (HBASE-4255) Expose CatalogJanitor controls

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417400#comment-13417400
 ] 

Zhihong Ted Yu commented on HBASE-4255:
---

No test failure from 
https://builds.apache.org/job/PreCommit-HBASE-Build/2400//testReport/:
{code}
Results :

Tests run: 1021, Failures: 0, Errors: 0, Skipped: 9
{code}

 Expose CatalogJanitor controls
 --

 Key: HBASE-4255
 URL: https://issues.apache.org/jira/browse/HBASE-4255
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.4
Reporter: Jean-Daniel Cryans
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: 4255-4.2.patch, 4255-5.1.patch


 When doing surgery or other operational tasks, it's nice to be able to have 
 the .META. table quickly cleaned of split parents. The CatalogJanitor already 
 has controls baked in (currently used in unit tests), I think we should 
 expose this the same way we do with the balancer, that is:
  - start
  - stop
  - request a run
 A client would need to go through HBaseAdmin, and shell commands need to be 
 created.
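
A minimal sketch of the three controls listed above (start, stop, request a run), modeled loosely on a switch that returns its previous state. Every name here (JanitorControlSketch, setEnabled, requestRun, runCount) is hypothetical and is not taken from HBaseAdmin or CatalogJanitor. Refusing a run while the janitor is stopped is an assumed design choice, not something the description specifies.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: an in-memory stand-in for janitor controls.
// It does not talk to a cluster and does not touch .META.
final class JanitorControlSketch {
  private final AtomicBoolean enabled = new AtomicBoolean(true);
  private int runs = 0;

  /** Start (true) or stop (false) the janitor; returns the previous state. */
  boolean setEnabled(boolean on) {
    return enabled.getAndSet(on);
  }

  boolean isEnabled() {
    return enabled.get();
  }

  /** Request an immediate run; returns false if the janitor is stopped. */
  synchronized boolean requestRun() {
    if (!enabled.get()) {
      return false;
    }
    runs++; // stand-in for scanning the catalog and cleaning split parents
    return true;
  }

  synchronized int runCount() {
    return runs;
  }

  public static void main(String[] args) {
    JanitorControlSketch janitor = new JanitorControlSketch();
    System.out.println("run accepted: " + janitor.requestRun());
    System.out.println("was enabled:  " + janitor.setEnabled(false));
    System.out.println("run accepted: " + janitor.requestRun());
  }
}
```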





[jira] [Commented] (HBASE-6406) TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417665#comment-13417665
 ] 

Zhihong Ted Yu commented on HBASE-6406:
---

For trunk, TestZooKeeper hung with the following output:
{code}
2012-07-18 13:24:34,764 INFO  
[Master:0;sdev25.arch.ebay.com,59816,1342643039714] master.HMaster(455): 
HMaster main thread exiting

2012-07-18 13:24:34,764 INFO  
[RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759] 
zookeeper.RecoverableZooKeeper(102): The identifier of this process is 
15496@sdev25

2012-07-18 13:24:34,772 DEBUG 
[RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759-EventThread] 
zookeeper.ZooKeeperWatcher(262): regionserver:60707 Received ZooKeeper Event, 
type=None, state=SyncConnected, path=null

2012-07-18 13:24:34,773 DEBUG 
[RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759] 
zookeeper.ZKUtil(238): regionserver:60707 /hbase/master does not exist. Watcher 
is set.

2012-07-18 13:24:34,774 DEBUG 
[RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759-EventThread] 
zookeeper.ZooKeeperWatcher(339): regionserver:60707-0x1389bc2dddb000c connected

2012-07-18 13:24:35,062 INFO  
[sdev25.arch.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor] 
hbase.Chore(82): 
sdev25.arch.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor exiting

2012-07-18 13:24:35,080 DEBUG 
[RegionServer:0;sdev25.arch.ebay.com,48349,1342643039994] 
regionserver.HRegionServer(1817): No master found; retry

2012-07-18 13:24:36,081 DEBUG 
[RegionServer:0;sdev25.arch.ebay.com,48349,1342643039994] 
regionserver.HRegionServer(1817): No master found; retry
{code}


 TestReplicationPeer.testResetZooKeeperSession and 
 TestZooKeeper.testClientSessionExpired fail frequently
 

 Key: HBASE-6406
 URL: https://issues.apache.org/jira/browse/HBASE-6406
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.2

 Attachments: testReplication.jstack, testZooKeeper.jstack


 Looking back through the 0.94 test runs these two tests accounted for 11 of 
 34 failed tests.
 They should be fixed or (temporarily) disabled.





[jira] [Comment Edited] (HBASE-6406) TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417665#comment-13417665
 ] 

Zhihong Ted Yu edited comment on HBASE-6406 at 7/18/12 8:50 PM:


For trunk, TestZooKeeper hung with the following output:
{code}
2012-07-18 13:24:34,764 INFO  [Master:0;X.ebay.com,59816,1342643039714] 
master.HMaster(455): HMaster main thread exiting 
2012-07-18 13:24:34,764 INFO  [RegionServer:2;X.ebay.com,60707,1342643074759] 
zookeeper.RecoverableZooKeeper(102): The identifier of this process is 
15496@sdev25
2012-07-18 13:24:34,772 DEBUG 
[RegionServer:2;X.ebay.com,60707,1342643074759-EventThread] 
zookeeper.ZooKeeperWatcher(262): regionserver:60707 Received ZooKeeper Event, 
type=None, state=SyncConnected, path=null
2012-07-18 13:24:34,773 DEBUG [RegionServer:2;X.ebay.com,60707,1342643074759] 
zookeeper.ZKUtil(238): regionserver:60707 /hbase/master does not exist. Watcher 
is set.
2012-07-18 13:24:34,774 DEBUG 
[RegionServer:2;X.ebay.com,60707,1342643074759-EventThread] 
zookeeper.ZooKeeperWatcher(339): regionserver:60707-0x1389bc2dddb000c connected
2012-07-18 13:24:35,062 INFO  
[X.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor] hbase.Chore(82): 
X.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor exiting
2012-07-18 13:24:35,080 DEBUG [RegionServer:0;X.ebay.com,48349,1342643039994] 
regionserver.HRegionServer(1817): No master found; retry
2012-07-18 13:24:36,081 DEBUG [RegionServer:0;X.ebay.com,48349,1342643039994] 
regionserver.HRegionServer(1817): No master found; retry{code}


  was (Author: zhi...@ebaysf.com):
For trunk, TestZooKeeper hung with the following output:
{code}
2012-07-18 13:24:34,764 INFO  
[Master:0;sdev25.arch.ebay.com,59816,1342643039714] master.HMaster(455): 
HMaster main thread exiting

2012-07-18 13:24:34,764 INFO  
[RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759] 
zookeeper.RecoverableZooKeeper(102): The identifier of this process is 
15496@sdev25

2012-07-18 13:24:34,772 DEBUG 
[RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759-EventThread] 
zookeeper.ZooKeeperWatcher(262): regionserver:60707 Received ZooKeeper Event, 
type=None, state=SyncConnected, path=null

2012-07-18 13:24:34,773 DEBUG 
[RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759] 
zookeeper.ZKUtil(238): regionserver:60707 /hbase/master does not exist. Watcher 
is set.

2012-07-18 13:24:34,774 DEBUG 
[RegionServer:2;sdev25.arch.ebay.com,60707,1342643074759-EventThread] 
zookeeper.ZooKeeperWatcher(339): regionserver:60707-0x1389bc2dddb000c connected

2012-07-18 13:24:35,062 INFO  
[sdev25.arch.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor] 
hbase.Chore(82): 
sdev25.arch.ebay.com,59816,1342643039714.splitLogManagerTimeoutMonitor exiting

2012-07-18 13:24:35,080 DEBUG 
[RegionServer:0;sdev25.arch.ebay.com,48349,1342643039994] 
regionserver.HRegionServer(1817): No master found; retry

2012-07-18 13:24:36,081 DEBUG 
[RegionServer:0;sdev25.arch.ebay.com,48349,1342643039994] 
regionserver.HRegionServer(1817): No master found; retry
{code}

  
 TestReplicationPeer.testResetZooKeeperSession and 
 TestZooKeeper.testClientSessionExpired fail frequently
 

 Key: HBASE-6406
 URL: https://issues.apache.org/jira/browse/HBASE-6406
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.2

 Attachments: testReplication.jstack, testZooKeeper.jstack


 Looking back through the 0.94 test runs these two tests accounted for 11 of 
 34 failed tests.
 They should be fixed or (temporarily) disabled.





[jira] [Commented] (HBASE-6421) [pom] add jettison and fix netty specification

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417670#comment-13417670
 ] 

Zhihong Ted Yu commented on HBASE-6421:
---

I got the following:
{code}
[ERROR] The build could not read 1 project - [Help 1]
org.apache.maven.project.ProjectBuildingException: Some problems were 
encountered while processing the POMs:
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-core:jar 
is missing. @ line 64, column 17
[ERROR] 'dependencies.dependency.version' for org.apache.hadoop:hadoop-test:jar 
is missing. @ line 91, column 17

at 
org.apache.maven.project.DefaultProjectBuilder.build(DefaultProjectBuilder.java:339)
at org.apache.maven.DefaultMaven.collectProjects(DefaultMaven.java:632)
at 
org.apache.maven.DefaultMaven.getProjectsForMavenReactor(DefaultMaven.java:581)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:233)
{code}
Here is the command:

mvn clean test help:active-profiles -X -DskipTests -Dhadoop.profile=2.0

 [pom] add jettison and fix netty specification
 --

 Key: HBASE-6421
 URL: https://issues.apache.org/jira/browse/HBASE-6421
 Project: HBase
  Issue Type: Bug
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: hbase-6421-v0.patch


 Currently, jettison isn't required for testing hbase-server, but 
 TestSchemaConfigured requires it, causing the compile phase (at least on my 
 MBP) to fail. Further, in cleaning up the poms, netty should be declared in 
 the parent hbase/pom.xml and then inherited in the subclass.





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417697#comment-13417697
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

I tried to see why TestZooKeeper hung strangely:
{code}
2012-07-18 14:05:59,533 DEBUG [pool-57-thread-1] zookeeper.ZKUtil(1142): 
master:52861-0x1389be8bd6e-0x1389be8bd6e000a-0x1389be8bd6e000b Retrieved 39 
byte(s) of data from znode /hbase/root-region-server and set watcher; 
X.ebay.com,44052,1342645522433

2012-07-18 14:05:59,533 WARN  [pool-52-thread-1] 
zookeeper.RecoverableZooKeeper(218): Possibly transient ZooKeeper exception: 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/root-region-server

2012-07-18 14:05:59,533 INFO  [pool-52-thread-1] util.RetryCounter(55): 
Sleeping 2000ms before retry #1...

2012-07-18 14:05:59,536 INFO  [main] ipc.HBaseRpcMetrics(66): Initializing RPC 
Metrics with hostName=MiniHBaseCluster$MiniHBaseClusterRegionServer, port=44030

2012-07-18 14:05:59,537 INFO  [Master:0;X.ebay.com,52861,1342645522110] 
master.HMaster(455): HMaster main thread exiting
{code}
Basically the test hung in setup().

I then traced where TestZooKeeper stopped showing up in the test results, and 
this was the first URL that gave me a 404 error:

https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/3126/testReport/org.apache.hadoop.hbase/TestZooKeeper/

That was when this patch went in.

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch


 Continuing from HBASE-6375.
 It seems I was mistaken in my assumption that changing the value of 
 hbase.master.wait.on.regionservers.mintostart to a sufficient number (from 
 default of 1) can help prevent assignment of all regions to one (or a small 
 number of) region server(s).
 While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
 0.94.0 onwards to address HBASE-4993.
 From 0.94.0 onwards, Master will proceed immediately after the timeout has 
 lapsed, even if hbase.master.wait.on.regionservers.mintostart has not 
 been reached.
 Reading the current conditions of waitForRegionServers() clarifies this:
 {code:title=ServerManager.java (trunk rev:1360470)}
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
   throws InterruptedException {
 ..
     while (
       !this.master.isStopped() &&
         slept < timeout &&
         count < maxToStart &&
         (lastCountChange+interval > now || count < minToStart)
       ){
 ..
 {code}
 So with the current conditions, the wait ends as soon as the timeout is 
 reached, even if fewer RSes than required have checked in with the Master, 
 and the master proceeds with region assignment among these RSes alone.
 As mentioned in 
 -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
  and I concur, this could have a disastrous effect in a large cluster, 
 especially now that MSLAB is turned on.
 To enforce the required quorum as specified by 
 hbase.master.wait.on.regionservers.mintostart irrespective of timeout, 
 these conditions need to be modified as follows:
 {code:title=ServerManager.java}
 ..
   /**
* Wait for the region servers to report in.
* We will wait until one of this condition is met:
*  - the master is stopped
*  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
*region servers is reached
*  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
*   there have been no new region server in for
*  'hbase.master.wait.on.regionservers.interval' 

[jira] [Reopened] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-18 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu reopened HBASE-6389:
---


After reverting the patch, the test passed smoothly:
{code}
Running org.apache.hadoop.hbase.TestZooKeeper
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 48.678 sec

Results :

Tests run: 11, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-surefire-plugin:2.10:test (secondPartTestsExecution) @ 
hbase-server ---
[INFO] Tests are skipped.
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 58.563s
{code}

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch


 Continuing from HBASE-6375.
 It seems I was mistaken in my assumption that changing the value of 
 hbase.master.wait.on.regionservers.mintostart to a sufficient number (from 
 default of 1) can help prevent assignment of all regions to one (or a small 
 number of) region server(s).
 While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
 0.94.0 onwards to address HBASE-4993.
 From 0.94.0 onwards, Master will proceed immediately after the timeout has 
 lapsed, even if hbase.master.wait.on.regionservers.mintostart has not 
 been reached.
 Reading the current conditions of waitForRegionServers() clarifies this:
 {code:title=ServerManager.java (trunk rev:1360470)}
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
   throws InterruptedException {
 ..
     while (
       !this.master.isStopped() &&
         slept < timeout &&
         count < maxToStart &&
         (lastCountChange+interval > now || count < minToStart)
       ){
 ..
 {code}
 So with the current conditions, the wait ends as soon as the timeout is 
 reached, even if fewer RSes than required have checked in with the Master, 
 and the master proceeds with region assignment among these RSes alone.
 As mentioned in 
 -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
  and I concur, this could have a disastrous effect in a large cluster, 
 especially now that MSLAB is turned on.
 To enforce the required quorum as specified by 
 hbase.master.wait.on.regionservers.mintostart irrespective of timeout, 
 these conditions need to be modified as follows:
 {code:title=ServerManager.java}
 ..
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time AND
    *   the 'hbase.master.wait.on.regionservers.timeout' is reached
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
 ..
 ..
     int minToStart = this.master.getConfiguration().
     getInt("hbase.master.wait.on.regionservers.mintostart", 1);
     int maxToStart = this.master.getConfiguration().
     getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
     if (maxToStart < minToStart) {
       maxToStart = minToStart;
     }
 ..
 ..
     while (
       !this.master.isStopped() &&
         count < maxToStart &&
         (lastCountChange+interval > now || timeout > slept || count < minToStart)
       ){
 ..
 {code}
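
To make the difference between the two loop conditions concrete, here is a small sketch that treats each `while` predicate as a pure function and evaluates both for the same state. The class and method names (WaitConditionSketch, keepWaitingCurrent, keepWaitingProposed) are hypothetical, and the numbers in main() are arbitrary illustrative values, not values taken from the issue. Only the boolean expressions follow the conditions described above.

```java
// Sketch only: models the two loop predicates as pure functions so they can
// be compared side by side. This is not the ServerManager implementation.
final class WaitConditionSketch {

  /** Current predicate: an elapsed timeout ends the wait on its own. */
  static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  /** Proposed predicate: the timeout is only one of three reasons to keep waiting. */
  static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && count < maxToStart
        && (lastCountChange + interval > now || timeout > slept || count < minToStart);
  }

  public static void main(String[] args) {
    // The timeout has elapsed, but only 1 of the 3 required servers checked in.
    long timeout = 4500, slept = 5000, interval = 1500, lastCountChange = 0, now = 5000;
    int count = 1, minToStart = 3, maxToStart = Integer.MAX_VALUE;

    // prints "current keeps waiting:  false" (master proceeds with 1 server)
    System.out.println("current keeps waiting:  " + keepWaitingCurrent(
        false, slept, timeout, count, minToStart, maxToStart, lastCountChange, interval, now));
    // prints "proposed keeps waiting: true" (master waits for minToStart)
    System.out.println("proposed keeps waiting: " + keepWaitingProposed(
        false, slept, timeout, count, minToStart, maxToStart, lastCountChange, interval, now));
  }
}
```

Under the proposed predicate the wait ends only when the master is stopped, maxToStart is reached, or all three of the inner conditions are false at once. That also means a setup that never reaches minToStart keeps waiting indefinitely.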


[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417816#comment-13417816
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

Elliott's first patch used ResourceFinder, which allowed passing initialization 
parameters to the constructor.
Maybe revive that approach?

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
 HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417825#comment-13417825
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

Patch for 0.94 wasn't attached here.

@Lars:
Can you revert the patches ?

Thanks

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch






[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417869#comment-13417869
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

Reverted trunk patch.

Have not touched 0.94 branch yet.

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch






[jira] [Commented] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417958#comment-13417958
 ] 

Zhihong Ted Yu commented on HBASE-6405:
---

Addendum v2 integrated to trunk.

Thanks for the patch, Jesse.

Thanks for the review, Elliott.

 Create Hadoop compatibilty modules and Metrics2 implementation of replication 
 metrics
 -

 Key: HBASE-6405
 URL: https://issues.apache.org/jira/browse/HBASE-6405
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Ted Yu
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: 6405.txt, HBASE-6405-ADD.patch, 
 hbase-6405-addendum-2-v2.patch, hbase-6405-addendum-2.patch








[jira] [Updated] (HBASE-6426) Add Hadoop 2.0.x profile to 0.92+

2012-07-18 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6426:
--

Description: 0.96 already has a Hadoop-2.0 build profile. Let's add this to 
0.92 and 0.94 as well.  (was: 0.96 already has a Hadoop-2.0 build profile. Let's 
add this to 0.92 and 0.96 as well.)

 Add Hadoop 2.0.x profile to 0.92+
 -

 Key: HBASE-6426
 URL: https://issues.apache.org/jira/browse/HBASE-6426
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.92.2, 0.94.1


 0.96 already has a Hadoop-2.0 build profile. Let's add this to 0.92 and 0.94 as 
 well.





[jira] [Commented] (HBASE-6392) UnknownRegionException blocks hbck from sideline big overlap regions

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417968#comment-13417968
 ] 

Zhihong Ted Yu commented on HBASE-6392:
---

@Jimmy:
Can you attach a patch for 0.92 to this JIRA?

 UnknownRegionException blocks hbck from sideline big overlap regions
 

 Key: HBASE-6392
 URL: https://issues.apache.org/jira/browse/HBASE-6392
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: 6392-trunk.patch, 6392-trunk_v2.patch


 Before sidelining a big overlap region, hbck first tries to close it and take 
 it offline.  However, this sometimes throws NotServingRegion or 
 UnknownRegionException.
 It could be because the region is not open/assigned at all, or because of some 
 other issue.
 We should figure out why and fix it.
 By the way, it would be better to print in the log the command line for bulk 
 loading sidelined regions back, if any. 





[jira] [Commented] (HBASE-6327) HLog can be null when create table

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417993#comment-13417993
 ] 

Zhihong Ted Yu commented on HBASE-6327:
---

I was expecting other committer(s) to take a look.

 HLog can be null when create table
 --

 Key: HBASE-6327
 URL: https://issues.apache.org/jira/browse/HBASE-6327
 Project: HBase
  Issue Type: Bug
Reporter: ShiXing
Assignee: ShiXing
 Fix For: 0.96.0

 Attachments: 6327.txt, HBASE-6327-trunk-V1.patch, 
 createTableFailedMaster.log


 As discussed in HBASE-4010, the HLog can be null.
 We have seen createTable fail because of an hlog that is not even used.
 During createHRegion, the HLog.LogSyncer runs sync(), which in the underlying 
 layer calls DFSClient.DFSOutputStream.sync(). 
 Then hlog.closeAndDelete() is called. HLog.close() first interrupts the 
 LogSyncer, which also interrupts DFSClient.DFSOutputStream.sync(). The 
 DFSClient.DFSOutputStream stores the exception and throws it when 
 DFSClient.close() is called. 
 HLog.close() calls writer.close()/DFSClient.close() after interrupting the 
 LogSyncer, and nothing catches the exception thrown by close().
 So the Master throws the exception to the client. There is no need to throw 
 this exception; moreover, the hlog is not used.
 Our cluster runs 0.90 and the logs are attached. After the hlog writer is 
 closed, there are no further log entries for createTable().
 In trunk, 0.92 and 0.94 only one hlog is used. If the exception happens, 
 the client sees createTable fail, although we would expect all the regions 
 of the table to still be assigned.
 I will provide a patch for this later.
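
 The sequence described above can be modelled in a few lines. The sketch below is neither the HLog code nor the eventual patch; `SketchWriter` and `closeUnusedLog` are names invented for this illustration. It only shows a writer that remembers a failure from an interrupted sync() and rethrows it from close(), and a caller that reports the failure instead of propagating it, on the assumption that the log is about to be discarded anyway.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical writer: remembers a sync failure and rethrows it on close().
final class SketchWriter implements Closeable {
  private IOException stored;

  void sync() {
    // Model an interrupted sync by checking the thread's interrupt flag.
    if (Thread.currentThread().isInterrupted()) {
      stored = new IOException("sync interrupted");
    }
  }

  @Override
  public void close() throws IOException {
    if (stored != null) {
      throw stored;
    }
  }
}

final class UnusedLogCloseSketch {
  /** Closes a log that is no longer needed; returns false if close() failed. */
  static boolean closeUnusedLog(Closeable writer) {
    try {
      writer.close();
      return true;
    } catch (IOException e) {
      // The log is being discarded, so report the problem instead of failing the caller.
      System.err.println("Ignoring failure while closing unused log: " + e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    SketchWriter writer = new SketchWriter();
    Thread.currentThread().interrupt(); // simulate the interrupt sent to the syncer
    writer.sync();
    Thread.interrupted();               // clear the interrupt flag again
    System.out.println(closeUnusedLog(writer));
  }
}
```

 Whether swallowing the exception is acceptable depends on the log really being unused at that point, which is the premise of the report above.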





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418025#comment-13418025
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

[~areborn]:
Can you remove the sentence in Release Notes ?

Thanks

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch, 
 HBASE-6389_trunk.patch


 Continuing from HBASE-6375.
 It seems I was mistaken in my assumption that changing the value of 
 hbase.master.wait.on.regionservers.mintostart to a sufficient number (from 
 the default of 1) can help prevent assignment of all regions to one (or a small 
 number of) region server(s).
 While this was the case in 0.90.x and 0.92.x, the behavior changed from 
 0.94.0 onwards to address HBASE-4993.
 From 0.94.0 onwards, the Master proceeds immediately after the timeout has 
 lapsed, even if hbase.master.wait.on.regionservers.mintostart has not been 
 reached.
 Reading the current conditions of waitForRegionServers() makes this clear:
 {code:title=ServerManager.java (trunk rev:1360470)}
 
 /**
  * Wait for the region servers to report in.
  * We will wait until one of this condition is met:
  *  - the master is stopped
  *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
  *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
  *    region servers is reached
  *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
  *   there have been no new region server in for
  *      'hbase.master.wait.on.regionservers.interval' time
  *
  * @throws InterruptedException
  */
 public void waitForRegionServers(MonitoredTask status)
 throws InterruptedException {
 ..
 ..
   while (
     !this.master.isStopped() &&
       slept < timeout &&
       count < maxToStart &&
       (lastCountChange+interval > now || count < minToStart)
     ){
 
 {code}
 So with the current conditions, the wait ends as soon as the timeout is 
 reached, even if fewer RSes than required have checked in with the Master, and 
 the master proceeds with region assignment among these RSes alone.
 As mentioned in 
 -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
  and I concur, this could have a disastrous effect in a large cluster, especially 
 now that MSLAB is turned on.
 To enforce the required quorum as specified by 
 hbase.master.wait.on.regionservers.mintostart irrespective of the timeout, 
 these conditions need to be modified as follows:
 {code:title=ServerManager.java}
 ..
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *   there have been no new region server in for
    *      'hbase.master.wait.on.regionservers.interval' time AND
    *   the 'hbase.master.wait.on.regionservers.timeout' is reached
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
 ..
 ..
 int minToStart = this.master.getConfiguration().
     getInt("hbase.master.wait.on.regionservers.mintostart", 1);
 int maxToStart = this.master.getConfiguration().
     getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
 if (maxToStart < minToStart) {
   maxToStart = minToStart;
 }
 ..
 ..
 while (
   !this.master.isStopped() &&
     count < maxToStart &&
     (lastCountChange+interval > now || timeout > slept || count < minToStart)
   ){
 ..
 {code}
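
 The difference between the two loop conditions can be checked in isolation by writing them as pure predicates. This is only an illustration: the class `WaitConditionSketch` and its method names are made up for the example, the parameter names follow the snippets above, and the numbers in `main` are arbitrary.

```java
// Hypothetical sketch; not the actual ServerManager code.
final class WaitConditionSketch {

  /** Condition as in 0.94.0: the loop exits as soon as the timeout has elapsed. */
  static boolean keepWaitingOld(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  /** Proposed condition: an elapsed timeout alone no longer ends the wait. */
  static boolean keepWaitingNew(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && count < maxToStart
        && (lastCountChange + interval > now || timeout > slept || count < minToStart);
  }

  public static void main(String[] args) {
    // The timeout has elapsed, but only 1 of the 3 required servers has reported in.
    long slept = 5000, timeout = 4500, lastCountChange = 0, interval = 1500, now = 5000;
    int count = 1, minToStart = 3, maxToStart = Integer.MAX_VALUE;

    System.out.println("old keeps waiting: " + keepWaitingOld(false, slept, timeout,
        count, minToStart, maxToStart, lastCountChange, interval, now));
    System.out.println("new keeps waiting: " + keepWaitingNew(false, slept, timeout,
        count, minToStart, maxToStart, lastCountChange, interval, now));
  }
}
```

 With these inputs the old predicate stops waiting while the new one continues until `count` reaches `minToStart`.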





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418030#comment-13418030
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

It has been 12 minutes since I started running TestZooKeeper based on latest 
patch.
Here is the tail of jstack:
{code}
"main" prio=5 tid=102801000 nid=0x100601000 in Object.wait() [1005fe000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <78ebfee68> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
    at java.lang.Thread.join(Thread.java:1210)
    - locked <78ebfee68> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
    at java.lang.Thread.join(Thread.java:1263)
    at org.apache.hadoop.hbase.LocalHBaseCluster.waitOnRegionServer(LocalHBaseCluster.java:262)
    at org.apache.hadoop.hbase.MiniHBaseCluster.waitOnRegionServer(MiniHBaseCluster.java:285)
    at org.apache.hadoop.hbase.TestZooKeeper.testRegionServerSessionExpired(TestZooKeeper.java:201)
{code}
Please take a look at:
https://issues.apache.org/jira/browse/HBASE-6406?focusedCommentId=13417665&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13417665

See if your finding can explain that symptom.

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch, 
 HBASE-6389_trunk.patch


[jira] [Comment Edited] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418025#comment-13418025
 ] 

Zhihong Ted Yu edited comment on HBASE-6389 at 7/19/12 4:16 AM:


@Aditya:
Can you remove the sentence in Release Notes ?

Thanks

  was (Author: zhi...@ebaysf.com):
[~areborn]:
Can you remove the sentence in Release Notes ?

Thanks
  
 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch, 
 HBASE-6389_trunk.patch






[jira] [Commented] (HBASE-6406) TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418047#comment-13418047
 ] 

Zhihong Ted Yu commented on HBASE-6406:
---

TestReplicationPeer.java should be removed from trunk as well, right ?

 TestReplicationPeer.testResetZooKeeperSession and 
 TestZooKeeper.testClientSessionExpired fail frequently
 

 Key: HBASE-6406
 URL: https://issues.apache.org/jira/browse/HBASE-6406
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.2

 Attachments: testReplication.jstack, testZooKeeper.jstack


 Looking back through the 0.94 test runs these two tests accounted for 11 of 
 34 failed tests.
 They should be fixed or (temporarily) disabled.





[jira] [Commented] (HBASE-6421) [pom] add jettison and fix netty specification

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418048#comment-13418048
 ] 

Zhihong Ted Yu commented on HBASE-6421:
---

Integrated to trunk.

Thanks for the patch, Jesse.

 [pom] add jettison and fix netty specification
 --

 Key: HBASE-6421
 URL: https://issues.apache.org/jira/browse/HBASE-6421
 Project: HBase
  Issue Type: Bug
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: hbase-6421-v0.patch, hbase-6421-v1.patch


 Currently, jettison isn't required for testing hbase-server, but 
 TestSchemaConfigured requires it, causing the compile phase (at least on my 
 MBP) to fail. Further, in cleaning up the poms, netty should be declared in 
 the parent hbase/pom.xml and then inherited in the subclass.





[jira] [Commented] (HBASE-5447) Support for custom filters with PB-based RPC

2012-07-18 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418059#comment-13418059
 ] 

Zhihong Ted Yu commented on HBASE-5447:
---

bq. just derive from FilterBasePB (where they are required to implement the 
readFields and write methods, as today)
I guess there was a typo above: FilterBaseWritable should have been used.

 Support for custom filters with PB-based RPC
 

 Key: HBASE-5447
 URL: https://issues.apache.org/jira/browse/HBASE-5447
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Todd Lipcon







[jira] [Commented] (HBASE-6399) MetricsContext be different between RegionServerMetrics and RegionServerDynamicMetrics

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416171#comment-13416171
 ] 

Zhihong Ted Yu commented on HBASE-6399:
---

{code}
+# dynamic.fileName=/tmp/metrics_jvm.log
{code}
The filename should be metrics_dynamic.log or similar, right ?

I assume there is no rrd file pileup with the patch applied.

 MetricsContext be different between RegionServerMetrics and 
 RegionServerDynamicMetrics
 --

 Key: HBASE-6399
 URL: https://issues.apache.org/jira/browse/HBASE-6399
 Project: HBase
  Issue Type: Bug
  Components: metrics
Affects Versions: 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6399.patch


 In hadoop-metrics.properties, GangliaContext is an optional metrics context. I 
 think Ganglia is what is generally used to monitor an HBase cluster.
 However, I found a serious problem:
 RegionServerDynamicMetrics generates lots of rrd files whenever we 
 move a region or create/delete a table. 
 Especially if tables are created every day, as in some applications, more 
 and more rrd files pile up on the Gmetad server, which will eventually 
 corrupt it.
 IMO, the MetricsContext should be different for RegionServerMetrics and 
 RegionServerDynamicMetrics.





[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416223#comment-13416223
 ] 

Zhihong Ted Yu commented on HBASE-6401:
---

The test is for hadoop.
Is there a HADOOP JIRA for this bug?

 HBase may lose edits after a crash if used with HDFS 1.0.3 or older
 ---

 Key: HBASE-6401
 URL: https://issues.apache.org/jira/browse/HBASE-6401
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.96.0
 Environment: all
Reporter: nkeywal
Priority: Critical
 Attachments: TestReadAppendWithDeadDN.java


 This comes from an hdfs bug, fixed in some hdfs versions. I haven't found the 
 hdfs jira for this.
 Context: the HBase Write Ahead Log feature. It uses hdfs append. If the 
 node crashes, the file that was being written is read by other processes to 
 replay the actions.
 - So we have in hdfs one (dead) process writing with another process reading.
 - But, despite the call to syncFs, we don't always see the data when we have 
 a dead node. It seems to be because the call in DFSClient#updateBlockInfo 
 ignores the ipc errors and sets the length to 0.
 - So we may miss all the writes to the last block if we try to connect to the 
 dead DN.
 hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
 http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
 hdfs branch-2 or trunk: we should not have the issue (but not tested)
 http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
 The attached test will fail ~50% of the time.





[jira] [Commented] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416358#comment-13416358
 ] 

Zhihong Ted Yu commented on HBASE-6400:
---

{code}
+public MasterAdminProtocol getMasterAdmin() throws MasterNotRunningException {
{code}
@Override seems to be missing.
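
As a reminder of what the annotation buys, here is a small self-contained sketch. `Connection`, `ConnectionImpl` and `MasterAdmin` are stand-in names invented for the example, not the HBase types. With `@Override` present, the compiler verifies that the method really implements something declared in the interface.

```java
// Hypothetical stand-ins for HConnection / MasterAdminProtocol.
interface MasterAdmin {
  String name();
}

interface Connection {
  MasterAdmin getMasterAdmin() throws Exception;
}

final class ConnectionImpl implements Connection {
  // @Override makes the compiler check this method against Connection.
  // If the interface method is later renamed or removed, this becomes a
  // compile error instead of silently leaving a dead method behind.
  @Override
  public MasterAdmin getMasterAdmin() {
    return () -> "master-admin";
  }
}

final class OverrideSketch {
  public static void main(String[] args) throws Exception {
    Connection conn = new ConnectionImpl();
    System.out.println(conn.getMasterAdmin().name());
  }
}
```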

 Add getMasterAdmin() and getMasterMonitor() to HConnection
 --

 Key: HBASE-6400
 URL: https://issues.apache.org/jira/browse/HBASE-6400
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-6400_v1.patch


 HConnection used to have getMasterInterface(), but after HBASE-6039 it has 
 been removed. I think we need to expose HConnection.getMasterAdmin() and 
 getMasterMonitor() a la HConnection.getAdmin(), and getClient(). 
 HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason 
 to leak keep alive classes to upper layers.





[jira] [Commented] (HBASE-6403) RegionCoprocessorHost provides empty config when loading a coprocessor

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416421#comment-13416421
 ] 

Zhihong Ted Yu commented on HBASE-6403:
---

Nice finding, Eric.

Looks like HBaseConfiguration.create() should be aware of the type of 
Configuration object passed to it.
Meaning, CompoundConfiguration should be pulled into hbase-common module.

 RegionCoprocessorHost provides empty config when loading a coprocessor
 --

 Key: HBASE-6403
 URL: https://issues.apache.org/jira/browse/HBASE-6403
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Eric Newton
Priority: Minor

 I started playing with Giraffa.  I am running it against Hadoop 2.0.0-alpha, 
 and current HBase trunk.  On line 159 of RegionCoprocessorHost, the server's 
 configuration is copied... or at least an attempt is made to copy it.  
 However, the server's configuration object, a CompoundConfiguration, does not 
 store the data in the same way as the base Configuration object, and so 
 nothing is copied. This leaves the coprocessor without access to 
 configuration values, like the fs.defaultFS, which Giraffa is looking for.
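
 The failure mode described above can be reproduced in miniature without any Hadoop classes. In the sketch below, `BaseConfig` and `CompoundConfig` are hypothetical stand-ins, not the real Configuration and CompoundConfiguration, and the mechanism is a simplified assumption: the subclass answers get() from its own storage but inherits the base iterator, so a copy made by iterating over entries comes out empty.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for a base configuration backed by a map.
class BaseConfig implements Iterable<Map.Entry<String, String>> {
  protected final Map<String, String> props = new HashMap<>();

  BaseConfig() {}

  // Copy constructor: copies only what the other object's iterator exposes.
  BaseConfig(BaseConfig other) {
    for (Map.Entry<String, String> e : other) {
      props.put(e.getKey(), e.getValue());
    }
  }

  String get(String key) { return props.get(key); }

  void set(String key, String value) { props.put(key, value); }

  @Override
  public Iterator<Map.Entry<String, String>> iterator() {
    return props.entrySet().iterator();
  }
}

// Hypothetical stand-in for a compound configuration that keeps its data elsewhere.
class CompoundConfig extends BaseConfig {
  private final Map<String, String> layered = new HashMap<>();

  @Override String get(String key) { return layered.get(key); }

  @Override void set(String key, String value) { layered.put(key, value); }
  // iterator() is inherited and still walks the (empty) base map.
}

final class ConfigCopySketch {
  public static void main(String[] args) {
    CompoundConfig server = new CompoundConfig();
    server.set("fs.defaultFS", "hdfs://example:8020");

    BaseConfig copy = new BaseConfig(server);
    System.out.println("server: " + server.get("fs.defaultFS"));
    System.out.println("copy:   " + copy.get("fs.defaultFS"));
  }
}
```

 The lookup on the original object succeeds while the lookup on the copy returns null, which matches the symptom of a coprocessor seeing an empty configuration.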





[jira] [Updated] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection

2012-07-17 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6400:
--

  Description: 
HConnection used to have getMaster() which returns HMasterInterface, but after 
HBASE-6039 it has been removed. I think we need to expose 
HConnection.getMasterAdmin() and getMasterMonitor() a la 
HConnection.getAdmin(), and getClient(). 

HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason to 
leak keep alive classes to upper layers.

  was:
HConnection used to have getMasterInterface(), but after HBASE-6039 it has been 
removed. I think we need to expose HConnection.getMasterAdmin() and 
getMasterMonitor() a la HConnection.getAdmin(), and getClient(). 

HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason to 
leak keep alive classes to upper layers.

Fix Version/s: 0.96.0

 Add getMasterAdmin() and getMasterMonitor() to HConnection
 --

 Key: HBASE-6400
 URL: https://issues.apache.org/jira/browse/HBASE-6400
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: HBASE-6400_v1.patch


 HConnection used to have getMaster() which returns HMasterInterface, but 
 after HBASE-6039 it has been removed. I think we need to expose 
 HConnection.getMasterAdmin() and getMasterMonitor() a la 
 HConnection.getAdmin(), and getClient(). 
 HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason 
 to leak keep alive classes to upper layers.





[jira] [Updated] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection

2012-07-17 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6400:
--

Attachment: 6400-v2.patch

Patch v2 adds @Override.

 Add getMasterAdmin() and getMasterMonitor() to HConnection
 --

 Key: HBASE-6400
 URL: https://issues.apache.org/jira/browse/HBASE-6400
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: 6400-v2.patch, HBASE-6400_v1.patch


 HConnection used to have getMaster() which returns HMasterInterface, but 
 after HBASE-6039 it has been removed. I think we need to expose 
 HConnection.getMasterAdmin() and getMasterMonitor() a la 
 HConnection.getAdmin(), and getClient(). 
 HConnectionImplementation has getKeepAliveMasterAdmin() but, I see no reason 
 to leak keep alive classes to upper layers.





[jira] [Updated] (HBASE-6400) Add getMasterAdmin() and getMasterMonitor() to HConnection

2012-07-17 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6400:
--

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

 Add getMasterAdmin() and getMasterMonitor() to HConnection
 --

 Key: HBASE-6400
 URL: https://issues.apache.org/jira/browse/HBASE-6400
 Project: HBase
  Issue Type: Improvement
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: 6400-v2.patch, HBASE-6400_v1.patch


 HConnection used to have getMaster() which returns HMasterInterface, but 
 after HBASE-6039 it has been removed. I think we need to expose 
 HConnection.getMasterAdmin() and getMasterMonitor() a la 
 HConnection.getAdmin(), and getClient(). 
 HConnectionImplementation has getKeepAliveMasterAdmin(), but I see no reason 
 to leak keep-alive classes to upper layers.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416543#comment-13416543
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

I will integrate the patch later this afternoon if there are no further review 
comments.

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
 HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416599#comment-13416599
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

How long would it take to align them with the work from Elliot ?

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
 HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416608#comment-13416608
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

The above is a nice list.
The title of this JIRA is general sounding.
Does it make sense to create the above sub-tasks (including Elliot's latest 
patch) under this JIRA ?

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
 HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Created] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics

2012-07-17 Thread Zhihong Ted Yu (JIRA)
Zhihong Ted Yu created HBASE-6405:
-

 Summary: Create Hadoop compatibilty modules and Metrics2 
implementation of replication metrics
 Key: HBASE-6405
 URL: https://issues.apache.org/jira/browse/HBASE-6405
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Ted Yu
Assignee: Elliott Clark
 Fix For: 0.96.0








[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-17 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-4050:
--

Status: Open  (was: Patch Available)

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
 HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Updated] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics

2012-07-17 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6405:
--

Attachment: 6405.txt

Same patch as 4050-8.patch from Elliot.

 Create Hadoop compatibilty modules and Metrics2 implementation of replication 
 metrics
 -

 Key: HBASE-6405
 URL: https://issues.apache.org/jira/browse/HBASE-6405
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Ted Yu
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: 6405.txt








[jira] [Commented] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416651#comment-13416651
 ] 

Zhihong Ted Yu commented on HBASE-6405:
---

Integrated to trunk.

Thanks for the patch, Elliot.

Thanks for the review, Stack.

 Create Hadoop compatibilty modules and Metrics2 implementation of replication 
 metrics
 -

 Key: HBASE-6405
 URL: https://issues.apache.org/jira/browse/HBASE-6405
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Ted Yu
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: 6405.txt








[jira] [Commented] (HBASE-6405) Create Hadoop compatibilty modules and Metrics2 implementation of replication metrics

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416679#comment-13416679
 ] 

Zhihong Ted Yu commented on HBASE-6405:
---

Addendum checked in.

Thanks for the quick turn-around, Elliot.

 Create Hadoop compatibilty modules and Metrics2 implementation of replication 
 metrics
 -

 Key: HBASE-6405
 URL: https://issues.apache.org/jira/browse/HBASE-6405
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Ted Yu
Assignee: Elliott Clark
 Fix For: 0.96.0

 Attachments: 6405.txt, HBASE-6405-ADD.patch








[jira] [Commented] (HBASE-6261) Better approximate high-percentile percentile latency metrics

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416706#comment-13416706
 ] 

Zhihong Ted Yu commented on HBASE-6261:
---

bq. copy the code over and then refactor it away
+1 on above.

 Better approximate high-percentile percentile latency metrics
 -

 Key: HBASE-6261
 URL: https://issues.apache.org/jira/browse/HBASE-6261
 Project: HBase
  Issue Type: New Feature
Reporter: Andrew Wang
Assignee: Andrew Wang
  Labels: metrics
 Attachments: Latencyestimation.pdf


 The existing reservoir-sampling based latency metrics in HBase are not 
 well-suited for providing accurate estimates of high-percentile (e.g. 90th, 
 95th, or 99th) latency. This is a well-studied problem in the literature (see 
 [1] and [2]); the question is determining which methods best suit our needs 
 and then implementing them.
 Ideally, we should be able to estimate these high percentiles with minimal 
 memory and CPU usage as well as minimal error (e.g. 1% error on 90th, or .1% 
 on 99th). It's also desirable to provide this over different time-based 
 sliding windows, e.g. last 1 min, 5 mins, 15 mins, and 1 hour.
 I'll note that this would also be useful in HDFS, or really anywhere latency 
 metrics are kept.
 [1] http://www.cs.rutgers.edu/~muthu/bquant.pdf
 [2] http://infolab.stanford.edu/~manku/papers/04pods-sliding.pdf
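
 As a point of reference for the goal above, here is a minimal sketch of an 
 exact nearest-rank percentile over a sliding window. It is a baseline only: 
 it keeps every sample in the window, which is the memory cost the cited 
 approaches try to avoid. The class name WindowPercentile and its methods are 
 hypothetical, and the window is count-based (last N samples) rather than 
 time-based, purely to keep the example short.

```java
import java.util.Arrays;

/** Exact nearest-rank percentile over the last N recorded latencies (illustrative only). */
final class WindowPercentile {
    private final long[] ring;  // fixed-size ring buffer holding the window
    private int count;          // number of valid samples, at most ring.length
    private int next;           // slot that the next sample overwrites

    WindowPercentile(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        ring = new long[capacity];
    }

    /** Adds one latency sample, evicting the oldest once the window is full. */
    void record(long latency) {
        ring[next] = latency;
        next = (next + 1) % ring.length;
        if (count < ring.length) {
            count++;
        }
    }

    /** Returns the p-th percentile (0 < p <= 100) using the nearest-rank rule. */
    long percentile(double p) {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        // Before the buffer wraps, valid samples occupy slots 0..count-1.
        long[] sorted = Arrays.copyOf(ring, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * count / 100.0);
        return sorted[rank - 1];
    }
}
```

 Sorting the whole window on every query costs O(N log N) time and O(N) 
 memory, which is why a bounded-error summary is attractive for the 99th 
 percentile over long windows.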





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416709#comment-13416709
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

bq. filing JIRA to allow metrics removals
+1 on above.

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
 HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Updated] (HBASE-6406) TestReplicationPeer.testResetZooKeeperSession and TestZooKeeper.testClientSessionExpired fail frequently

2012-07-17 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6406:
--

Attachment: testZooKeeper.jstack
testReplication.jstack

I ran the trunk test suite and two surefire JVMs hung.

Here are their jstack outputs.

 TestReplicationPeer.testResetZooKeeperSession and 
 TestZooKeeper.testClientSessionExpired fail frequently
 

 Key: HBASE-6406
 URL: https://issues.apache.org/jira/browse/HBASE-6406
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.1
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.2

 Attachments: testReplication.jstack, testZooKeeper.jstack


 Looking back through the 0.94 test runs these two tests accounted for 11 of 
 34 failed tests.
 They should be fixed or (temporarily) disabled.





[jira] [Commented] (HBASE-6399) MetricsContext be different between RegionServerMetrics and RegionServerDynamicMetrics

2012-07-17 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416839#comment-13416839
 ] 

Zhihong Ted Yu commented on HBASE-6399:
---

Patch v2 looks good to me.

 MetricsContext be different between RegionServerMetrics and 
 RegionServerDynamicMetrics
 --

 Key: HBASE-6399
 URL: https://issues.apache.org/jira/browse/HBASE-6399
 Project: HBase
  Issue Type: Bug
  Components: metrics
Affects Versions: 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6399.patch, HBASE-6399v2.patch


 In hadoop-metrics.properties, GangliaContext is an optional metrics context, 
 and I think Ganglia is what we will generally use to monitor an HBase cluster.
 However, I found a serious problem:
 RegionServerDynamicMetrics generates many rrd files, because regions are 
 moved and tables are created and deleted. 
 In applications that create a table every day, the number of rrd files on the 
 Gmetad server keeps growing, and this will corrupt the Gmetad server.
 IMO, the MetricsContext should be different for RegionServerMetrics and 
 RegionServerDynamicMetrics.





[jira] [Updated] (HBASE-6399) MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics

2012-07-17 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6399:
--

Hadoop Flags: Reviewed
 Summary: MetricsContext should be different between 
RegionServerMetrics and RegionServerDynamicMetrics  (was: MetricsContext be 
different between RegionServerMetrics and RegionServerDynamicMetrics)

 MetricsContext should be different between RegionServerMetrics and 
 RegionServerDynamicMetrics
 -

 Key: HBASE-6399
 URL: https://issues.apache.org/jira/browse/HBASE-6399
 Project: HBase
  Issue Type: Bug
  Components: metrics
Affects Versions: 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0, 0.94.2

 Attachments: HBASE-6399.patch, HBASE-6399v2.patch


 In hadoop-metrics.properties, GangliaContext is an optional metrics context, 
 and I think Ganglia is what we will generally use to monitor an HBase cluster.
 However, I found a serious problem:
 RegionServerDynamicMetrics generates many rrd files, because regions are 
 moved and tables are created and deleted. 
 In applications that create a table every day, the number of rrd files on the 
 Gmetad server keeps growing, and this will corrupt the Gmetad server.
 IMO, the MetricsContext should be different for RegionServerMetrics and 
 RegionServerDynamicMetrics.





[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

2012-07-16 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415410#comment-13415410
 ] 

Zhihong Ted Yu commented on HBASE-6055:
---

Thanks for the hint, Jon.
I thought of that approach.

I recently looked up related classes in the patch using vi directly.
It would be nice if we can reduce the number of classes: controller, monitor, 
manager, sentinel, etc. It is hard to follow :-)

I have gone through about 2.5 pages of diff.
I can see there is more work to be done for Global snapshot.

 Snapshots in HBase 0.96
 ---

 Key: HBASE-6055
 URL: https://issues.apache.org/jira/browse/HBASE-6055
 Project: HBase
  Issue Type: New Feature
  Components: client, master, regionserver, zookeeper
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: Snapshots in HBase.docx


 Continuation of HBASE-50 for the current trunk. Since the implementation has 
 drastically changed, opening as a new ticket.





[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent

2012-07-16 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415421#comment-13415421
 ] 

Zhihong Ted Yu commented on HBASE-6272:
---

Understood, Jimmy.
Sounds like a good plan.

 In-memory region state is inconsistent
 --

 Key: HBASE-6272
 URL: https://issues.apache.org/jira/browse/HBASE-6272
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang

 AssignmentManager stores region state related information in several places: 
 regionsInTransition, regions (region info to server name map), and servers 
 (server name to region info set map).  However the access to these places is 
 not coordinated properly.  It leads to inconsistent in-memory region state 
 information.  Sometimes, some region could even be offline, and not in 
 transition.
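
 To illustrate the kind of coordination the description says is missing, here 
 is a minimal sketch that keeps the three structures behind one lock so that 
 every update touches all of them together. RegionStateSketch and its method 
 names are hypothetical, regions and servers are plain strings, and this is 
 not the design proposed on this issue; it only shows why uncoordinated 
 updates to the three collections can leave a region both offline and not in 
 transition.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical holder that updates all region-state structures under one lock. */
final class RegionStateSketch {
    private final Set<String> regionsInTransition = new HashSet<>();
    private final Map<String, String> regions = new HashMap<>();       // region -> server
    private final Map<String, Set<String>> servers = new HashMap<>();  // server -> regions

    /** Marks a region as in transition and removes it from its current server. */
    synchronized void startTransition(String region) {
        regionsInTransition.add(region);
        String old = regions.remove(region);
        if (old != null) {
            Set<String> held = servers.get(old);
            if (held != null) {
                held.remove(region);
            }
        }
    }

    /** Records that a region is now online on the given server. */
    synchronized void regionOnline(String region, String server) {
        regionsInTransition.remove(region);
        String old = regions.put(region, server);
        if (old != null && !old.equals(server)) {
            Set<String> held = servers.get(old);
            if (held != null) {
                held.remove(region);
            }
        }
        servers.computeIfAbsent(server, k -> new HashSet<>()).add(region);
    }

    synchronized boolean isInTransition(String region) {
        return regionsInTransition.contains(region);
    }

    /** Returns the hosting server, or null if the region is not online. */
    synchronized String serverOf(String region) {
        return regions.get(region);
    }

    /** Returns a copy so callers never see the map mid-update. */
    synchronized Set<String> regionsOn(String server) {
        Set<String> held = servers.get(server);
        return held == null ? new HashSet<String>() : new HashSet<String>(held);
    }
}
```

 With a single monitor, a reader can never observe a region that has left 
 regions but has not yet been added to regionsInTransition.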





[jira] [Updated] (HBASE-6336) Split point should not be equal to start row or end row

2012-07-16 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6336:
--

Summary: Split point should not be equal to start row or end row  (was: 
Split point should not be equal with start row or end row)

 Split point should not be equal to start row or end row
 ---

 Key: HBASE-6336
 URL: https://issues.apache.org/jira/browse/HBASE-6336
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6336.patch


 Should we allow the split point to equal the region's start row or end row?
 {code}
 // if the midkey is the same as the first and last keys, then we cannot
 // (ever) split this region.
 if (this.comparator.compareRows(mk, firstKey) == 0 &&
     this.comparator.compareRows(mk, lastKey) == 0) {
   if (LOG.isDebugEnabled()) {
     LOG.debug("cannot split because midkey is the same as first or " +
       "last row");
   }
 {code}
 Here, I think it is a mistake.
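
 A minimal sketch of the check being argued for here, assuming the intent is 
 to reject a midkey that matches either boundary, not only one that matches 
 both. SplitPointCheck and canSplitAt are hypothetical names, and plain 
 byte-array equality stands in for the store's row comparator; this is not 
 the code from HBASE-6336.patch.

```java
import java.util.Arrays;

/** Hypothetical helper showing the "either boundary" form of the split check. */
final class SplitPointCheck {
    /**
     * Returns true when midKey can serve as a split point, i.e. when it differs
     * from both the first and the last key. Splitting at a boundary key would
     * leave one of the two halves with no rows.
     */
    static boolean canSplitAt(byte[] midKey, byte[] firstKey, byte[] lastKey) {
        boolean sameAsFirst = Arrays.equals(midKey, firstKey);
        boolean sameAsLast = Arrays.equals(midKey, lastKey);
        // "||" rather than "&&": matching either boundary disqualifies the key.
        return !(sameAsFirst || sameAsLast);
    }
}
```

 With the conjunction in the quoted snippet, a midkey equal to the first key 
 alone would still be accepted, which is the case the issue title rules out.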





[jira] [Commented] (HBASE-6336) Split point should not be equal to start row or end row

2012-07-16 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415874#comment-13415874
 ] 

Zhihong Ted Yu commented on HBASE-6336:
---

Integrated to trunk.

Thanks for the patch, Chunhui.

Thanks for the review, Stack and Ram.

 Split point should not be equal to start row or end row
 ---

 Key: HBASE-6336
 URL: https://issues.apache.org/jira/browse/HBASE-6336
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6336.patch


 Should we allow the split point to equal the region's start row or end row?
 {code}
 // if the midkey is the same as the first and last keys, then we cannot
 // (ever) split this region.
 if (this.comparator.compareRows(mk, firstKey) == 0 &&
     this.comparator.compareRows(mk, lastKey) == 0) {
   if (LOG.isDebugEnabled()) {
     LOG.debug("cannot split because midkey is the same as first or " +
       "last row");
   }
 {code}
 Here, I think it is a mistake.





[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2012-07-16 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415879#comment-13415879
 ] 

Zhihong Ted Yu commented on HBASE-5416:
---

I ran TestJoinedScanners on Linux and observed the following in test output:
{code}
2012-07-16 17:31:52,339 INFO  [main] regionserver.TestJoinedScanners(152): Slow 
scanner finished in 96.393137286 seconds, got 1000 rows
...
2012-07-16 17:32:05,026 INFO  [main] regionserver.TestJoinedScanners(172): 
Joined scanner finished in 12.687607287 seconds, got 1000 rows
{code}

 Improve performance of scans with some kind of filters.
 ---

 Key: HBASE-5416
 URL: https://issues.apache.org/jira/browse/HBASE-5416
 Project: HBase
  Issue Type: Improvement
  Components: filters, performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Max Lapan
 Fix For: 0.96.0

 Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, 
 Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, 
 Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, 
 Filtered_scans_v7.patch


 When a scan is performed, the whole row is loaded into the result list, and 
 only then is the filter (if any) applied to decide whether the row is needed.
 But when a scan covers several CFs and the filter checks data from only a 
 subset of them, data from the CFs the filter does not check is not needed at 
 the filter stage; it is needed only once we have decided to include the 
 current row. In that case we can significantly reduce the amount of IO a scan 
 performs by loading only the values the filter actually checks.
 For example, we have two CFs: flags and snap. Flags is quite small (a bunch of 
 megabytes) and is used to filter large entries from snap. Snap is very large 
 (10s of GB) and is quite costly to scan. If we need only rows with some flag 
 set, we use SingleColumnValueFilter to limit the result to a small subset of 
 the region. But the current implementation loads both CFs to perform the 
 scan, when only a small subset is needed.
 The attached patch adds one routine to the Filter interface that lets a 
 filter specify which CFs it needs for its operation. In HRegion, we separate 
 all scanners into two groups: those needed by the filter and the rest 
 (joined). When a new row is considered, only the needed data is loaded and 
 the filter is applied; only if the filter accepts the row is the rest of the 
 data loaded. On our data, this speeds up such scans 30-50 times. It also 
 gives us a way to better normalize the data into separate columns by 
 optimizing the scans performed.
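
 The two-phase idea described above can be sketched as follows. This is an 
 illustration under simplifying assumptions, not the HRegion code from the 
 patch: JoinedScanSketch is a hypothetical class, each column family is 
 modelled as an in-memory map from row key to a single value, and a counter 
 stands in for the IO spent on the large family. Rows that have no entry in 
 the small family are simply not visited here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical model of a scan that loads the large family only for accepted rows. */
final class JoinedScanSketch {
    private final Map<String, String> flags;  // small family, checked by the filter
    private final Map<String, String> snap;   // large family, costly to read
    private int snapReads;                    // how many rows touched the large family

    JoinedScanSketch(Map<String, String> flags, Map<String, String> snap) {
        this.flags = flags;
        this.snap = snap;
    }

    /** Returns "rowKey=snapValue" for every row whose flag passes the filter. */
    List<String> scan(Predicate<String> flagFilter) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : flags.entrySet()) {
            // Phase 1: evaluate the filter using the small family only.
            if (!flagFilter.test(e.getValue())) {
                continue;  // rejected rows never touch the large family
            }
            // Phase 2: the row is accepted, so load the remaining (joined) data.
            snapReads++;
            rows.add(e.getKey() + "=" + snap.get(e.getKey()));
        }
        return rows;
    }

    int snapReads() {
        return snapReads;
    }
}
```

 The saving grows with the fraction of rows the filter rejects, which matches 
 the flags/snap example: most rows are dropped after reading only flags.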





[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

2012-07-15 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414689#comment-13414689
 ] 

Zhihong Ted Yu commented on HBASE-6055:
---

Flipping through 5 pages on review board is slow. So I am putting down some 
notes here.

For HStore.java:
The license header doesn't look like the standard format.
Please add audience and stability annotations to this new interface.
{code}
+  FileStatus[] getStoreFiles() throws IOException;
+
+  List<StoreFile> getStorefiles();
{code}
Why do we need two methods which are spelled almost the same, yet return 
different types? When refactoring, we should make the code cleaner.
Many methods don't have javadoc. Please add javadoc for them.
{code}
+  public HStore getDelgate() {
{code}
Correct spelling for the above method.

 Snapshots in HBase 0.96
 ---

 Key: HBASE-6055
 URL: https://issues.apache.org/jira/browse/HBASE-6055
 Project: HBase
  Issue Type: New Feature
  Components: client, master, regionserver, zookeeper
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: Snapshots in HBase.docx


 Continuation of HBASE-50 for the current trunk. Since the implementation has 
 drastically changed, opening as a new ticket.





[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent

2012-07-15 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414690#comment-13414690
 ] 

Zhihong Ted Yu commented on HBASE-6272:
---

@Jimmy:
Can you clarify whether the test on trunk was performed in a live cluster ?

 In-memory region state is inconsistent
 --

 Key: HBASE-6272
 URL: https://issues.apache.org/jira/browse/HBASE-6272
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang

 AssignmentManager stores region state related information in several places: 
 regionsInTransition, regions (region info to server name map), and servers 
 (server name to region info set map).  However the access to these places is 
 not coordinated properly.  It leads to inconsistent in-memory region state 
 information.  Sometimes, some region could even be offline, and not in 
 transition.





[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent

2012-07-15 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414767#comment-13414767
 ] 

Zhihong Ted Yu commented on HBASE-6272:
---

Please consider the scenario described in HBASE-6060

Thanks

 In-memory region state is inconsistent
 --

 Key: HBASE-6272
 URL: https://issues.apache.org/jira/browse/HBASE-6272
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang

 AssignmentManager stores region state related information in several places: 
 regionsInTransition, regions (region info to server name map), and servers 
 (server name to region info set map).  However the access to these places is 
 not coordinated properly.  It leads to inconsistent in-memory region state 
 information.  Sometimes, some region could even be offline, and not in 
 transition.





[jira] [Created] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)
Zhihong Ted Yu created HBASE-6396:
-

 Summary: Fix NoSuchMethodError running against hadoop 2.0
 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu


HADOOP-8350 changed the signature of NetUtils.getInputStream()
This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().

See 
https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276






[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6396:
--

Attachment: 6396.txt

Patch allows TestPBOnWritableRpc to pass against hadoop 2.0

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0

 Attachments: 6396.txt


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6396:
--

Fix Version/s: 0.96.0

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0

 Attachments: 6396.txt


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6396:
--

Status: Patch Available  (was: Open)

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0

 Attachments: 6396.txt


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Commented] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414436#comment-13414436
 ] 

Zhihong Ted Yu commented on HBASE-6396:
---

I first saw this error from the following thread:
http://search-hadoop.com/m/MeBkIxlDj52/Hive+and+CDH4+GA&subj=Hive+and+CDH4+GA

I put my suggestion on HADOOP-8350 as to how the impact on downstream projects 
can be minimized.

But the cast is safe to have regardless of how HADOOP-8350 is implemented.

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0

 Attachments: 6396.txt


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6396:
--

Status: Open  (was: Patch Available)

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6396:
--

Attachment: (was: 6396.txt)

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6396:
--

Status: Patch Available  (was: Open)

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0

 Attachments: 6396-v2.txt


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Updated] (HBASE-6396) Fix NoSuchMethodError running against hadoop 2.0

2012-07-14 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6396:
--

Attachment: 6396-v2.txt

I used the following commands to verify patch v2:
{code}
mvn clean compile
nohup mvn help:active-profiles -Dhadoop.profile=2.0 test > ../suite.txt &
{code}

 Fix NoSuchMethodError running against hadoop 2.0
 

 Key: HBASE-6396
 URL: https://issues.apache.org/jira/browse/HBASE-6396
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu
Assignee: Zhihong Ted Yu
 Fix For: 0.96.0

 Attachments: 6396-v2.txt


 HADOOP-8350 changed the signature of NetUtils.getInputStream()
 This leads to NoSuchMethodError in HBaseClient$Connection.setupIOstreams().
 See 
 https://issues.apache.org/jira/browse/HADOOP-8350?focusedCommentId=13414276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414276





[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414161#comment-13414161
 ] 

Zhihong Ted Yu commented on HBASE-6394:
---

{code}
+replicatedScanner.close();
{code}
I was expecting 'replicatedScanner = null' following the above call.
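
A minimal sketch of the pattern suggested here, assuming the NullPointerException arises when cleanup() runs while replicatedScanner is still unset. SimpleScanner and ScannerHolder are hypothetical stand-ins for illustration, not the actual VerifyReplication classes:

```java
// Hypothetical stand-in for a scanner whose close() throws no checked exception.
interface SimpleScanner {
    void close();
}

// Hypothetical holder that mimics a mapper keeping a scanner in a field.
class ScannerHolder {
    private SimpleScanner replicatedScanner; // stays null if no scanner was ever opened

    void open(SimpleScanner scanner) {
        this.replicatedScanner = scanner;
    }

    // Safe when the scanner was never opened, and safe to call more than once.
    void cleanup() {
        if (replicatedScanner != null) {
            replicatedScanner.close();
            replicatedScanner = null; // drop the reference so a later cleanup() is a no-op
        }
    }

    boolean isOpen() {
        return replicatedScanner != null;
    }

    public static void main(String[] args) {
        ScannerHolder holder = new ScannerHolder();
        holder.cleanup(); // never opened: nothing to close
        holder.open(() -> System.out.println("scanner closed"));
        holder.cleanup();
        holder.cleanup(); // second call does nothing
        System.out.println("open after cleanup: " + holder.isOpen());
    }
}
```

Resetting the field after close() means the null check also guards against closing the same scanner twice.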

 verifyrep MR job map tasks throws NullPointerException 
 ---

 Key: HBASE-6394
 URL: https://issues.apache.org/jira/browse/HBASE-6394
 Project: HBase
  Issue Type: Bug
  Components: replication
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: 6394-trunk.patch


 {noformat}
 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
 Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
 child
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
 for the task
 {noformat}





[jira] [Created] (HBASE-6395) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)
Zhihong Ted Yu created HBASE-6395:
-

 Summary: TestFSSchedulerApp should be in scheduler.fair package
 Key: HBASE-6395
 URL: https://issues.apache.org/jira/browse/HBASE-6395
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu


MAPREDUCE-3451 added Fair Scheduler to MRv2

TestFSSchedulerApp was added under 
src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair but 
its package was declared to be 
org.apache.hadoop.yarn.server.resourcemanager.scheduler





[jira] [Resolved] (HBASE-6395) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu resolved HBASE-6395.
---

Resolution: Won't Fix

This should have been a MAPREDUCE JIRA.

 TestFSSchedulerApp should be in scheduler.fair package
 --

 Key: HBASE-6395
 URL: https://issues.apache.org/jira/browse/HBASE-6395
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu

 MAPREDUCE-3451 added Fair Scheduler to MRv2
 TestFSSchedulerApp was added under 
 src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair 
 but its package was declared to be 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler





[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414189#comment-13414189
 ] 

Zhihong Ted Yu commented on HBASE-6394:
---

+1 on patch v2.

 verifyrep MR job map tasks throws NullPointerException 
 ---

 Key: HBASE-6394
 URL: https://issues.apache.org/jira/browse/HBASE-6394
 Project: HBase
  Issue Type: Bug
  Components: replication
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: 6394-trunk.patch, 6394-trunk_v2.patch


 {noformat}
 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
 Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
 child
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
 for the task
 {noformat}





[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize

2012-07-12 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6380:
--

Attachment: 6380-trunk.txt

Patch rebased for trunk.

 bulkload should update the store.storeSize
 --

 Key: HBASE-6380
 URL: https://issues.apache.org/jira/browse/HBASE-6380
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0, 0.96.0
Reporter: Jie Huang
Priority: Critical
 Attachments: 6380-trunk.txt, hbase-6380_0_94_0.patch


 After bulkloading some HFiles into the table, we found that the force-split 
 didn't work because MidKey == NULL. Only after we rebooted the HBase service 
 did the force-split work normally.





[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize

2012-07-12 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6380:
--

Attachment: (was: 6380-trunk.txt)

 bulkload should update the store.storeSize
 --

 Key: HBASE-6380
 URL: https://issues.apache.org/jira/browse/HBASE-6380
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0, 0.96.0
Reporter: Jie Huang
Priority: Critical
 Attachments: 6380-trunk.txt, hbase-6380_0_94_0.patch


 After bulkloading some HFiles into the table, we found that the force-split 
 didn't work because MidKey == NULL. Only after we rebooted the HBase service 
 did the force-split work normally.





[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize

2012-07-12 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6380:
--

Attachment: 6380-trunk.txt

 bulkload should update the store.storeSize
 --

 Key: HBASE-6380
 URL: https://issues.apache.org/jira/browse/HBASE-6380
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0, 0.96.0
Reporter: Jie Huang
Priority: Critical
 Attachments: 6380-trunk.txt, hbase-6380_0_94_0.patch


 After bulkloading some HFiles into the table, we found that the force-split 
 didn't work because MidKey == NULL. Only after we rebooted the HBase service 
 did the force-split work normally.





[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize

2012-07-12 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6380:
--

Status: Patch Available  (was: Open)

 bulkload should update the store.storeSize
 --

 Key: HBASE-6380
 URL: https://issues.apache.org/jira/browse/HBASE-6380
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0, 0.96.0
Reporter: Jie Huang
Priority: Critical
 Attachments: 6380-trunk.txt, hbase-6380_0_94_0.patch


 After bulkloading some HFiles into the table, we found that the force-split 
 didn't work because MidKey == NULL. Only after we rebooted the HBase service 
 did the force-split work normally.





[jira] [Commented] (HBASE-6382) Upgrade Jersey to 1.8 to match Hadoop 2

2012-07-12 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413023#comment-13413023
 ] 

Zhihong Ted Yu commented on HBASE-6382:
---

Looks like hadoop 1.0 is using 1.8 as well:
{code}
ivy/libraries.properties:jersey-core.version=1.8
ivy/libraries.properties:jersey-json.version=1.8
ivy/libraries.properties:jersey-server.version=1.8
{code}
Suggest changing the title of this JIRA.

 Upgrade Jersey to 1.8 to match Hadoop 2
 ---

 Key: HBASE-6382
 URL: https://issues.apache.org/jira/browse/HBASE-6382
 Project: HBase
  Issue Type: Improvement
  Components: rest
Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang

 Upgrade Jersey dependency from 1.4 to 1.8 to match Hadoop 2 dependency.





[jira] [Commented] (HBASE-6383) Investigate using 2Q for block cache

2012-07-12 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413047#comment-13413047
 ] 

Zhihong Ted Yu commented on HBASE-6383:
---

Found this: 
http://code.google.com/p/custard-cache/source/browse/trunk/custard-cache-policies/src/main/java/com/custardsource/cache/policy/twoq/TwoQCacheManager.java?r=38
custard-cache is Apache License 2.0

 Investigate using 2Q for block cache
 

 Key: HBASE-6383
 URL: https://issues.apache.org/jira/browse/HBASE-6383
 Project: HBase
  Issue Type: New Feature
  Components: performance, regionserver
Affects Versions: 0.96.0
Reporter: Jesse Yates
Priority: Minor

 Currently we use a basic version of LRU to handle block caching. LRU is known 
 to be very susceptible to thrashing by scans (it is not scan resistant), and 
 scans are a common operation in HBase. 2Q is an efficient caching algorithm 
 that emulates the effectiveness of LRU/2 (eviction based not on the last 
 access, but rather the access before the last), but is O(1), rather than 
 O(lg(n)) in complexity.
 JD has long been talking about investigating 2Q as it may be far better for 
 HBase than LRU and has been shown to be incredibly useful for traditional 
 database caching on production systems.
 One would need to implement 2Q (though the pseudocode in the paper is quite 
 explicit) and then test against the existing cache implementation.
 The link to the original paper is here: www.vldb.org/conf/1994/P439.PDF
 A short overview of 2Q:
 2Q uses two queues (hence the name) and a list of pointers to keep track of 
 cached blocks. The first queue is for new, hot items (Ain). If an item is 
 accessed that isn't in Ain, the coldest block is evicted from Ain and the new 
 item replaces it. Anything accessed in Ain is already stored in memory and 
 kept in Ain.
 When a block is evicted from Ain, it is moved to Aout _as a pointer_. If Aout 
 is full, the oldest element is evicted and replaced with the new pointer.
 The key to 2Q comes in that when you access something in Aout, it is reloaded 
 into memory and stored in queue B. If B becomes full, then the coldest block 
 is evicted. 
 This essentially makes Aout a filter for long-term hot items, based on the 
 size of Aout. The original authors found that while you can tune Aout, it 
 generally performs very well at at 50% of the number of pages as would fit 
 into the buffer, but can be tuned as low as 5% at only a slight cost to 
 responsiveness to changes.
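
The overview above can be sketched roughly as follows. This is a simplified reading of the description, not the paper's exact pseudocode and not HBase code: TwoQCacheSketch, its capacities, the loader function, and the where() helper are all hypothetical names introduced for illustration, and the loader is assumed never to return null.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.function.Function;

// Hypothetical sketch of the 2Q bookkeeping described above; capacities are assumed >= 1.
class TwoQCacheSketch<K, V> {
    private final int ainCapacity;
    private final int aoutCapacity;
    private final int bCapacity;
    private final Function<K, V> loader; // loads a block on a miss; assumed non-null results

    private final LinkedHashMap<K, V> ain = new LinkedHashMap<>(); // FIFO of newly loaded blocks
    private final LinkedHashSet<K> aout = new LinkedHashSet<>();   // FIFO of keys only ("pointers")
    private final LinkedHashMap<K, V> b =
        new LinkedHashMap<>(16, 0.75f, true);                      // access order, coldest first

    TwoQCacheSketch(int ainCapacity, int aoutCapacity, int bCapacity, Function<K, V> loader) {
        this.ainCapacity = ainCapacity;
        this.aoutCapacity = aoutCapacity;
        this.bCapacity = bCapacity;
        this.loader = loader;
    }

    V get(K key) {
        V value = b.get(key);      // a hit in B refreshes its recency
        if (value != null) return value;
        value = ain.get(key);      // a hit in Ain leaves the FIFO order unchanged
        if (value != null) return value;

        value = loader.apply(key); // miss: the block has to be (re)loaded
        if (aout.remove(key)) {
            // Still remembered in Aout, so treat it as long-term hot and store it in B.
            b.put(key, value);
            if (b.size() > bCapacity) removeEldest(b.keySet().iterator());
        } else {
            ain.put(key, value);
            if (ain.size() > ainCapacity) {
                // The block leaving Ain is remembered in Aout by key only.
                aout.add(removeEldest(ain.keySet().iterator()));
                if (aout.size() > aoutCapacity) removeEldest(aout.iterator());
            }
        }
        return value;
    }

    // Reports which structure currently tracks the key; does not change any ordering.
    String where(K key) {
        if (b.containsKey(key)) return "B";
        if (ain.containsKey(key)) return "Ain";
        if (aout.contains(key)) return "Aout";
        return "none";
    }

    private static <T> T removeEldest(Iterator<T> it) {
        T eldest = it.next();
        it.remove();
        return eldest;
    }

    public static void main(String[] args) {
        TwoQCacheSketch<Integer, String> cache =
            new TwoQCacheSketch<>(2, 2, 2, k -> "block-" + k);
        cache.get(1); cache.get(2); cache.get(3); // 1 falls out of Ain into Aout
        cache.get(1);                             // re-access promotes 1 into B
        cache.get(4); cache.get(5); cache.get(6); // a scan churns Ain and Aout only
        System.out.println("key 1 is in: " + cache.where(1));
    }
}
```

In this sketch a one-pass scan only cycles blocks through Ain and Aout, so blocks already promoted to B are not displaced by it, which is the scan resistance the description is after.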





[jira] [Updated] (HBASE-5547) Don't delete HFiles when in backup mode

2012-07-12 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-5547:
--

Attachment: 5547-v12.txt

 Don't delete HFiles when in backup mode
 -

 Key: HBASE-5547
 URL: https://issues.apache.org/jira/browse/HBASE-5547
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates
 Fix For: 0.94.2

 Attachments: 5547-v12.txt, hbase-5447-v8.patch, hbase-5447-v8.patch, 
 hbase-5547-v9.patch, java_HBASE-5547_v4.patch, java_HBASE-5547_v5.patch, 
 java_HBASE-5547_v6.patch, java_HBASE-5547_v7.patch


 This came up in a discussion I had with Stack.
 It would be nice if HBase could be notified that a backup is in progress (via 
 a znode for example) and in that case either:
 1. rename HFiles to be deleted to file.bck
 2. rename the HFiles into a special directory
 3. rename them to a general trash directory (which would not need to be tied 
 to backup mode).
 That way it should be able to get a consistent backup based on HFiles (HDFS 
 snapshots or hard links would be better options here, but we do not have 
 those).
 #1 makes cleanup a bit harder.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-12 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413197#comment-13413197
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

Putting patch on review board would help.
https://reviews.apache.org/r/new/ gave me Error 500 ...

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Alex Baranau
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in a future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use the Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-12 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413319#comment-13413319
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

Latest patch didn't compile against hadoop 2.0
See https://builds.apache.org/job/PreCommit-HBASE-Build/2372/console

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Alex Baranau
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in a future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use the Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-12 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413356#comment-13413356
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

{code}
+public interface BaseMetricsSource {
{code}
I suggest adding the following for above interface:
{code}
@InterfaceAudience.Public
@InterfaceStability.Evolving
{code}
{code}
+   * Subtract some amount to a gauge.
{code}
'to a' -> 'from a'
{code}
+rms = 
ServiceLoader.load(ReplicationMetricsSource.class).iterator().next();
{code}
Shall we traverse the iterator and warn the user if more than one 
implementation is found?
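
One way the suggestion could look, as a rough sketch only: SingleServiceLoader and its method names are hypothetical, the warning goes to stderr rather than through whatever logger the patch uses, and the choice to keep the first implementation found is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical helper: pick the first service implementation and warn about any others.
class SingleServiceLoader {

    // Walks the whole iterator so that extra implementations are noticed, not silently ignored.
    static <T> T firstAndWarn(Iterator<T> it, String serviceName) {
        if (!it.hasNext()) {
            throw new IllegalStateException("No implementation found for " + serviceName);
        }
        T first = it.next();
        List<String> extras = new ArrayList<>();
        while (it.hasNext()) {
            extras.add(it.next().getClass().getName());
        }
        if (!extras.isEmpty()) {
            System.err.println("WARN: multiple implementations of " + serviceName
                + " found; using " + first.getClass().getName() + ", ignoring " + extras);
        }
        return first;
    }

    // Convenience wrapper around java.util.ServiceLoader.
    static <T> T load(Class<T> service) {
        return firstAndWarn(ServiceLoader.load(service).iterator(), service.getName());
    }

    public static void main(String[] args) {
        // Plain strings stand in for service implementations in this demo.
        String chosen = firstAndWarn(Arrays.asList("impl-a", "impl-b").iterator(), "demo");
        System.out.println("chosen: " + chosen);
    }
}
```

With a helper like this, the quoted line would become something along the lines of `rms = SingleServiceLoader.load(ReplicationMetricsSource.class);`.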

Will continue on the review board.

 Update HBase metrics framework to metrics2 framework
 

 Key: HBASE-4050
 URL: https://issues.apache.org/jira/browse/HBASE-4050
 Project: HBase
  Issue Type: New Feature
  Components: metrics
Affects Versions: 0.90.4
 Environment: Java 6
Reporter: Eric Yang
Assignee: Alex Baranau
Priority: Critical
 Fix For: 0.96.0

 Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
 HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
 HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050.patch


 Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
 and it might get removed in a future Hadoop release.  Hence, HBase needs to 
 revise the dependency of MetricsContext to use the Metrics2 framework.





[jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-12 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6389:
--

Hadoop Flags: Reviewed

 Modify the conditions to ensure that Master waits for sufficient number of 
 Region Servers before starting region assignments
 

 Key: HBASE-6389
 URL: https://issues.apache.org/jira/browse/HBASE-6389
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-6389_trunk.patch


 Continuing from HBASE-6375.
 It seems I was mistaken in my assumption that changing the value of 
 hbase.master.wait.on.regionservers.mintostart to a sufficient number (from 
 the default of 1) can help prevent assignment of all regions to one (or a small 
 number of) region server(s).
 While this was the case in 0.90.x and 0.92.x, the behavior has changed from 
 0.94.0 onwards to address HBASE-4993.
 From 0.94.0 onwards, the Master will proceed immediately after the timeout has 
 lapsed, even if hbase.master.wait.on.regionservers.mintostart has not been 
 reached.
 Reading the current conditions of waitForRegionServers() clarifies this:
 {code:title=ServerManager.java (trunk rev:1360470)}
 
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *    there have been no new region server in for
    *    'hbase.master.wait.on.regionservers.interval' time
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
   throws InterruptedException {
 ..
     while (
       !this.master.isStopped() &&
         slept < timeout &&
         count < maxToStart &&
         (lastCountChange+interval > now || count < minToStart)
       ){
 ..
 {code}
 So with the current conditions, the wait will end as soon as the timeout is 
 reached, even if fewer RSes than required have checked in with the Master, and 
 the master will proceed with the region assignment among these RSes alone.
 As mentioned in 
 -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
  and I concur, this could have a disastrous effect in a large cluster, especially 
 now that MSLAB is turned on.
 To enforce the required quorum as specified by 
 hbase.master.wait.on.regionservers.mintostart irrespective of the timeout, 
 these conditions need to be modified as follows:
 {code:title=ServerManager.java}
 ..
   /**
    * Wait for the region servers to report in.
    * We will wait until one of this condition is met:
    *  - the master is stopped
    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
    *    region servers is reached
    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
    *    there have been no new region server in for
    *    'hbase.master.wait.on.regionservers.interval' time AND
    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
    *
    * @throws InterruptedException
    */
   public void waitForRegionServers(MonitoredTask status)
 ..
 ..
     int minToStart = this.master.getConfiguration().
         getInt("hbase.master.wait.on.regionservers.mintostart", 1);
     int maxToStart = this.master.getConfiguration().
         getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
     if (maxToStart < minToStart) {
       maxToStart = minToStart;
     }
 ..
 ..
     while (
       !this.master.isStopped() &&
         count < maxToStart &&
         (lastCountChange+interval > now || timeout > slept || count < minToStart)
       ){
 ..
 {code}
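
 To make the difference between the two loop conditions concrete, here is a minimal standalone sketch. It is not the HBase source: the class and method names (WaitConditionSketch, keepWaitingCurrent, keepWaitingProposed) are hypothetical, the parameters only mirror the variable names used in the description above, and the numbers in main() are illustrative values rather than configured defaults.

```java
// Sketch only: hypothetical class, parameter names mirror the variables quoted in the issue.
final class WaitConditionSketch {

    /** Loop condition as quoted from trunk: a lapsed timeout alone ends the wait. */
    static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout,
            int count, int minToStart, int maxToStart,
            long lastCountChange, long interval, long now) {
        return !stopped
            && slept < timeout
            && count < maxToStart
            && (lastCountChange + interval > now || count < minToStart);
    }

    /** Proposed loop condition: the timeout can no longer override minToStart. */
    static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
            int count, int minToStart, int maxToStart,
            long lastCountChange, long interval, long now) {
        return !stopped
            && count < maxToStart
            && (lastCountChange + interval > now || timeout > slept || count < minToStart);
    }

    public static void main(String[] args) {
        // Illustrative scenario: the timeout has lapsed, only 1 of the 3 required
        // region servers has checked in, and no new server arrived within the interval.
        boolean current = keepWaitingCurrent(
            false, 5000L, 4500L, 1, 3, Integer.MAX_VALUE, 0L, 1500L, 5000L);
        boolean proposed = keepWaitingProposed(
            false, 5000L, 4500L, 1, 3, Integer.MAX_VALUE, 0L, 1500L, 5000L);
        System.out.println("current condition keeps waiting:  " + current);
        System.out.println("proposed condition keeps waiting: " + proposed);
    }
}
```

 In this scenario the current condition stops waiting because slept is no longer below timeout, while the proposed one keeps waiting because count is still below minToStart.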





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-12 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413441#comment-13413441
 ] 

Zhihong Ted Yu commented on HBASE-6389:
---

Several variables are no longer final but I only see this extra assignment:
{code}
+maxToStart = minToStart;
{code}
It would be nice to keep other variables final.






[jira] [Commented] (HBASE-6369) HTable is not closed in AggregationClient

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13411271#comment-13411271
 ] 

Zhihong Ted Yu commented on HBASE-6369:
---

Hadoop QA actually passed:
{code}
[INFO] HBase . SUCCESS [1.927s]
[INFO] HBase - Common  SUCCESS [4.046s]
[INFO] HBase - Server  SUCCESS [38:32.534s]
[INFO] HBase - Integration Tests . SUCCESS [1.405s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 38:40.251s
[INFO] Finished at: Wed Jul 11 05:42:45 UTC 2012
{code}
Will integrate tomorrow if there is no objection.

 HTable is not closed in AggregationClient
 -

 Key: HBASE-6369
 URL: https://issues.apache.org/jira/browse/HBASE-6369
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: binlijin
Assignee: binlijin
 Fix For: 0.92.2, 0.96.0, 0.94.2

 Attachments: HBASE-6369-0.92-2.patch, HBASE-6369-0.92.patch, 
 HBASE-6369-0.94-2.patch, HBASE-6369-0.94.patch, HBASE-6369-trunk-2.patch, 
 HBASE-6369-trunk.patch


 In AggregationClient, HTable instance is not closed.
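
 The usual remedy for a leak of this kind is to close the handle in a finally block. The sketch below only illustrates that pattern and is not the attached patch: FakeTable is a hypothetical stand-in for a table handle, so no HBase classes or signatures are assumed.

```java
// Sketch only: FakeTable is a hypothetical stand-in, not the HBase HTable class.
final class CloseTableSketch {

    /** Hypothetical table handle; it only records whether close() ran. */
    static final class FakeTable {
        final boolean failOnRead;
        boolean closed = false;

        FakeTable(boolean failOnRead) {
            this.failOnRead = failOnRead;
        }

        long rowCount() {
            if (failOnRead) {
                throw new IllegalStateException("simulated read failure");
            }
            return 42L;
        }

        void close() {
            closed = true;
        }
    }

    /** Runs the call and always closes the table, even if the call throws. */
    static long rowCount(FakeTable table) {
        try {
            return table.rowCount();
        } finally {
            table.close();
        }
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable(false);
        long rows = rowCount(table);
        System.out.println("rows=" + rows + ", closed=" + table.closed);
    }
}
```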





[jira] [Updated] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

2012-07-11 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6284:
--

Fix Version/s: (was: 0.94.2)
   0.94.1

Integrated to 0.94 branch as well.

Thanks for the review, Lars.

 Introduce HRegion#doMiniBatchMutation()
 ---

 Key: HBASE-6284
 URL: https://issues.apache.org/jira/browse/HBASE-6284
 Project: HBase
  Issue Type: Bug
  Components: performance, regionserver
Reporter: Zhihong Ted Yu
Assignee: Anoop Sam John
 Fix For: 0.96.0, 0.94.1

 Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, 
 HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, 
 HBASE-6284_Trunk.patch


 From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
 The HTable#delete(List<Delete>) groups the Deletes for the same RS and makes 
 only one network call. But within the RS, there will be N delete calls 
 on the region, one by one. This will include N HLog writes and syncs. 
 If these can also be grouped, we can get better performance for multi-row 
 deletes.
 I have made the new miniBatchDelete() and made 
 HTable#delete(List<Delete>) call this new batch delete.
 I have only tested initially with a one-node cluster. Even there I am getting 
 a very promising performance boost.
 Only one CF and qualifier.
 10K total rows deleted with a batch of 100 deletes. Only deletes happening on 
 the table, from one thread.
 With the new way the net time taken is reduced by more than 1/10.
 Will test in a 4 node cluster also. I think it will be worth doing this change.
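
 The saving described above comes from paying for one log sync per batch instead of one per row. The following sketch only counts syncs to show that effect. CountingLog and both delete methods are hypothetical names; this is not the HRegion implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: hypothetical classes that count log syncs, not HBase code.
final class MiniBatchSketch {

    /** Hypothetical write-ahead log that records entries and counts sync() calls. */
    static final class CountingLog {
        final List<String> entries = new ArrayList<String>();
        int syncs = 0;

        void append(String row) {
            entries.add(row);
        }

        void sync() {
            syncs++;
        }
    }

    /** One log append and one sync per row, as described for the existing path. */
    static void deleteOneByOne(List<String> rows, CountingLog log) {
        for (String row : rows) {
            log.append(row);
            log.sync();
        }
    }

    /** All rows of the batch are appended first, then synced once. */
    static void deleteAsMiniBatch(List<String> rows, CountingLog log) {
        for (String row : rows) {
            log.append(row);
        }
        if (!rows.isEmpty()) {
            log.sync();
        }
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<String>();
        for (int i = 0; i < 100; i++) {
            rows.add("row-" + i);
        }
        CountingLog single = new CountingLog();
        CountingLog batched = new CountingLog();
        deleteOneByOne(rows, single);
        deleteAsMiniBatch(rows, batched);
        System.out.println("syncs one by one: " + single.syncs);
        System.out.println("syncs mini batch: " + batched.syncs);
    }
}
```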





[jira] [Commented] (HBASE-6369) HTable is not closed in AggregationClient

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13411494#comment-13411494
 ] 

Zhihong Ted Yu commented on HBASE-6369:
---

Integrated to trunk.

Thanks for the patch, binlijin.

Thanks for the review, Stack.






[jira] [Commented] (HBASE-5151) Rename hbase.skip.errors in HRegion as it is too general-sounding.

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13411570#comment-13411570
 ] 

Zhihong Ted Yu commented on HBASE-5151:
---

In trunk build 3118:
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) 
on project hbase-server: Compilation failure: Compilation failure:
[ERROR] 
/home/jenkins/jenkins-slave/workspace/HBase-TRUNK/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:[2854,63]
 ')' expected
[ERROR] 
[ERROR] 
/home/jenkins/jenkins-slave/workspace/HBase-TRUNK/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:[2855,19]
 not a statement
[ERROR] 
[ERROR] 
/home/jenkins/jenkins-slave/workspace/HBase-TRUNK/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:[2855,22]
 ';' expected
[ERROR] 
[ERROR] 
/home/jenkins/jenkins-slave/workspace/HBase-TRUNK/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:[2855,24]
 not a statement
[ERROR] 
[ERROR] 
/home/jenkins/jenkins-slave/workspace/HBase-TRUNK/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:[2855,25]
 ';' expected
{code}

 Rename hbase.skip.errors in HRegion as it is too general-sounding.
 

 Key: HBASE-5151
 URL: https://issues.apache.org/jira/browse/HBASE-5151
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 0.94.0
Reporter: Harsh J
Assignee: Harsh J
 Fix For: 0.96.0

 Attachments: HBASE-5151.patch, HBASE-5151.patch


 We should rename hbase.skip.errors, used in HRegion.java for skipping 
 errors when replaying edits. It should probably be something more like 
 hbase.hregion.edits.replay.skip.errors or so.
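
 One way a rename like this could stay compatible with existing configurations is to read the new key first and fall back to the old one. The sketch below assumes exactly that. The fallback behaviour, the class name, and the use of a plain Map in place of the HBase configuration object are all assumptions for illustration, and the new key name is only the one suggested above.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a plain Map stands in for the configuration; the fallback is an assumption.
final class RenamedKeySketch {
    static final String OLD_KEY = "hbase.skip.errors";
    static final String NEW_KEY = "hbase.hregion.edits.replay.skip.errors";

    /** Prefers the new key, falls back to the old key, then to the default. */
    static boolean skipErrors(Map<String, String> conf, boolean defaultValue) {
        String value = conf.get(NEW_KEY);
        if (value == null) {
            value = conf.get(OLD_KEY);
        }
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<String, String>();
        conf.put(OLD_KEY, "true");
        System.out.println("skip errors: " + skipErrors(conf, false));
    }
}
```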





[jira] [Commented] (HBASE-5151) Rename hbase.skip.errors in HRegion as it is too general-sounding.

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13411584#comment-13411584
 ] 

Zhihong Ted Yu commented on HBASE-5151:
---

@Harsh:
In the future, please version patches with a number.
Now we have three attachments with the same name.

 Rename hbase.skip.errors in HRegion as it is too general-sounding.
 

 Key: HBASE-5151
 URL: https://issues.apache.org/jira/browse/HBASE-5151
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 0.94.0
Reporter: Harsh J
Assignee: Harsh J
 Fix For: 0.96.0

 Attachments: HBASE-5151.amend.patch, HBASE-5151.patch, 
 HBASE-5151.patch, HBASE-5151.patch


 We should rename hbase.skip.errors, used in HRegion.java for skipping 
 errors when replaying edits. It should probably be something more like 
 hbase.hregion.edits.replay.skip.errors or so.




