[jira] [Commented] (HBASE-4773) HBaseAdmin leaks ZooKeeper connections

2011-11-25 Thread xufeng (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157024#comment-13157024
 ] 

xufeng commented on HBASE-4773:
---

Yes, I have tested it in my cluster.

Here is my client test code:
{noformat}
.
  static void initHBase() throws ZooKeeperConnectionException
  {
    HBaseAdmin hbaseAdmin = null;
    Configuration config = HBaseConfiguration.create();
    config.set("hbase.zookeeper.quorum",
        "158.1.130.31,158.1.130.32,158.1.130.33");
    config.set("hbase.zookeeper.property.clientPort", "2181");

    try {
      hbaseAdmin = new HBaseAdmin(config);
      System.out.println("init success!");
    } catch (MasterNotRunningException e) {
      e.printStackTrace();
      // Retry until the master is reachable.
      initHBase();
    } catch (ZooKeeperConnectionException e) {
      e.printStackTrace();
      // Retry until ZooKeeper is reachable.
      initHBase();
    }
  }
}
.
{noformat}

I did not start the HBase processes in my cluster.

After running the test, the output of the lsof command line is:
{noformat}
java  16735  root   72w  REG   253,3  890569  524379 /opt/xf/hadoop.log
java  16735  root   73w  REG   253,3  274338  524376 /opt/xf/HA_hadoop.log
java  16735  root   74r  FIFO  0,8  0t0  110645029 pipe
java  16735  root   75w  FIFO  0,8  0t0  110645029 pipe
java  16735  root   76u  0,90  21 anon_inode
java  16735  root   77u  IPv6  110645030  0t0  TCP C3S31:35186->C3S33:eforward (ESTABLISHED)
java  16735  root   78u  unix  0x8800cba90380  0t0  110645035 socket
java  16735  root   79u  sock  0,6  0t0  110645032 can't identify protocol
java  16735  root   80r  FIFO  0,8  0t0  110645037 pipe
java  16735  root   81w  FIFO  0,8  0t0  110645037 pipe
java  16735  root   82u  0,90  21 anon_inode
java  16735  root   83u  IPv6  110645038  0t0  TCP C3S31:53727->C3S31:eforward (ESTABLISHED)
java  16735  root   84r  FIFO  0,8  0t0  110645043 pipe
java  16735  root   85w  FIFO  0,8  0t0  110645043 pipe
java  16735  root   86u  0,90  21 anon_inode
java  16735  root   87u  IPv6  110645044  0t0  TCP C3S31:53728->C3S31:eforward (ESTABLISHED)
java  16735  root   88r  FIFO  0,8  0t0  110645047 pipe
java  16735  root   89w  FIFO  0,8  0t0  110645047 pipe
java  16735  root   90u  0,90  21 anon_inode
java  16735  root   91u  IPv6  110645048  0t0  TCP C3S31:47183->C3S32:eforward (ESTABLISHED)
java  16735  root   92r  FIFO  0,8  0t0  110645050 pipe
java  16735  root   93w  FIFO  0,8  0t0  110645050 pipe
java  16735  root   94u  0,90  21 anon_inode
java  16735  root   95u  IPv6  110645051  0t0  TCP C3S31:53730->C3S31:eforward (ESTABLISHED)
java  16735  root   96r  FIFO  0,8  0t0  110645135 pipe
java  16735  root   97w  FIFO  0,8  0t0  110645135 pipe
java  16735  root   98u  0,90  21 anon_inode
java  16735  root   99u  IPv6  110645136  0t0  TCP C3S31:49799->C3S31:eforward (ESTABLISHED)
java  16735  root  100r  FIFO  0,8  0t0  110645143 pipe
java  16735  root  101w  FIFO  0,8  0t0  110645143 pipe
java  16735  root  102u  0,90  21 anon_inode
java  16735  root  103u  IPv6  110645144  0t0  TCP C3S31:38931->C3S32:eforward (ESTABLISHED)
java  16735  root  104r  FIFO  0,8  0t0  110645148 pipe
java  16735  root  105w  FIFO  0,8  0t0  110645148 pipe
java  16735  root  106u  0,90  21 anon_inode
java  16735  root  107u  IPv6  110645149  0t0  TCP C3S31:59939->C3S33:eforward (ESTABLISHED)
java  16735  root  108r  FIFO  0,8  0t0  110645507 pipe
java  16735  root  109w  FIFO  0,8  0t0  110645507 pipe
java  16735  root  110u  0,90  21 anon_inode
java  16735  root  111u  IPv6  110645508  0t0  TCP C3S31:59940->C3S33:eforward (ESTABLISHED)
{noformat}

The [eforward] is p

[jira] [Created] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread gaojinchao (Created) (JIRA)
testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
-

 Key: HBASE-4868
 URL: https://issues.apache.org/jira/browse/HBASE-4868
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.92.0
Reporter: gaojinchao
Priority: Minor
 Fix For: 0.94.0


See: 
https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
Please review and check whether the approach makes sense. 
If it does, I will check the other cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread gaojinchao (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4868:
--

Attachment: HBASE-4868_trial.patch

> testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.0
>Reporter: gaojinchao
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: HBASE-4868_trial.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and check whether the approach makes sense. 
> If it does, I will check the other cases.





[jira] [Updated] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4868:
--

Status: Patch Available  (was: Open)





[jira] [Commented] (HBASE-4864) TestMasterObserver#testRegionTransitionOperations occasionally fails

2011-11-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157066#comment-13157066
 ] 

Hudson commented on HBASE-4864:
---

Integrated in HBase-TRUNK-security #9 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/9/])
HBASE-4864  TestMasterObserver#testRegionTransitionOperations occasionally
   fails (Gao Jinchao)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java


> TestMasterObserver#testRegionTransitionOperations occasionally fails
> 
>
> Key: HBASE-4864
> URL: https://issues.apache.org/jira/browse/HBASE-4864
> Project: HBase
>  Issue Type: Test
>  Components: test
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4864_Branch92.patch
>
>
> See these logs:
> https://builds.apache.org/job/HBase-TRUNK-security/ws/trunk/target/surefire-reports/
> It seems that we should wait until the region is added to the online region set.
> I made a patch. Please review.
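
The idea of waiting until the region shows up in the online region set can be sketched as a generic polling helper. This is only an illustration in plain Java and not the code from the attached patch: `WaitUtil`, `waitFor`, and the `regionOnline` flag are hypothetical names, and the flag merely stands in for the real "region is online" check.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class WaitUtil {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false on
     * timeout or if the waiting thread was interrupted.
     */
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical stand-in for "the region has been added to the online region set".
        AtomicBoolean regionOnline = new AtomicBoolean(false);
        Thread opener = new Thread(() -> {
            try {
                Thread.sleep(200); // simulate the asynchronous region open
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
            regionOnline.set(true);
        });
        opener.start();

        boolean ok = waitFor(5000, 20, regionOnline::get);
        opener.join();
        System.out.println("region online: " + ok);
    }
}
```

A test would call such a helper before asserting on region state, instead of assuming the region is already online when the assertion runs.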





[jira] [Commented] (HBASE-4856) Upgrade zookeeper to 3.4.0 release

2011-11-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157067#comment-13157067
 ] 

Hudson commented on HBASE-4856:
---

Integrated in HBase-TRUNK-security #9 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/9/])
HBASE-4856  Upgrade zookeeper to 3.4.0 release

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/pom.xml


> Upgrade zookeeper to 3.4.0 release
> --
>
> Key: HBASE-4856
> URL: https://issues.apache.org/jira/browse/HBASE-4856
> Project: HBase
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.92.0
>
> Attachments: 4856.txt
>
>
> ZooKeeper 3.4.0 has been released.
> We should upgrade.





[jira] [Commented] (HBASE-4855) SplitLogManager hangs on cluster restart due to batch.installed doubly counted

2011-11-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157068#comment-13157068
 ] 

Hudson commented on HBASE-4855:
---

Integrated in HBase-TRUNK-security #9 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/9/])
HBASE-4855  SplitLogManager hangs on cluster restart due to batch.installed 
doubly counted

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java


> SplitLogManager hangs on cluster restart due to batch.installed doubly counted
> --
>
> Key: HBASE-4855
> URL: https://issues.apache.org/jira/browse/HBASE-4855
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.92.0
>
> Attachments: HBASE-4855.patch
>
>
> Start a master and an RS.
> The RS goes down (kill -9).
> Wait for ServerShutDownHandler to create the splitlog nodes. As no RS is 
> there, they cannot be processed.
> Restart the master and bring up an RS.
> The master hangs in SplitLogManager.waitforTasks().
> I feel that batch.done is not getting incremented properly. I have not dug 
> into it fully yet.
> This may be the reason for the occasional failure of 
> TestDistributedLogSplitting.testWorkerAbort(). 
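
Why a doubly counted `installed` value would make a waiter hang can be shown with a toy model. The sketch below is not SplitLogManager code: `TaskBatchSketch` and `isComplete` are hypothetical, and it assumes the waiter returns only once `done + error` equals `installed`.

```java
class TaskBatchSketch {
    int installed; // tasks submitted for this batch
    int done;      // tasks that finished successfully
    int error;     // tasks that failed

    /** Assumed completion rule: every installed task must be accounted for. */
    boolean isComplete() {
        return done + error == installed;
    }

    public static void main(String[] args) {
        TaskBatchSketch batch = new TaskBatchSketch();
        batch.installed++; // the task is counted when it is first enqueued
        batch.installed++; // bug being modelled: the same task is counted again
        batch.done++;      // the single real task completes

        // With installed == 2 and done == 1 the batch never looks complete,
        // so a loop waiting on isComplete() would wait forever.
        System.out.println("complete? " + batch.isComplete());
    }
}
```

Under that assumption the symptom looks like `done` never catching up, even though the extra increment of `installed` is the actual fault.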





[jira] [Commented] (HBASE-4773) HBaseAdmin leaks ZooKeeper connections

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157069#comment-13157069
 ] 

Ted Yu commented on HBASE-4773:
---

+1 on patch. 

Can you make a patch for trunk?

> HBaseAdmin leaks ZooKeeper connections
> --
>
> Key: HBASE-4773
> URL: https://issues.apache.org/jira/browse/HBASE-4773
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 4773.patch
>
>
> When the master crashes, HBaseAdmin leaks ZooKeeper connections.
> I think we should close the zk connection when MasterNotRunningException is thrown:
>   public HBaseAdmin(Configuration c)
>       throws MasterNotRunningException, ZooKeeperConnectionException {
>     this.conf = HBaseConfiguration.create(c);
>     this.connection = HConnectionManager.getConnection(this.conf);
>     this.pause = this.conf.getLong("hbase.client.pause", 1000);
>     this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
>     this.retryLongerMultiplier =
>         this.conf.getInt("hbase.client.retries.longer.multiplier", 10);
>     // We should add this code to close the zk connection.
>     try {
>       this.connection.getMaster();
>     } catch (MasterNotRunningException e) {
>       HConnectionManager.deleteConnection(conf, false);
>       throw e;
>     }
>   }
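
The general pattern behind the proposed change (release a resource acquired in a constructor when a later step of that constructor throws) can be sketched without any HBase classes. Everything below is hypothetical: `FakeConnection` and `FakeAdmin` are stand-ins invented for illustration, and `IllegalStateException` takes the place of MasterNotRunningException.

```java
class ConstructorCleanupSketch {
    /** Hypothetical stand-in for a connection that holds a ZooKeeper session. */
    static class FakeConnection {
        static int open = 0; // number of connections currently held
        final boolean masterRunning;

        FakeConnection(boolean masterRunning) {
            this.masterRunning = masterRunning;
            open++;
        }

        void getMaster() {
            if (!masterRunning) {
                throw new IllegalStateException("master not running");
            }
        }

        void close() {
            open--;
        }
    }

    /** Hypothetical stand-in for the admin client. */
    static class FakeAdmin {
        final FakeConnection connection;

        FakeAdmin(boolean masterRunning) {
            this.connection = new FakeConnection(masterRunning);
            try {
                this.connection.getMaster();
            } catch (IllegalStateException e) {
                // The caller never receives this object, so nobody else can
                // release the connection: close it here before rethrowing.
                this.connection.close();
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            try {
                new FakeAdmin(false);
            } catch (IllegalStateException expected) {
                // retry, in the spirit of the client test code in this thread
            }
        }
        System.out.println("connections still open: " + FakeConnection.open);
    }
}
```

Without the `close()` call in the catch block, each failed construction in the loop would leave one more connection open, which is the kind of growth a retrying client would see.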





[jira] [Updated] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4863:
--

Status: Open  (was: Patch Available)

> Make HBase Thrift server more configurable and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, D531.1.patch, 
> D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix when we found out that the Thrift server 
> had spawned 15000 threads. To bound the thread pool size, I added a custom 
> thread pool server implementation called HBaseThreadPoolServer to the HBase 
> codebase and made the following parameters configurable both from the command 
> line and as config settings: minWorkerThreads, maxWorkerThreads, and 
> maxQueuedRequests. Under increasing load, the server creates a new thread for 
> every connection until the pool size reaches minWorkerThreads. After that, the 
> server puts new connections into the queue and only creates a new thread when 
> the queue is full. If an attempt to create a new thread fails, the server 
> drops the connection. The default TThreadPoolServer would crash in that case, 
> but that never happened because the thread pool was unbounded, so the server 
> would hang indefinitely, consume a lot of memory, and cause huge latency 
> spikes on the client side.
> Another part of this fix is refactoring and unit testing the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.
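
The thread-creation policy described above matches how the JDK's `java.util.concurrent.ThreadPoolExecutor` behaves with a bounded queue, so it can be sketched with that class. This is not the implementation from the patch: `BoundedPoolSketch` and its methods are hypothetical, the 60-second keep-alive is an arbitrary choice, and a rejected task stands in for a dropped connection.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolSketch {
    /** Builds a pool bounded by the three settings named in the description. */
    static ThreadPoolExecutor newPool(int minWorkerThreads, int maxWorkerThreads,
                                      int maxQueuedRequests) {
        return new ThreadPoolExecutor(
            minWorkerThreads, maxWorkerThreads,
            60, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>(maxQueuedRequests));
    }

    /** Submits `requests` tasks that block until released; returns how many were rejected. */
    static int submitAndCountRejections(ThreadPoolExecutor pool, int requests) {
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy, like an open connection
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // all threads busy and queue full: "drop the connection"
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        // 2 core threads, then 3 queued requests, then 2 more threads up to the maximum of 4.
        ThreadPoolExecutor pool = newPool(2, 4, 3);
        System.out.println("rejected: " + submitAndCountRejections(pool, 10));
    }
}
```

With these illustrative numbers the pool can hold 4 running plus 3 queued requests, so 3 of the 10 submissions are rejected instead of spawning an unbounded number of threads.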





[jira] [Updated] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4863:
---

Attachment: D531.4.patch

mbautin updated the revision "[jira] [HBASE-4863] Make HBase Thrift server more 
configurable and add a command-line UI test".
Reviewers: JIRA, Kannan, tedyu, stack

  Fixing a bug in TestThreads. Cluster testing is going well. I will kick off a 
unit test run on Jenkins.

REVISION DETAIL
  https://reviews.facebook.net/D531

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
  src/main/java/org/apache/hadoop/hbase/util/Threads.java
  src/main/resources/hbase-default.xml
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServerCmdLine.java
  src/test/java/org/apache/hadoop/hbase/util/TestThreads.java






[jira] [Updated] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4863:
--

Attachment: 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch





[jira] [Updated] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4863:
--

Status: Patch Available  (was: Open)





[jira] [Updated] (HBASE-4867) A tool to merge configuration files

2011-11-25 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4867:
---

Attachment: D537.1.patch

mbautin requested code review of "[jira] [HBASE-4867] A tool to merge 
configuration files".
Reviewers: todd, Karthik, tedyu, stack, JIRA

  With our cluster configuration setup it would be good to have a tool that 
would merge HBase configuration files so that files appearing later in the list 
would override properties specified in earlier files. This way we could merge 
an application-specific configuration file with a cluster-specific 
configuration file (with the latter overriding the former) and produce a single 
HBase configuration file to install on the cluster.
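
The override rule (later files win) can be sketched on already-parsed properties. The tool under review is a Python script; the Java below is only an illustration of the merge semantics, with a hypothetical `ConfMergeSketch` class, no XML parsing, and made-up property values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ConfMergeSketch {
    /** Merges property maps in list order; later maps override earlier ones. */
    static Map<String, String> merge(List<Map<String, String>> configs) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> conf : configs) {
            merged.putAll(conf); // an existing key keeps its position but takes the new value
        }
        return merged;
    }

    public static void main(String[] args) {
        // Application-specific settings (illustrative values).
        Map<String, String> app = new LinkedHashMap<>();
        app.put("hbase.client.pause", "1000");
        app.put("hbase.client.retries.number", "10");

        // Cluster-specific settings, listed later so they take precedence.
        Map<String, String> cluster = new LinkedHashMap<>();
        cluster.put("hbase.client.retries.number", "20");
        cluster.put("hbase.zookeeper.property.clientPort", "2181");

        System.out.println(merge(Arrays.asList(app, cluster)));
    }
}
```

Here the cluster-specific value for `hbase.client.retries.number` replaces the application-specific one, while properties that appear in only one input are carried over unchanged.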

TEST PLAN
  Run the tool on two configuration files (common and cluster-specific). Use 
the resulting configuration on a dev cluster.

REVISION DETAIL
  https://reviews.facebook.net/D537

AFFECTED FILES
  src/main/python/hbase/merge_conf.py



> A tool to merge configuration files
> ---
>
> Key: HBASE-4867
> URL: https://issues.apache.org/jira/browse/HBASE-4867
> Project: HBase
>  Issue Type: New Feature
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D537.1.patch
>
>
> With our cluster configuration setup it would be good to have a tool that 
> would merge HBase configuration files so that files appearing later in the 
> list override properties specified in earlier files. This way we could merge 
> an application-specific configuration file with a cluster-specific 
> configuration file (with the latter overriding the former) and produce a 
> single HBase configuration file to install on the cluster.





[jira] [Commented] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157126#comment-13157126
 ] 

Hadoop QA commented on HBASE-4863:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12505097/D531.4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 10 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 67 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.client.TestInstantSchemaChange

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/366//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/366//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/366//console

This message is automatically generated.





[jira] [Commented] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157133#comment-13157133
 ] 

Hadoop QA commented on HBASE-4868:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12505085/HBASE-4868_trial.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 66 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestInstantSchemaChange
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/365//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/365//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/365//console

This message is automatically generated.





[jira] [Commented] (HBASE-4773) HBaseAdmin leaks ZooKeeper connections

2011-11-25 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157183#comment-13157183
 ] 

ramkrishna.s.vasudevan commented on HBASE-4773:
---

+1 on patch

> HBaseAdmin leaks ZooKeeper connections
> --
>
> Key: HBASE-4773
> URL: https://issues.apache.org/jira/browse/HBASE-4773
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 4773.patch
>
>
> When the master crashes, HBaseAdmin leaks ZooKeeper connections.
> I think we should close the ZK connection when MasterNotRunningException is thrown:
>   public HBaseAdmin(Configuration c)
>       throws MasterNotRunningException, ZooKeeperConnectionException {
>     this.conf = HBaseConfiguration.create(c);
>     this.connection = HConnectionManager.getConnection(this.conf);
>     this.pause = this.conf.getLong("hbase.client.pause", 1000);
>     this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
>     this.retryLongerMultiplier =
>         this.conf.getInt("hbase.client.retries.longer.multiplier", 10);
>     // we should add this code to close the zk connection
>     try {
>       this.connection.getMaster();
>     } catch (MasterNotRunningException e) {
>       HConnectionManager.deleteConnection(conf, false);
>       throw e;
>     }
>   }





[jira] [Issue Comment Edited] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Ted Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156821#comment-13156821
 ] 

Ted Yu edited comment on HBASE-4863 at 11/25/11 3:21 PM:
-

I got compilation _error_:
{code}
testRunThriftServer[0](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine)  
Time elapsed: 2.047 sec  <<< ERROR!
java.lang.Error: Unresolved compilation problem:
  Cannot make a static reference to the non-static method 
getColumnDescriptors() from the type TestThriftServer

  at 
org.apache.hadoop.hbase.thrift.TestThriftServer.createDropTable(TestThriftServer.java:111)
{code}

Since HBaseThreadPoolServer extends TServer, I think a better name for the 
class would be TBoundedThreadPoolServer (TThreadPoolServer is in thrift).
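
For reference, the compiler message above can be reproduced and fixed in isolation. The sketch below is not the actual TestThriftServer code: the class name StaticReferenceSketch and both method bodies are hypothetical, and only the method name getColumnDescriptors() comes from the error message.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in class; only the method name getColumnDescriptors()
// is taken from the error message, the bodies are invented for illustration.
class StaticReferenceSketch {

    // Instance (non-static) method: it needs an object to be called on.
    List<String> getColumnDescriptors() {
        return Arrays.asList("columnA:", "columnB:");
    }

    // A static method has no implicit "this", so the following does not compile:
    //   static int broken() { return getColumnDescriptors().size(); }

    // Option 1: call the instance method through an explicit object.
    static int countViaInstance(StaticReferenceSketch test) {
        return test.getColumnDescriptors().size();
    }

    // Option 2: make the helper static when it uses no instance state.
    static List<String> getColumnDescriptorsStatic() {
        return Arrays.asList("columnA:", "columnB:");
    }

    public static void main(String[] args) {
        System.out.println(countViaInstance(new StaticReferenceSketch()));
        System.out.println(getColumnDescriptorsStatic().size());
    }
}
```

Which option fits depends on whether the helper needs instance state; that cannot be determined from the stack trace alone.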

  was (Author: yuzhih...@gmail.com):
I got compilation [error]:
{code}
testRunThriftServer[0](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine)  
Time elapsed: 2.047 sec  <<< ERROR!
java.lang.Error: Unresolved compilation problem:
  Cannot make a static reference to the non-static method 
getColumnDescriptors() from the type TestThriftServer

  at 
org.apache.hadoop.hbase.thrift.TestThriftServer.createDropTable(TestThriftServer.java:111)
{code}

Since HBaseThreadPoolServer extends TServer, I think a better name for the 
class would be TBoundedThreadPoolServer (TThreadPoolServer is in thrift).
  
> Make HBase Thrift server more configurable and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, D531.1.patch, 
> D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer into HBase codebase, 
> and made the following parameters configurable from both command line and as 
> config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. 
> Under an increasing load, the server creates new threads for every connection 
> before the pool size reaches minWorkerThreads. After that, the server puts 
> new connections into the queue and only creates a new thread when the queue 
> is full. If an attempt to create a new thread fails, the server drops 
> connection. The default TThreadPoolServer would crash in that case, but it 
> never happened because the thread pool was unbounded, so the server would 
> hang indefinitely, consume a lot of memory, and cause huge latency spikes on 
> the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.
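
The queueing policy described above (grow to minWorkerThreads, then queue, then grow to maxWorkerThreads only when the queue is full, then drop) matches how the JDK's java.util.concurrent.ThreadPoolExecutor behaves with a bounded queue. The following is a minimal sketch under that assumption; it is not the HBaseThreadPoolServer implementation from the patch, and the class BoundedPoolSketch and its simulate method are hypothetical. Here "dropping a connection" is modeled as a RejectedExecutionException, which the executor raises once both the queue and the thread limit are exhausted.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the bounded-pool policy using the JDK executor.
class BoundedPoolSketch {

    // Submits `connections` tasks that all block until released, then reports
    // {threads created, requests queued, connections dropped}.
    static int[] simulate(int minWorkerThreads, int maxWorkerThreads,
                          int maxQueuedRequests, int connections) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            minWorkerThreads, maxWorkerThreads, 60L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<Runnable>(maxQueuedRequests));
        CountDownLatch release = new CountDownLatch(1);
        int dropped = 0;
        for (int i = 0; i < connections; i++) {
            try {
                // Each task stands in for a long-lived client connection.
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                // Queue is full and the pool is at maxWorkerThreads: drop it.
                dropped++;
            }
        }
        int[] result = { pool.getPoolSize(), pool.getQueue().size(), dropped };
        release.countDown();   // let the blocked tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        // 2 core threads, at most 4 threads, queue of 3, 10 incoming connections.
        System.out.println(Arrays.toString(simulate(2, 4, 3, 10)));
    }
}
```

With these numbers the first two connections get core threads, the next three are queued, two more threads are created once the queue is full, and the remaining three connections are rejected.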





[jira] [Issue Comment Edited] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Ted Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156821#comment-13156821
 ] 

Ted Yu edited comment on HBASE-4863 at 11/25/11 3:20 PM:
-

I got compilation [error]:
{code}
testRunThriftServer[0](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine)  
Time elapsed: 2.047 sec  <<< ERROR!
java.lang.Error: Unresolved compilation problem:
  Cannot make a static reference to the non-static method 
getColumnDescriptors() from the type TestThriftServer

  at 
org.apache.hadoop.hbase.thrift.TestThriftServer.createDropTable(TestThriftServer.java:111)
{code}

Since HBaseThreadPoolServer extends TServer, I think a better name for the 
class would be TBoundedThreadPoolServer (TThreadPoolServer is in thrift).

  was (Author: yuzhih...@gmail.com):
I got compilation error:
{code}
testRunThriftServer[0](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine)  
Time elapsed: 2.047 sec  <<< ERROR!
java.lang.Error: Unresolved compilation problem:
  Cannot make a static reference to the non-static method 
getColumnDescriptors() from the type TestThriftServer

  at 
org.apache.hadoop.hbase.thrift.TestThriftServer.createDropTable(TestThriftServer.java:111)
{code}

Since HBaseThreadPoolServer extends TServer, I think a better name for the 
class would be TBoundedThreadPoolServer (TThreadPoolServer is in thrift).
  
> Make HBase Thrift server more configurable and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, D531.1.patch, 
> D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer into HBase codebase, 
> and made the following parameters configurable from both command line and as 
> config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. 
> Under an increasing load, the server creates new threads for every connection 
> before the pool size reaches minWorkerThreads. After that, the server puts 
> new connections into the queue and only creates a new thread when the queue 
> is full. If an attempt to create a new thread fails, the server drops 
> connection. The default TThreadPoolServer would crash in that case, but it 
> never happened because the thread pool was unbounded, so the server would 
> hang indefinitely, consume a lot of memory, and cause huge latency spikes on 
> the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.





[jira] [Issue Comment Edited] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Ted Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156821#comment-13156821
 ] 

Ted Yu edited comment on HBASE-4863 at 11/25/11 3:22 PM:
-

I got _compilation error_:
{code}
testRunThriftServer[0](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine)  
Time elapsed: 2.047 sec  <<< ERROR!
java.lang.Error: Unresolved compilation problem:
  Cannot make a static reference to the non-static method 
getColumnDescriptors() from the type TestThriftServer

  at 
org.apache.hadoop.hbase.thrift.TestThriftServer.createDropTable(TestThriftServer.java:111)
{code}

Since HBaseThreadPoolServer extends TServer, I think a better name for the 
class would be TBoundedThreadPoolServer (TThreadPoolServer is in thrift).

  was (Author: yuzhih...@gmail.com):
I got compilation _error_:
{code}
testRunThriftServer[0](org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine)  
Time elapsed: 2.047 sec  <<< ERROR!
java.lang.Error: Unresolved compilation problem:
  Cannot make a static reference to the non-static method 
getColumnDescriptors() from the type TestThriftServer

  at 
org.apache.hadoop.hbase.thrift.TestThriftServer.createDropTable(TestThriftServer.java:111)
{code}

Since HBaseThreadPoolServer extends TServer, I think a better name for the 
class would be TBoundedThreadPoolServer (TThreadPoolServer is in thrift).
  
> Make HBase Thrift server more configurable and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, D531.1.patch, 
> D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer into HBase codebase, 
> and made the following parameters configurable from both command line and as 
> config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. 
> Under an increasing load, the server creates new threads for every connection 
> before the pool size reaches minWorkerThreads. After that, the server puts 
> new connections into the queue and only creates a new thread when the queue 
> is full. If an attempt to create a new thread fails, the server drops 
> connection. The default TThreadPoolServer would crash in that case, but it 
> never happened because the thread pool was unbounded, so the server would 
> hang indefinitely, consume a lot of memory, and cause huge latency spikes on 
> the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.





[jira] [Updated] (HBASE-4863) Make HBase Thrift server more configurable and add a command-line UI test

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4863:
--

Fix Version/s: 0.94.0

+1 on patch v2.
The test failures reported by HadoopQA weren't related to the patch.

> Make HBase Thrift server more configurable and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, D531.1.patch, 
> D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer into HBase codebase, 
> and made the following parameters configurable from both command line and as 
> config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. 
> Under an increasing load, the server creates new threads for every connection 
> before the pool size reaches minWorkerThreads. After that, the server puts 
> new connections into the queue and only creates a new thread when the queue 
> is full. If an attempt to create a new thread fails, the server drops 
> connection. The default TThreadPoolServer would crash in that case, but it 
> never happened because the thread pool was unbounded, so the server would 
> hang indefinitely, consume a lot of memory, and cause huge latency spikes on 
> the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.





[jira] [Commented] (HBASE-4773) HBaseAdmin leaks ZooKeeper connections

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157198#comment-13157198
 ] 

Ted Yu commented on HBASE-4773:
---

In TRUNK, we retry connecting to master several times:
{code}
  } catch (MasterNotRunningException mnre) {
    HConnectionManager.deleteStaleConnection(this.connection);
    this.connection = HConnectionManager.getConnection(this.conf);
{code}

@Xufeng:
Can you implement a similar retry loop for 0.90?

Thanks
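
A rough sketch of the shape such a retry loop could take follows. It deliberately avoids the HBase API: Conn, MasterDownException, connectWithRetries and demo are all hypothetical stand-ins (for the connection, MasterNotRunningException, and the constructor logic), so this only illustrates the idea of releasing the stale connection before each retry so that failed attempts do not leak.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-ins only; none of these types are HBase classes.
class RetryWithCleanupSketch {

    // Stand-in for MasterNotRunningException.
    static class MasterDownException extends Exception {
        MasterDownException(String message) { super(message); }
    }

    // Stand-in for a connection that owns a ZooKeeper session.
    interface Conn {
        void checkMaster() throws MasterDownException;
        void close();
    }

    // Tries up to numRetries times; every failed attempt closes its
    // connection before the next one is opened, so nothing is leaked.
    static Conn connectWithRetries(Supplier<Conn> factory, int numRetries, long pauseMs)
            throws MasterDownException {
        if (numRetries < 1) {
            throw new IllegalArgumentException("numRetries must be >= 1");
        }
        MasterDownException last = null;
        for (int attempt = 0; attempt < numRetries; attempt++) {
            Conn conn = factory.get();
            try {
                conn.checkMaster();
                return conn;            // success: caller now owns the connection
            } catch (MasterDownException e) {
                conn.close();           // release the stale connection
                last = e;
            }
            try {
                Thread.sleep(pauseMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        throw last;
    }

    // Returns {connections opened, connections closed} when the fake master
    // first answers on attempt number upOnAttempt (1-based).
    static int[] demo(int upOnAttempt, int numRetries) {
        AtomicInteger opened = new AtomicInteger();
        AtomicInteger closed = new AtomicInteger();
        Supplier<Conn> factory = () -> {
            final int n = opened.incrementAndGet();
            return new Conn() {
                public void checkMaster() throws MasterDownException {
                    if (n < upOnAttempt) {
                        throw new MasterDownException("master not running, attempt " + n);
                    }
                }
                public void close() { closed.incrementAndGet(); }
            };
        };
        try {
            connectWithRetries(factory, numRetries, 0L);
        } catch (MasterDownException e) {
            // every attempt failed; all connections were closed above
        }
        return new int[] { opened.get(), closed.get() };
    }

    public static void main(String[] args) {
        int[] recovered = demo(3, 5);
        System.out.println(recovered[0] + " opened, " + recovered[1] + " closed");
    }
}
```

When the master never comes up, the number of closed connections equals the number opened, which is the property the 0.90 patch is after.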

> HBaseAdmin leaks ZooKeeper connections
> --
>
> Key: HBASE-4773
> URL: https://issues.apache.org/jira/browse/HBASE-4773
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 4773.patch
>
>
> When the master crashes, HBaseAdmin leaks ZooKeeper connections.
> I think we should close the ZK connection when MasterNotRunningException is thrown:
>   public HBaseAdmin(Configuration c)
>       throws MasterNotRunningException, ZooKeeperConnectionException {
>     this.conf = HBaseConfiguration.create(c);
>     this.connection = HConnectionManager.getConnection(this.conf);
>     this.pause = this.conf.getLong("hbase.client.pause", 1000);
>     this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
>     this.retryLongerMultiplier =
>         this.conf.getInt("hbase.client.retries.longer.multiplier", 10);
>     // we should add this code to close the zk connection
>     try {
>       this.connection.getMaster();
>     } catch (MasterNotRunningException e) {
>       HConnectionManager.deleteConnection(conf, false);
>       throw e;
>     }
>   }





[jira] [Updated] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4868:
--

Fix Version/s: 0.92.0

From 
https://builds.apache.org/job/PreCommit-HBASE-Build/365//testReport/org.apache.hadoop.hbase.replication/TestMultiSlaveReplication/testMultiSlaveReplication/,
I see:
{code}
2011-11-25 12:15:36,018 ERROR 
[org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@c7057c] 
datanode.DataXceiverServer(145): DatanodeRegistration(127.0.0.1:56231, 
storageID=DS-225434506-67.195.138.20-56231-133310312, infoPort=41654, 
ipcPort=53271):DataXceiveServer: Exiting due to:java.lang.OutOfMemoryError: 
unable to create new native thread
at java.lang.Thread.start0(Native Method)
{code}
The other two failed tests were due to 'Too many open files'.

> testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.0
>Reporter: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4868_trial.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and see whether the method makes sense. 
> If it does, I will check the other cases.





[jira] [Commented] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157206#comment-13157206
 ] 

Ted Yu commented on HBASE-4868:
---

+1 on patch.

> testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.0
>Reporter: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4868_trial.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and see whether the method makes sense. 
> If it does, I will check the other cases.





[jira] [Issue Comment Edited] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread Ted Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157206#comment-13157206
 ] 

Ted Yu edited comment on HBASE-4868 at 11/25/11 3:58 PM:
-

@Jinchao:
The inner ZooKeeperWatcher is similar to the one in 
HBaseTestingUtility.createAndForceNodeToOpenedState().

Can you refactor the inner class in HBaseTestingUtility so that we can reuse it 
in other unit tests?

Thanks

  was (Author: yuzhih...@gmail.com):
+1 on patch.
  
> testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.0
>Reporter: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4868_trial.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and see whether the method makes sense. 
> If it does, I will check the other cases.





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157250#comment-13157250
 ] 

Ted Yu commented on HBASE-4862:
---

Log snippets from Chunhui.
Region C was 3591e9867a4c125493dc82168854ea0c
File F was 13156791680

Master log:
{code}
2011-11-16 11:47:23,134 INFO org.apache.hadoop.hbase.master.ServerManager:
  Triggering server recovery; existingServer serverB,60020,1321415172631 looks 
stale
  2011-11-16 11:47:23,134 DEBUG org.apache.hadoop.hbase.master.ServerManager:
  Added=serverB,60020,1321415172631 to dead servers, submitted shutdown handler 
to be executed, root=false, meta=true

  2011-11-16 11:47:29,305 INFO org.apache.hadoop.hbase.master.ServerManager:
  Triggering server recovery; existingServer serverA,60020,1321415179549 looks 
stale
  2011-11-16 11:47:29,305 DEBUG org.apache.hadoop.hbase.master.ServerManager:
  Added=serverA,60020,1321415179549 to dead servers, submitted shutdown handler 
to be executed, root=false, meta=false

  2011-11-16 11:48:28,700 INFO 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
  Splitting 28 hlog(s) in 
hdfs://serverX:9000/hbase-common/.logs/serverB,60020,1321414043798

  2011-11-16 11:48:30,657 DEBUG 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
  Creating writer 
path=hdfs://serverX:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/13156800103
 region=3591e9867a4c125493dc82168854ea0c

  2011-11-16 11:49:17,855 INFO 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
  Closed path 
hdfs://serverX:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/13156800103
 (wrote 75875 edits in 3228ms)

  2011-11-16 11:49:19,629 INFO 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
  Splitting 28 hlog(s) in 
hdfs://serverX:9000/hbase-common/.logs/serverA,60020,1321414056134

  2011-11-16 11:49:20,650 DEBUG 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
  Creating writer 
path=hdfs://serverX:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/13156791680
 region=3591e9867a4c125493dc82168854ea0c

  2011-11-16 11:49:36,731 DEBUG 
org.apache.hadoop.hbase.master.AssignmentManager:
  Assigning region 
writetest1,19ILNKUHRKQ3BT0FLC9CMVWBP2ZPRV4W7XYA491BE6ZS2JE9132BO5GABIHNJHDU79TXBA4OOAP8OEIVTQ0PDHZB26QI5XHY17BK,1321267032810.3591e9867a4c125493dc82168854ea0c.
 to serverD,60020,1321415224381

  2011-11-16 11:49:49,755 DEBUG 
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region 
writetest1,19ILNKUHRKQ3BT0FLC9CMVWBP2ZPRV4W7XYA491BE6ZS2JE9132BO5GABIHNJHDU79TXBA4OOAP8OEIVTQ0PDHZB26QI5XHY17BK,1321267032810.3591e9867a4c125493dc82168854ea0c.
 on serverD,60020,1321415224381

  2011-11-16 11:50:13,030 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
Exception: org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/13156791680
 File does not exist.

  2011-11-16 11:50:13,037 FATAL 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-1 Got while 
writing log entry to log
  org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: 
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/13156791680
 File does not exist.

  2011-11-16 11:50:13,051 ERROR 
org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting 
hdfs://serverX:9000/hbase-common/.logs/serverA,60020,1321414056134
{code}
Log from region server D:
{code}
2011-11-16 11:49:36,730 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open 
region: 
writetest1,19ILNKUHRKQ3BT0FLC9CMVWBP2ZPRV4W7XYA491BE6ZS2JE9132BO5GABIHNJHDU79TXBA4OOAP8OEIVTQ0PDHZB26QI5XHY17BK,1321267032810.3591e9867a4c125493dc82168854ea0c.

2011-11-16 11:49:49,727 ERROR org.apache.hadoop.hbase.regionserver.HRegion:
Failed delete of 
hdfs://serverX:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/13156791680
 
2011-11-16 11:49:49,733 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Onlined 
writetest1,19ILNKUHRKQ3BT0FLC9CMVWBP2ZPRV4W7XYA491BE6ZS2JE9132BO5GABIHNJHDU79TXBA4OOAP8OEIVTQ0PDHZB26QI5XHY17BK,1321267032810.3591e9867a4c125493dc82168854ea0c.;
 next sequenceid=13160672878
{code}


> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
>

[jira] [Updated] (HBASE-4863) Make Thrift server thread pool bounded and add a command-line UI test

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4863:
--

Summary: Make Thrift server thread pool bounded and add a command-line UI 
test  (was: Make HBase Thrift server more configurable and add a command-line 
UI test)

> Make Thrift server thread pool bounded and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, D531.1.patch, 
> D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer into HBase codebase, 
> and made the following parameters configurable from both command line and as 
> config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. 
> Under an increasing load, the server creates new threads for every connection 
> before the pool size reaches minWorkerThreads. After that, the server puts 
> new connections into the queue and only creates a new thread when the queue 
> is full. If an attempt to create a new thread fails, the server drops 
> connection. The default TThreadPoolServer would crash in that case, but it 
> never happened because the thread pool was unbounded, so the server would 
> hang indefinitely, consume a lot of memory, and cause huge latency spikes on 
> the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.





[jira] [Commented] (HBASE-4863) Make Thrift server thread pool bounded and add a command-line UI test

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157257#comment-13157257
 ] 

Ted Yu commented on HBASE-4863:
---

Integrated to TRUNK.

Thanks for the patch Mikhail.

> Make Thrift server thread pool bounded and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, D531.1.patch, 
> D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer into HBase codebase, 
> and made the following parameters configurable from both command line and as 
> config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. 
> Under an increasing load, the server creates new threads for every connection 
> before the pool size reaches minWorkerThreads. After that, the server puts 
> new connections into the queue and only creates a new thread when the queue 
> is full. If an attempt to create a new thread fails, the server drops 
> connection. The default TThreadPoolServer would crash in that case, but it 
> never happened because the thread pool was unbounded, so the server would 
> hang indefinitely, consume a lot of memory, and cause huge latency spikes on 
> the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.
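
The thread-creation policy described above (new threads up to minWorkerThreads, then queueing, then extra threads only once the queue is full, then refusal) matches the documented behaviour of java.util.concurrent.ThreadPoolExecutor with a bounded queue. The following is a minimal sketch of that policy only. It is not the code from the attached patches, and the class name BoundedPoolSketch and its helper methods are made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of HBase or Thrift.
public class BoundedPoolSketch {

    // Builds a pool with the three knobs named in the issue description.
    // A bounded queue is what makes the pool grow past the minimum only
    // when the queue is full, and reject work once both limits are hit.
    static ThreadPoolExecutor newBoundedPool(int minWorkerThreads,
                                             int maxWorkerThreads,
                                             int maxQueuedRequests) {
        return new ThreadPoolExecutor(
                minWorkerThreads, maxWorkerThreads,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(maxQueuedRequests));
    }

    // Submits tasks that block on the latch until the pool refuses one,
    // and returns how many were accepted. Each task stands in for a
    // long-lived client connection.
    static int countAcceptedBeforeRejection(ThreadPoolExecutor pool,
                                            CountDownLatch release) {
        int accepted = 0;
        try {
            while (true) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // Pool is at its maximum size and the queue is full. A server
            // built on such a pool would drop the connection here.
        }
        return accepted;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = newBoundedPool(2, 4, 3);
        int accepted = countAcceptedBeforeRejection(pool, release);
        System.out.println("accepted=" + accepted
                + " threads=" + pool.getPoolSize()
                + " queued=" + pool.getQueue().size());
        release.countDown();          // let the blocked tasks finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With a minimum of 2 threads, a maximum of 4, and a queue of 3, seven blocking tasks are accepted (4 running, 3 queued) and the eighth is rejected.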

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4820) Distributed log splitting coding enhancement to make it easier to understand, no semantics change

2011-11-25 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4820:
---

Status: Open  (was: Patch Available)

> Distributed log splitting coding enhancement to make it easier to understand, 
> no semantics change
> -
>
> Key: HBASE-4820
> URL: https://issues.apache.org/jira/browse/HBASE-4820
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
>  Labels: newbie
> Fix For: 0.94.0
>
> Attachments: 
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme.patch, 
> 0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme.patch
>
>
> While reviewing the distributed log splitting feature, we found some cosmetic 
> issues that make the code hard to understand.
> It would be great to fix them.  For this issue, there should be no semantic 
> change.





[jira] [Updated] (HBASE-4820) Distributed log splitting coding enhancement to make it easier to understand, no semantics change

2011-11-25 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4820:
---

Attachment: 
0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme_new.patch

> Distributed log splitting coding enhancement to make it easier to understand, 
> no semantics change
> -
>
> Key: HBASE-4820
> URL: https://issues.apache.org/jira/browse/HBASE-4820
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
>  Labels: newbie
> Fix For: 0.94.0
>
> Attachments: 
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme.patch, 
> 0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme.patch, 
> 0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme_new.patch
>
>
> While reviewing the distributed log splitting feature, we found some cosmetic 
> issues that make the code hard to understand.
> It would be great to fix them.  For this issue, there should be no semantic 
> change.





[jira] [Updated] (HBASE-4820) Distributed log splitting coding enhancement to make it easier to understand, no semantics change

2011-11-25 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4820:
---

Status: Patch Available  (was: Open)

This one is the latest:

0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme_new.patch

> Distributed log splitting coding enhancement to make it easier to understand, 
> no semantics change
> -
>
> Key: HBASE-4820
> URL: https://issues.apache.org/jira/browse/HBASE-4820
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
>  Labels: newbie
> Fix For: 0.94.0
>
> Attachments: 
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme.patch, 
> 0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme.patch, 
> 0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme_new.patch
>
>
> While reviewing the distributed log splitting feature, we found some cosmetic 
> issues that make the code hard to understand.
> It would be great to fix them.  For this issue, there should be no semantic 
> change.





[jira] [Created] (HBASE-4869) Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with edits we know older than what region currently has.

2011-11-25 Thread Jimmy Xiang (Created) (JIRA)
Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with 
edits we know older than what region currently has.
---

 Key: HBASE-4869
 URL: https://issues.apache.org/jira/browse/HBASE-4869
 Project: HBase
  Issue Type: Improvement
  Components: wal
Affects Versions: 0.90.2
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.92.0








[jira] [Commented] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157267#comment-13157267
 ] 

ramkrishna.s.vasudevan commented on HBASE-4868:
---

@Ted
I thought of the same thing.

> testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.0
>Reporter: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4868_trial.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and see whether the method makes sense. 
> If it does, I will check the other cases.





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157270#comment-13157270
 ] 

ramkrishna.s.vasudevan commented on HBASE-4862:
---

If the scenario is valid, do we need to raise the priority of this defect? It 
may not be common, though.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 
> trunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in 
> replayRecoveredEditsIfAny() it deletes the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IO exception and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the file 
> system is OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen as follows:
> 1. Move a region from server A to server B.
> 2. Kill server A and server B.
> 3. Restart server A and server B.
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.
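
One general way to get the effect proposed above (the region opener never deletes a file the split thread is still appending to) is to write under a temporary name and rename the file only once it is complete. The sketch below illustrates that idea on the local filesystem with java.nio.file. It is not the attached patch and does not use the HDFS or HBase APIs; the class name, the ".temp" suffix, and both helper methods are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of HBase.
public class TempFileSplitSketch {

    // Assumed marker for files that are still being written.
    static final String IN_PROGRESS_SUFFIX = ".temp";

    // Writer side: write all entries under a temporary name, then rename
    // to the final name so readers never see a half-written file.
    static Path writeEdits(Path editsDir, String name, List<String> entries)
            throws IOException {
        Path tmp = editsDir.resolve(name + IN_PROGRESS_SUFFIX);
        Files.write(tmp, entries, StandardCharsets.UTF_8);
        return Files.move(tmp, editsDir.resolve(name),
                StandardCopyOption.ATOMIC_MOVE);
    }

    // Reader side: only completed files are candidates for replay and
    // deletion; in-progress files are left alone.
    static List<Path> completedEdits(Path editsDir) throws IOException {
        List<Path> done = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(editsDir)) {
            for (Path f : files) {
                if (!f.getFileName().toString().endsWith(IN_PROGRESS_SUFFIX)) {
                    done.add(f);
                }
            }
        }
        return done;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("recovered.edits");
        // Simulates a file the split thread is still appending to.
        Files.createFile(dir.resolve("123456" + IN_PROGRESS_SUFFIX));
        // A file that has been fully written and renamed.
        writeEdits(dir, "123455", Arrays.asList("edit-1", "edit-2"));
        System.out.println(completedEdits(dir));
    }
}
```

Here the reader only ever sees 123455; the in-progress 123456.temp is skipped until its writer renames it.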





[jira] [Commented] (HBASE-4820) Distributed log splitting coding enhancement to make it easier to understand, no semantics change

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157275#comment-13157275
 ] 

Hadoop QA commented on HBASE-4820:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12505146/0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme_new.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 67 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/367//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/367//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/367//console

This message is automatically generated.

> Distributed log splitting coding enhancement to make it easier to understand, 
> no semantics change
> -
>
> Key: HBASE-4820
> URL: https://issues.apache.org/jira/browse/HBASE-4820
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
>  Labels: newbie
> Fix For: 0.94.0
>
> Attachments: 
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  
> 0001-HBASE-4820-Distributed-log-splitting-coding-enhancement-to-makeit-easier-to-understand,-no-semantics-change..patch,
>  0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme.patch, 
> 0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme.patch, 
> 0001-HBASE-4820-Minor-distributed-log-splitting-enhanceme_new.patch
>
>
> While reviewing the distributed log splitting feature, we found some cosmetic 
> issues that make the code hard to understand.
> It would be great to fix them.  For this issue, there should be no semantic 
> change.





[jira] [Updated] (HBASE-4869) Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with edits we know older than what region currently has.

2011-11-25 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4869:
---

Attachment: 0001-HBASE-4869-Backport-to-0.92-HBASE-4797-availability-.patch

> Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with 
> edits we know older than what region currently has.
> ---
>
> Key: HBASE-4869
> URL: https://issues.apache.org/jira/browse/HBASE-4869
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.90.2
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 
> 0001-HBASE-4869-Backport-to-0.92-HBASE-4797-availability-.patch
>
>






[jira] [Updated] (HBASE-4869) Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with edits we know older than what region currently has.

2011-11-25 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-4869:
---

Status: Patch Available  (was: Open)

Backward port from trunk.

> Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with 
> edits we know older than what region currently has.
> ---
>
> Key: HBASE-4869
> URL: https://issues.apache.org/jira/browse/HBASE-4869
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.90.2
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 
> 0001-HBASE-4869-Backport-to-0.92-HBASE-4797-availability-.patch
>
>






[jira] [Commented] (HBASE-4869) Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with edits we know older than what region currently has.

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157280#comment-13157280
 ] 

Hadoop QA commented on HBASE-4869:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12505147/0001-HBASE-4869-Backport-to-0.92-HBASE-4797-availability-.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/368//console

This message is automatically generated.

> Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with 
> edits we know older than what region currently has.
> ---
>
> Key: HBASE-4869
> URL: https://issues.apache.org/jira/browse/HBASE-4869
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.90.2
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 
> 0001-HBASE-4869-Backport-to-0.92-HBASE-4797-availability-.patch
>
>






[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Status: Patch Available  (was: Open)

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 
> trunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in 
> replayRecoveredEditsIfAny() it deletes the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IO exception and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the file 
> system is OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen as follows:
> 1. Move a region from server A to server B.
> 2. Kill server A and server B.
> 3. Restart server A and server B.
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157289#comment-13157289
 ] 

Hadoop QA commented on HBASE-4862:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12505060/hbase-4862v1+for+trunk.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 67 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestRollingRestart
  org.apache.hadoop.hbase.master.TestRestartCluster
  org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
  org.apache.hadoop.hbase.regionserver.wal.TestHLogBench
  org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD
  org.apache.hadoop.hbase.regionserver.TestAtomicOperation
  org.apache.hadoop.hbase.TestInfoServers
  org.apache.hadoop.hbase.regionserver.TestParallelPut
  org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
  org.apache.hadoop.hbase.regionserver.TestStoreFileBlockCacheSummary
  org.apache.hadoop.hbase.TestRegionRebalancing
  org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort
  org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed
  org.apache.hadoop.hbase.ipc.TestDelayedRpc
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.regionserver.wal.TestWALReplay
  org.apache.hadoop.hbase.master.TestHMasterRPCException
  org.apache.hadoop.hbase.regionserver.TestHRegion
  org.apache.hadoop.hbase.client.TestMultipleTimestamps
  org.apache.hadoop.hbase.regionserver.TestHRegionServerBulkLoad
  org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
  org.apache.hadoop.hbase.client.TestMetaScanner
  org.apache.hadoop.hbase.master.TestMaster
  org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
  org.apache.hadoop.hbase.TestDrainingServer
  org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
  org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
  org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion
  org.apache.hadoop.hbase.avro.TestAvroServer
  org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol
  org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit
  org.apache.hadoop.hbase.thrift.TestThriftServer
  org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics
  org.apache.hadoop.hbase.master.TestMasterFailover
  org.apache.hadoop.hbase.regionserver.wal.TestHLog
  org.apache.hadoop.hbase.TestMultiVersions
  org.apache.hadoop.hbase.master.TestMasterTransitions
  org.apache.hadoop.hbase.master.TestSplitLogManager
  org.apache.hadoop.hbase.master.TestOpenedRegionHandler
  org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/369//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/369//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/369//console

This message is automatically generated.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 
> trunk.diff
>

[jira] [Commented] (HBASE-4863) Make Thrift server thread pool bounded and add a command-line UI test

2011-11-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157290#comment-13157290
 ] 

Hudson commented on HBASE-4863:
---

Integrated in HBase-TRUNK #2482 (See 
[https://builds.apache.org/job/HBase-TRUNK/2482/])
HBASE-4863 Phabricator D531 Make Thrift server thread pool bounded and add 
a command-line UI test

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* /hbase/trunk/src/main/resources/hbase-default.xml
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServerCmdLine.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java


> Make Thrift server thread pool bounded and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, D531.1.patch, 
> D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix after we found that the Thrift server 
> had spawned 15000 threads. To bound the thread pool size, I added a custom 
> thread pool server implementation called HBaseThreadPoolServer to the HBase 
> codebase and made the following parameters configurable both from the command 
> line and as config settings: minWorkerThreads, maxWorkerThreads, and 
> maxQueuedRequests. Under increasing load, the server creates a new thread for 
> every connection until the pool size reaches minWorkerThreads. After that, the 
> server puts new connections into the queue and only creates a new thread when 
> the queue is full. If an attempt to create a new thread fails, the server 
> drops the connection. The default TThreadPoolServer would crash in that case, 
> but that never happened because its thread pool was unbounded, so the server 
> would hang indefinitely, consume a lot of memory, and cause huge latency 
> spikes on the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.





[jira] [Created] (HBASE-4870) [book] developer.xml - adding integration test information

2011-11-25 Thread Doug Meil (Created) (JIRA)
[book] developer.xml - adding integration test information
--

 Key: HBASE-4870
 URL: https://issues.apache.org/jira/browse/HBASE-4870
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: developer_HBASE_4870.xml.patch

developer.xml
* adding integration test information

Note: this patch was supplied by Jesse Yates.





[jira] [Updated] (HBASE-4870) [book] developer.xml - adding integration test information

2011-11-25 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4870:
-

Status: Patch Available  (was: Open)

> [book] developer.xml - adding integration test information
> --
>
> Key: HBASE-4870
> URL: https://issues.apache.org/jira/browse/HBASE-4870
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: developer_HBASE_4870.xml.patch
>
>
> developer.xml
> * adding integration test information
> Note: this patch was supplied by Jesse Yates.





[jira] [Updated] (HBASE-4870) [book] developer.xml - adding integration test information

2011-11-25 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4870:
-

Attachment: developer_HBASE_4870.xml.patch

> [book] developer.xml - adding integration test information
> --
>
> Key: HBASE-4870
> URL: https://issues.apache.org/jira/browse/HBASE-4870
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: developer_HBASE_4870.xml.patch
>
>
> developer.xml
> * adding integration test information
> Note: this patch was supplied by Jesse Yates.





[jira] [Updated] (HBASE-4870) [book] developer.xml - adding integration test information

2011-11-25 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4870:
-

Resolution: Fixed
  Assignee: Jesse Yates  (was: Doug Meil)
Status: Resolved  (was: Patch Available)

> [book] developer.xml - adding integration test information
> --
>
> Key: HBASE-4870
> URL: https://issues.apache.org/jira/browse/HBASE-4870
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Jesse Yates
>Priority: Minor
> Attachments: developer_HBASE_4870.xml.patch
>
>
> developer.xml
> * adding integration test information
> Note: this patch was supplied by Jesse Yates.





[jira] [Created] (HBASE-4871) [book] book updates, doc consolidation and some other things

2011-11-25 Thread Doug Meil (Created) (JIRA)
[book] book updates, doc consolidation and some other things


 Key: HBASE-4871
 URL: https://issues.apache.org/jira/browse/HBASE-4871
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor


Changes to book.xml, performance.xml, and configuration.xml
* Consolidated comments about region-size config information in the Config 
chapter (Performance now refers to Config chapter).  Also cleaned up Arch 
section on regions.
* Consolidated comments about RegionServer handlers in the Config chapter 
(Performance now refers to Config chapter)

Also,
* Added Network section in Troubleshooting chapter regarding network spikes, 
suggesting that users check compactionQueues (this came up on the dist-list 
recently)





[jira] [Updated] (HBASE-4871) [book] book updates, doc consolidation and some other things

2011-11-25 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4871:
-

Attachment: docbkx_HBASE_4871.patch

> [book] book updates, doc consolidation and some other things
> 
>
> Key: HBASE-4871
> URL: https://issues.apache.org/jira/browse/HBASE-4871
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: docbkx_HBASE_4871.patch
>
>
> Changes to book.xml, performance.xml, and configuration.xml
> * Consolidated comments about region-size config information in the Config 
> chapter (Performance now refers to Config chapter).  Also cleaned up Arch 
> section on regions.
> * Consolidated comments about RegionServer handlers in the Config chapter 
> (Performance now refers to Config chapter)
> Also,
> * Added Network section in Troubleshooting chapter regarding network spikes, 
> suggesting that users check compactionQueues (this came up on the dist-list 
> recently)





[jira] [Updated] (HBASE-4871) [book] book updates, doc consolidation and some other things

2011-11-25 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4871:
-

Status: Patch Available  (was: Open)






[jira] [Updated] (HBASE-4871) [book] book updates, doc consolidation and some other things

2011-11-25 Thread Doug Meil (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4871:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)






[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Attachment: 4862.txt

I ran a few tests based on the patch for TRUNK and didn't see any failures.
Reattaching the patch for TRUNK.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for trunk.diff
>
>
> Case Description:
> 1. The split-hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A, and in replayRecoveredEditsIfAny() 
> it deletes the file region A/recovered.edits/123456.
> 3. The split-hlog thread catches the IO exception and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. If skipError = false, it checks the filesystem. The filesystem is of 
> course fine, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen as follows:
> 1. Move a region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that the split-hlog thread is still appending to.
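
The race described above (a writer still appending while a reader replays and deletes the same file) can be illustrated with a minimal sketch. It is an assumption for illustration, not the content of any attached patch: the writer appends under a temporary name and renames the file only once it is complete, and the reader ignores temporary names. It uses java.nio.file on the local filesystem instead of HDFS, and the ".temp" suffix and all method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch on the local filesystem; all names here are hypothetical
// and this is not the code from any attached patch.
class RecoveredEditsSketch {
    static final String TEMP_SUFFIX = ".temp"; // assumed "still being written" marker

    // Splitter: start a new edits file under a temporary name.
    static Path openForAppend(Path dir, String name) {
        try {
            return Files.createFile(dir.resolve(name + TEMP_SUFFIX));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Splitter: append one edit to the in-progress file.
    static void append(Path tempFile, String edit) {
        try {
            Files.write(tempFile, (edit + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Splitter: publish the finished file under its final name.
    static Path publish(Path tempFile) {
        String name = tempFile.getFileName().toString();
        Path done = tempFile.resolveSibling(
                name.substring(0, name.length() - TEMP_SUFFIX.length()));
        try {
            return Files.move(tempFile, done, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Region open: replay and delete finished files, never in-progress ones.
    static List<String> replayAndDelete(Path dir) {
        List<String> replayed = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                if (f.getFileName().toString().endsWith(TEMP_SUFFIX)) {
                    continue; // the splitter still owns this file
                }
                replayed.addAll(Files.readAllLines(f, StandardCharsets.UTF_8));
                Files.delete(f);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return replayed;
    }

    // Returns {edits replayed while the split is running, edits replayed after it}.
    static int[] demo() {
        try {
            Path dir = Files.createTempDirectory("recovered.edits");
            Path temp = openForAppend(dir, "123456");
            append(temp, "edit-1");
            int during = replayAndDelete(dir).size(); // region opens mid-split
            append(temp, "edit-2");                   // writer is unaffected
            publish(temp);
            int after = replayAndDelete(dir).size();
            Files.delete(dir);
            return new int[] {during, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("during=" + r[0] + " after=" + r[1]);
    }
}
```

In this sketch a region open that happens in the middle of the split replays nothing and deletes nothing, so the writer's second append still succeeds; both edits are replayed only after the rename.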





[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Status: Open  (was: Patch Available)






[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Status: Patch Available  (was: Open)

TestHLogSplit passed on MacBook.

Rerun test suite on Jenkins.






[jira] [Updated] (HBASE-4863) Make Thrift server thread pool bounded and add a command-line UI test

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4863:
--

Attachment: 4863.addendum

Addendum to add category for TestThreads

> Make Thrift server thread pool bounded and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 4863.addendum, 
> D531.1.patch, D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer to the HBase codebase, 
> and made the following parameters configurable both from the command line and 
> as config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. 
> Under an increasing load, the server creates a new thread for every connection 
> until the pool size reaches minWorkerThreads. After that, the server puts 
> new connections into the queue and only creates a new thread when the queue 
> is full. If an attempt to create a new thread fails, the server drops the 
> connection. The default TThreadPoolServer would crash in that case, but it 
> never happened because the thread pool was unbounded, so the server would 
> hang indefinitely, consume a lot of memory, and cause huge latency spikes on 
> the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.
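
The pool behaviour described above (grow to minWorkerThreads, then queue, then grow to maxWorkerThreads only when the queue is full, then drop) can be sketched with the JDK's own ThreadPoolExecutor, which follows the same ordering. This is a hedged illustration and not HBaseThreadPoolServer itself; the class name BoundedPoolSketch and the simulate method are hypothetical, and only the three parameter names come from the description.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: NOT HBaseThreadPoolServer. It maps the three
// parameters named above onto the JDK's ThreadPoolExecutor.
class BoundedPoolSketch {

    // Opens `connections` long-lived "connections" and returns
    // {worker threads created, requests queued, connections dropped}.
    static int[] simulate(int minWorkerThreads, int maxWorkerThreads,
                          int maxQueuedRequests, int connections) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                minWorkerThreads, maxWorkerThreads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(maxQueuedRequests));
        CountDownLatch release = new CountDownLatch(1);
        Runnable connection = () -> {
            try {
                release.await(); // hold the worker until the run is measured
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int dropped = 0;
        for (int i = 0; i < connections; i++) {
            try {
                pool.execute(connection);
            } catch (RejectedExecutionException e) {
                dropped++; // pool at max and queue full: drop instead of growing
            }
        }
        int[] result = {pool.getPoolSize(), pool.getQueue().size(), dropped};
        release.countDown(); // let every accepted connection finish
        pool.shutdown();
        return result;
    }

    public static void main(String[] args) {
        int[] r = simulate(2, 4, 3, 10);
        System.out.println("threads=" + r[0] + " queued=" + r[1] + " dropped=" + r[2]);
    }
}
```

With minWorkerThreads=2, maxWorkerThreads=4, maxQueuedRequests=3 and ten simultaneous connections, the sketch ends up with 4 threads, 3 queued requests and 3 dropped connections, so the thread count stays bounded no matter how many clients connect.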





[jira] [Commented] (HBASE-4870) [book] developer.xml - adding integration test information

2011-11-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157326#comment-13157326
 ] 

Hudson commented on HBASE-4870:
---

Integrated in HBase-TRUNK #2483 (See 
[https://builds.apache.org/job/HBase-TRUNK/2483/])
HBASE-4870 developer.xml, integration test info

dmeil : 
Files : 
* /hbase/trunk/src/docbkx/developer.xml


> [book] developer.xml - adding integration test information
> --
>
> Key: HBASE-4870
> URL: https://issues.apache.org/jira/browse/HBASE-4870
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Jesse Yates
>Priority: Minor
> Attachments: developer_HBASE_4870.xml.patch
>
>
> developer.xml
> * adding integration test information
> Note: this patch was supplied by Jesse Yates.





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157327#comment-13157327
 ] 

Hadoop QA commented on HBASE-4862:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12505162/4862.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 67 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/371//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/371//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/371//console

This message is automatically generated.






[jira] [Created] (HBASE-4872) Balancer goes to move a region but its being split so we fail offlining because NodeExists and it crashes master

2011-11-25 Thread stack (Created) (JIRA)
Balancer goes to move a region but its being split so we fail offlining because 
NodeExists and it crashes master


 Key: HBASE-4872
 URL: https://issues.apache.org/jira/browse/HBASE-4872
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: stack
Priority: Critical
 Fix For: 0.92.0


So, offlining should deal with there being an existing znode in SPLITTING or 
SPLIT state.  Here is the log around the master crash.

{code}
// BALANCER RUNS
2011-11-24 07:41:18,686 INFO org.apache.hadoop.hbase.master.HMaster: balance 
hri=TestTable,0936288217,1322120470779.f681f38f85f6aad594f24f90559da051., 
src=sv4r9s38,7003,1322110570576, dest=sv4r31s44,7003,1322110570645
2011-11-24 07:41:18,686 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Starting unassignment of region 
TestTable,0936288217,1322120470779.f681f38f85f6aad594f24f90559da051. (offlining)
2011-11-24 07:41:18,686 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:7001-0x133d3ee451d Creating unassigned node for 
f681f38f85f6aad594f24f90559da051 in a CLOSING state 
2011-11-24 07:41:18,702 ERROR 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
/hbase/unassigned/f681f38f85f6aad594f24f90559da051 already exists and this is 
not a retry

2011-11-24 07:41:18,702 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
server abort: loaded coprocessors are: []
2011-11-24 07:41:18,704 FATAL org.apache.hadoop.hbase.master.HMaster: 
Unexpected ZK exception creating node CLOSING
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
NodeExists for /hbase/unassigned/f681f38f85f6aad594f24f90559da051
at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:839)
at 
org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
at 
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1766)
at 
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1732)
at 
org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:2763)
at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:871)
at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:735)
at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
at java.lang.Thread.run(Thread.java:662)
2011-11-24 07:41:18,706 INFO org.apache.hadoop.hbase.master.HMaster: Aborting


// HERE THE SPLIT SHOWS UP ON MASTER


2011-11-24 07:41:19,310 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_SPLIT, server=sv4r9s38,7003,1322110570576, 
region=f681f38f85f6aad594f24f90559da051
2011-11-24 07:41:19,311 INFO org.apache.hadoop.hbase.master.AssignmentManager: 
Received SPLIT for region f681f38f85f6aad594f24f90559da051 from server 
sv4r9s38,7003,1322110570576 but region was not first in SPLITTING state; 
continuing
2011-11-24 07:41:19,311 DEBUG 
org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handling SPLIT event 
for f681f38f85f6aad594f24f90559da051; deleting node
2011-11-24 07:41:19,311 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:7001-0x133d3ee451d Deleting existing unassigned node for 
f681f38f85f6aad594f24f90559da051 that is in expected state RS_ZK_REGION_SPLIT
2011-11-24 07:41:19,401 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:7001-0x133d3ee451d Successfully deleted unassigned node for region 
f681f38f85f6aad594f24f90559da051 in expected state RS_ZK_REGION_SPLIT
2011-11-24 07:41:19,401 INFO 
org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handled SPLIT 
report); 
parent=TestTable,0936288217,1322120470779.f681f38f85f6aad594f24f90559da051. 
daughter 
a=TestTable,0936288217,1322120475927.98808642d25058c930fc9e5974f3715d.daughter 
b=TestTable,0936327449,1322120475927.00ba03057ddc6d3868fb6ef349cb9a2c.
{code}
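
The handling suggested above (offlining that tolerates an existing znode in SPLITTING or SPLIT state instead of aborting the master) could look roughly like the following. It is only a sketch under stated assumptions: an in-memory map stands in for the /hbase/unassigned znodes, and OfflineSketch, Outcome, regionServerSetsState and unassign are hypothetical names, not HBase or ZooKeeper API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch: an in-memory map stands in for the /hbase/unassigned
// znodes, and every name below is hypothetical rather than HBase's real API.
class OfflineSketch {
    enum Outcome { CLOSING_CREATED, SKIPPED_SPLIT_IN_PROGRESS, UNEXPECTED_STATE }

    // encoded region name -> state of its unassigned node
    private final ConcurrentMap<String, String> unassigned = new ConcurrentHashMap<>();

    // What a regionserver would do when it starts or finishes a split.
    void regionServerSetsState(String region, String state) {
        unassigned.put(region, state);
    }

    // Master side: try to create the CLOSING node, but tolerate a split in flight.
    Outcome unassign(String region) {
        String existing = unassigned.putIfAbsent(region, "CLOSING");
        if (existing == null) {
            return Outcome.CLOSING_CREATED;
        }
        if (existing.equals("SPLITTING") || existing.equals("SPLIT")) {
            // Leave the region to the split handler and skip this balance move.
            return Outcome.SKIPPED_SPLIT_IN_PROGRESS;
        }
        return Outcome.UNEXPECTED_STATE; // caller decides; not an automatic abort
    }

    public static void main(String[] args) {
        OfflineSketch zk = new OfflineSketch();
        zk.regionServerSetsState("f681f38f85f6aad594f24f90559da051", "SPLIT");
        System.out.println(zk.unassign("f681f38f85f6aad594f24f90559da051"));
        System.out.println(zk.unassign("some-other-region"));
    }
}
```

The point of the sketch is that "node already exists" becomes a normal return value the balancer can act on, rather than an exception that propagates up to a master abort.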

Here is the regionserver side of things at the time of the split:

{code}
2011-11-24 07:41:15,927 DEBUG 
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for 
TestTable,0936288217,1322120470779.f681f38f85f6aad594f24f90559da051..  
compaction_queue=(1:0), split_queue=1
2011-11-24 07:41:15,928 INFO 
org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region 
TestTable,0936288217,1322120470779.f681f38f85f6aad594f24f90559da051.

[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Attachment: (was: 4862.txt)






[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Status: Open  (was: Patch Available)






[jira] [Commented] (HBASE-4863) Make Thrift server thread pool bounded and add a command-line UI test

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157328#comment-13157328
 ] 

Hadoop QA commented on HBASE-4863:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12505163/4863.addendum
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/372//console

This message is automatically generated.






[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Attachment: 4862.txt






[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Status: Patch Available  (was: Open)

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for trunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in the process 
> replayRecoveredEditsIfAny() will delete the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IOException and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the file 
> system is OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen in the following way:
> 1. Move a region from server A to server B.
> 2. Kill server A and server B.
> 3. Restart server A and server B.
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.
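
The proposed fix (do not delete a recovered.edits file while the split thread is still appending to it) can be illustrated with a minimal, JDK-only sketch. This is not the code from the attached patches: the class InProgressEditsRegistry and its methods are hypothetical names, and the registry lives in one process only, whereas in HBase the splitter and the region opener may run in different processes.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical registry of recovered.edits files that a split thread is still writing. */
final class InProgressEditsRegistry {
    private final Set<String> beingWritten = new HashSet<>();

    /** Called by the (hypothetical) split thread before it opens a writer for the path. */
    synchronized boolean beginWrite(String path) {
        return beingWritten.add(path);
    }

    /** Called by the split thread once the writer for the path has been closed. */
    synchronized void endWrite(String path) {
        beingWritten.remove(path);
    }

    /**
     * Called by the region-open path. Runs deleteAction only if nobody is
     * appending to the path; the check and the delete happen under one lock,
     * so a writer cannot register in between.
     */
    synchronized boolean deleteIfIdle(String path, Runnable deleteAction) {
        if (beingWritten.contains(path)) {
            return false; // still being appended to: refuse to delete
        }
        deleteAction.run();
        return true;
    }
}
```

With such a guard, the region opener would skip (or retry later) instead of deleting a file out from under the writer, so the split thread never sees the IOException described in step 3.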





[jira] [Updated] (HBASE-4863) Make Thrift server thread pool bounded and add a command-line UI test

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4863:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Addendum applied to TRUNK.

> Make Thrift server thread pool bounded and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 4863.addendum, 
> D531.1.patch, D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer into the HBase 
> codebase, and made the following parameters configurable both from the 
> command line and as config settings: minWorkerThreads, maxWorkerThreads, and 
> maxQueuedRequests. Under an increasing load, the server creates a new thread 
> for every connection until the pool size reaches minWorkerThreads. After 
> that, the server puts new connections into the queue and only creates a new 
> thread when the queue is full. If an attempt to create a new thread fails, 
> the server drops the connection. The default TThreadPoolServer would crash in 
> that case, but it never happened because the thread pool was unbounded, so 
> the server would hang indefinitely, consume a lot of memory, and cause huge 
> latency spikes on the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.
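
The queueing behaviour described above closely matches what java.util.concurrent.ThreadPoolExecutor does with a bounded queue, so it can be sketched with the JDK alone. This is only an illustration, not the HBaseThreadPoolServer code from the patch: the class BoundedPoolSketch and its methods are hypothetical, and "dropping" here means rejecting a task when both the pool and the queue are full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPoolSketch {
    /** Builds a pool bounded by the three limits named in the description. */
    static ThreadPoolExecutor newPool(int minWorkerThreads,
                                      int maxWorkerThreads,
                                      int maxQueuedRequests) {
        // Below minWorkerThreads every task gets a new thread; after that
        // tasks are queued, and extra threads (up to maxWorkerThreads) are
        // created only when the queue is full.
        return new ThreadPoolExecutor(
                minWorkerThreads, maxWorkerThreads,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(maxQueuedRequests));
    }

    /** Returns true if the handler was accepted, false if it was dropped. */
    static boolean submitOrDrop(ThreadPoolExecutor pool, Runnable connectionHandler) {
        try {
            pool.execute(connectionHandler);
            return true;
        } catch (RejectedExecutionException e) {
            // Pool and queue are both full: drop instead of growing without bound.
            return false;
        }
    }
}
```

The default rejection policy (AbortPolicy) is kept on purpose, so that overload surfaces as an exception the caller can turn into a dropped connection.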





[jira] [Created] (HBASE-4873) Port HBASE-4863 to thrift2/ThriftServer

2011-11-25 Thread Ted Yu (Created) (JIRA)
Port HBASE-4863 to thrift2/ThriftServer
---

 Key: HBASE-4873
 URL: https://issues.apache.org/jira/browse/HBASE-4873
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu


HBASE-4863 introduced a bounded thread pool for the Thrift server.
thrift2/ThriftServer should have this enhancement as well.





[jira] [Commented] (HBASE-4872) Balancer goes to move a region but its being split so we fail offlining because NodeExists and it crashes master

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157333#comment-13157333
 ] 

Ted Yu commented on HBASE-4872:
---

This seems similar to HBASE-4729

Maybe we should introduce an interlock between region movement and region 
splitting.
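
One way to picture such an interlock is a per-region claim that either a move or a split must win before proceeding. The sketch below is a JDK-only stand-in under that assumption: RegionInterlockSketch and its methods are hypothetical, and a real interlock between the master and a regionserver would have to be coordinated across processes (for example through ZooKeeper), which this does not attempt.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class RegionInterlockSketch {
    enum Operation { MOVE, SPLIT }

    // Maps an encoded region name to the operation that currently owns it.
    private final ConcurrentMap<String, Operation> inProgress =
            new ConcurrentHashMap<>();

    /** Claims the region for op; returns false if another operation already holds it. */
    boolean tryBegin(String encodedRegionName, Operation op) {
        return inProgress.putIfAbsent(encodedRegionName, op) == null;
    }

    /** Releases the claim, but only if it is held by the same operation. */
    void end(String encodedRegionName, Operation op) {
        inProgress.remove(encodedRegionName, op);
    }
}
```

With a claim like this, a balancer that loses the race would skip the region for this round rather than hit an unexpected existing node and abort.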

> Balancer goes to move a region but its being split so we fail offlining 
> because NodeExists and it crashes master
> 
>
> Key: HBASE-4872
> URL: https://issues.apache.org/jira/browse/HBASE-4872
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: stack
>Priority: Critical
> Fix For: 0.92.0
>
>
> So, offlining should deal with there being an existing znode in SPLITTING or 
> SPLIT state.  Here is the log around the master crash.
> {code}
> // BALANCER RUNS
> 2011-11-24 07:41:18,686 INFO org.apache.hadoop.hbase.master.HMaster: balance 
> hri=TestTable,0936288217,1322120470779.f681f38f85f6aad594f24f90559da051., 
> src=sv4r9s38,7003,1322110570576, dest=sv4r31s44,7003,1322110570645
> 2011-11-24 07:41:18,686 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
> region TestTable,0936288217,1322120470779.f681f38f85f6aad594f24f90559da051. 
> (offlining)
> 2011-11-24 07:41:18,686 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:7001-0x133d3ee451d Creating unassigned node for 
> f681f38f85f6aad594f24f90559da051 in a CLOSING state 
> 2011-11-24 07:41:18,702 ERROR 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node 
> /hbase/unassigned/f681f38f85f6aad594f24f90559da051 already exists and this is 
> not a retry
> 2011-11-24 07:41:18,702 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2011-11-24 07:41:18,704 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unexpected ZK exception creating node CLOSING
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
> NodeExists for /hbase/unassigned/f681f38f85f6aad594f24f90559da051
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:839)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1766)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1732)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:2763)
> at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:871)
> at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:735)
> at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
> at java.lang.Thread.run(Thread.java:662)
> 2011-11-24 07:41:18,706 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> // HERE THE SPLIT SHOWS UP ON MASTER
> 2011-11-24 07:41:19,310 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_SPLIT, server=sv4r9s38,7003,1322110570576, 
> region=f681f38f85f6aad594f24f90559da051
> 2011-11-24 07:41:19,311 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 
> f681f38f85f6aad594f24f90559da051 from server sv4r9s38,7003,1322110570576 but 
> region was not first in SPLITTING state; continuing
> 2011-11-24 07:41:19,311 DEBUG 
> org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handling SPLIT 
> event for f681f38f85f6aad594f24f90559da051; deleting node
> 2011-11-24 07:41:19,311 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:7001-0x133d3ee451d Deleting existing unassigned node for 
> f681f38f85f6aad594f24f90559da051 that is in expected state RS_ZK_REGION_SPLIT
> 2011-11-24 07:41:19,401 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:7001-0x133d3ee451d Successfully deleted unassigned node for region 
> f681f38f85f6aad594f24f90559da051 in expected state RS_ZK_REGION_SPLIT
> 2011-11-24 07:41:19,401 INFO 
> org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handled SPLIT 
> report); 
> parent=TestTable,0936288217,1322120470779.f681f38f85f6aad594f24f90559da051. 
> daughter 
> a=TestTable,0936288217,1322120475927.98808642d25058c930fc9e5974f3715d.daughter
>  b=Tes

[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157334#comment-13157334
 ] 

Hadoop QA commented on HBASE-4862:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12505167/4862.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 67 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/373//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/373//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/373//console

This message is automatically generated.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for trunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in the process 
> replayRecoveredEditsIfAny() will delete the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IOException and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the file 
> system is OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen in the following way:
> 1. Move a region from server A to server B.
> 2. Kill server A and server B.
> 3. Restart server A and server B.
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157339#comment-13157339
 ] 

Ted Yu commented on HBASE-4862:
---

I was able to run the test suite by executing 'mvn test' on a MacBook.
PreCommit builds 371 and 373 didn't run any tests.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for trunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in the process 
> replayRecoveredEditsIfAny() will delete the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IOException and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the file 
> system is OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen in the following way:
> 1. Move a region from server A to server B.
> 2. Kill server A and server B.
> 3. Restart server A and server B.
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.





[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4862:
--

Priority: Critical  (was: Major)

Lifting priority as Ramkrishna suggested.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for trunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in the process 
> replayRecoveredEditsIfAny() will delete the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IOException and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the file 
> system is OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen in the following way:
> 1. Move a region from server A to server B.
> 2. Kill server A and server B.
> 3. Restart server A and server B.
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157341#comment-13157341
 ] 

Ted Yu commented on HBASE-4862:
---

When attaching a patch, please grant the license to the ASF.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for trunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in the process 
> replayRecoveredEditsIfAny() will delete the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IOException and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the file 
> system is OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen in the following way:
> 1. Move a region from server A to server B.
> 2. Kill server A and server B.
> 3. Restart server A and server B.
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.





[jira] [Commented] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-25 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157345#comment-13157345
 ] 

Lars Hofhansl commented on HBASE-4838:
--

Finally found the problem!!
It was indeed a problem in HalfStoreFileReader (as Todd and Stack have 
suggested all along). HalfStoreFileReader did not have a getScanner(final 
boolean cacheBlocks, final boolean pread, final boolean isCompaction) method, 
and hence the super method was called returning a normal reader.

bq. Reference.java and HalfStoreFileReader.java are identical between 0.92 and 
trunk

I don't know what I was comparing here (must have been late in the night), but 
they are different and that difference was exactly the problem. Could have 
saved myself about 16 hours of debugging.
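
The failure mode described here (a subclass that overrides one overload of a method but not another, so callers of the missing overload silently get the superclass behaviour) can be reproduced in a few lines. The classes below are hypothetical stand-ins, not the real HalfStoreFileReader or its parent, and strings take the place of scanner objects.

```java
/** Hypothetical stand-in for the parent reader with two getScanner overloads. */
class BaseReaderSketch {
    String getScanner(boolean cacheBlocks, boolean pread) {
        return "full-file scanner";
    }

    String getScanner(boolean cacheBlocks, boolean pread, boolean isCompaction) {
        return "full-file scanner";
    }
}

/** Overrides only the two-argument overload, like the reader before the fix. */
class HalfReaderBefore extends BaseReaderSketch {
    @Override
    String getScanner(boolean cacheBlocks, boolean pread) {
        return "half-file scanner";
    }
    // No three-argument override: that call falls through to the parent.
}

/** Adds the missing three-argument override. */
class HalfReaderAfter extends HalfReaderBefore {
    @Override
    String getScanner(boolean cacheBlocks, boolean pread, boolean isCompaction) {
        return "half-file scanner";
    }
}
```

Calling the three-argument overload on HalfReaderBefore yields the parent's result, which is the kind of silent fallback that compiles cleanly and only shows up as wrong data at run time.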


> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt
>
>
> Moving the backport into a separate issue (as suggested by JonH), because 
> this is not trivial.





[jira] [Commented] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157346#comment-13157346
 ] 

Jonathan Hsieh commented on HBASE-4868:
---

@Jinchao

Also, if this is a call that can block indefinitely, I'd add timeout values 
for the test.

So instead of just 

{code}
@Test
{code}

change it to 

{code}
@Test(timeout=180000)  // fail test after 180s; the timeout is in milliseconds
{code}
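
The idea behind a test timeout (run the possibly blocking call on another thread and give up after a deadline) can also be sketched with the JDK alone. This is an illustration of the general technique, not how JUnit implements it; TimeoutSketch and finishesWithin are hypothetical names.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutSketch {
    /** Runs the task; returns false if it does not finish within timeoutMillis. */
    static boolean finishesWithin(Runnable task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the task that is still blocked
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // The task itself failed; surface its cause to the caller.
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A blocked call then shows up as a fast, explicit failure instead of a build that hangs until the CI job is killed.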

> testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.0
>Reporter: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4868_trial.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and see whether the approach makes sense. 
> If it does, I will check the other cases.





[jira] [Updated] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-25 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4838:
-

Attachment: 4838-v2.txt

> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt, 4838-v2.txt
>
>
> Moving the backport into a separate issue (as suggested by JonH), because 
> this is not trivial.





[jira] [Commented] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157347#comment-13157347
 ] 

Ted Yu commented on HBASE-4838:
---

Superb, Lars.

> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt, 4838-v2.txt
>
>
> Moving the backport into a separate issue (as suggested by JonH), because 
> this is not trivial.





[jira] [Updated] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-25 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4838:
-

Attachment: (was: 4838-v2.txt)

> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt, 4838-v3.txt
>
>
> Moving the backport into a separate issue (as suggested by JonH), because 
> this is not trivial.





[jira] [Updated] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-25 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4838:
-

Attachment: 4838-v3.txt

> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt, 4838-v3.txt
>
>
> Moving the backport into a separate issue (as suggested by JonH), because 
> this is not trivial.





[jira] [Commented] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-25 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157349#comment-13157349
 ] 

Lars Hofhansl commented on HBASE-4838:
--

Running all tests now.

> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt, 4838-v3.txt
>
>
> Moving the backport into a separate issue (as suggested by JonH), because 
> this is not trivial.





[jira] [Issue Comment Edited] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-25 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157345#comment-13157345
 ] 

Lars Hofhansl edited comment on HBASE-4838 at 11/26/11 12:30 AM:
-

Finally found the problem!!
It was indeed a problem in HalfStoreFileReader (as Todd and Stack have 
suggested all along). HalfStoreFileReader did not have a getScanner(final 
boolean cacheBlocks, final boolean pread, final boolean isCompaction) method, 
and hence the super method was called, returning a "normal" ScannerV2, instead 
of the inner HFileScanner instance.

bq. Reference.java and HalfStoreFileReader.java are identical between 0.92 and 
trunk

I don't know what I was comparing here (must have been late in the night), but 
they are different and that difference was exactly the problem. Could have 
saved myself about 16 hours of debugging.


  was (Author: lhofhansl):
Finally found the problem!!
It was indeed a problem in HalfStoreFileReader (as Todd and Stack have 
suggested all along). HalfStoreFileReader did not have a getScanner(final 
boolean cacheBlocks, final boolean pread, final boolean isCompaction) method, 
and hence the super method was called returning a normal reader.

bq. Reference.java and HalfStoreFileReader.java are identical between 0.92 and 
trunk

I don't know what I was comparing here (must have been late in the night), but 
they are different and that difference was exactly the problem. Could have 
saved myself about 16 hours of debugging.

  
> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt, 4838-v3.txt
>
>
> Moving the backport into a separate issue (as suggested by JonH), because 
> this is not trivial.





[jira] [Commented] (HBASE-4773) HBaseAdmin leaks ZooKeeper connections

2011-11-25 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157352#comment-13157352
 ] 

gaojinchao commented on HBASE-4773:
---

In TRUNK, before throwing the exception, we should call deleteStaleConnection 
to clean up the stale data.


> HBaseAdmin leaks ZooKeeper connections
> --
>
> Key: HBASE-4773
> URL: https://issues.apache.org/jira/browse/HBASE-4773
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 4773.patch
>
>
> When the master crashes, HBaseAdmin leaks ZooKeeper connections.
> I think we should close the ZK connection when throwing 
> MasterNotRunningException:
>   public HBaseAdmin(Configuration c)
>       throws MasterNotRunningException, ZooKeeperConnectionException {
>     this.conf = HBaseConfiguration.create(c);
>     this.connection = HConnectionManager.getConnection(this.conf);
>     this.pause = this.conf.getLong("hbase.client.pause", 1000);
>     this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
>     this.retryLongerMultiplier =
>         this.conf.getInt("hbase.client.retries.longer.multiplier", 10);
>     // We should add this code to close the ZK connection.
>     try {
>       this.connection.getMaster();
>     } catch (MasterNotRunningException e) {
>       HConnectionManager.deleteConnection(conf, false);
>       throw e;
>     }
>   }
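
The general pattern behind this fix (release a resource acquired in a constructor if a later step in the same constructor fails) can be shown without any HBase classes. Everything below is a hypothetical stand-in: FakeConnection plays the role of the connection that holds a ZooKeeper session, FakeAdmin plays the role of HBaseAdmin, and an unchecked exception stands in for MasterNotRunningException.

```java
final class LeakSketch {
    /** Hypothetical stand-in for a connection that holds a ZooKeeper session. */
    static final class FakeConnection implements AutoCloseable {
        static int open = 0; // number of connections currently open

        private final boolean masterRunning;

        FakeConnection(boolean masterRunning) {
            this.masterRunning = masterRunning;
            open++;
        }

        void getMaster() {
            if (!masterRunning) {
                throw new IllegalStateException("master not running");
            }
        }

        @Override
        public void close() {
            open--;
        }
    }

    /** Hypothetical stand-in for the admin client. */
    static final class FakeAdmin {
        final FakeConnection connection;

        FakeAdmin(boolean masterRunning) {
            FakeConnection c = new FakeConnection(masterRunning);
            try {
                c.getMaster();
            } catch (RuntimeException e) {
                c.close(); // release the connection before propagating
                throw e;
            }
            this.connection = c;
        }
    }
}
```

Without the close in the catch block, a client that retries construction in a loop while the master is down would accumulate one open connection per attempt.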





[jira] [Commented] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157353#comment-13157353
 ] 

gaojinchao commented on HBASE-4868:
---

Thanks for the review; I will address all the comments.


> testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.92.0
>Reporter: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4868_trial.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and see whether the approach makes sense. 
> If it does, I will check the other cases.





[jira] [Commented] (HBASE-4871) [book] book updates, doc consolidation and some other things

2011-11-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157357#comment-13157357
 ] 

Hudson commented on HBASE-4871:
---

Integrated in HBase-TRUNK #2484 (See 
[https://builds.apache.org/job/HBase-TRUNK/2484/])
HBASE-4871 hbase book. docs cleanup.

dmeil : 
Files : 
* /hbase/trunk/src/docbkx/book.xml
* /hbase/trunk/src/docbkx/configuration.xml
* /hbase/trunk/src/docbkx/performance.xml
* /hbase/trunk/src/docbkx/troubleshooting.xml


> [book] book updates, doc consolidation and some other things
> 
>
> Key: HBASE-4871
> URL: https://issues.apache.org/jira/browse/HBASE-4871
> Project: HBase
>  Issue Type: Improvement
>Reporter: Doug Meil
>Assignee: Doug Meil
>Priority: Minor
> Attachments: docbkx_HBASE_4871.patch
>
>
> Changes to book.xml, performance.xml, and configuration.xml
> * Consolidated comments about region-size config information in the Config 
> chapter (Performance now refers to Config chapter).  Also cleaned up Arch 
> section on regions.
> * Consolidated comments about RegionServer handlers in the Config chapter 
> (Performance now refers to Config chapter)
> Also,
> * Added Network section in Troubleshooting chapter regarding network spikes, 
> suggesting that users check compactionQueues (this came up on the dist-list 
> recently)





[jira] [Commented] (HBASE-4863) Make Thrift server thread pool bounded and add a command-line UI test

2011-11-25 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157356#comment-13157356
 ] 

Hudson commented on HBASE-4863:
---

Integrated in HBase-TRUNK #2484 (See 
[https://builds.apache.org/job/HBase-TRUNK/2484/])
HBASE-4863 Addendum to add category for TestThreads

tedyu : 
Files : 
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java


> Make Thrift server thread pool bounded and add a command-line UI test
> -
>
> Key: HBASE-4863
> URL: https://issues.apache.org/jira/browse/HBASE-4863
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.94.0
>
> Attachments: 
> 0001-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 
> 0002-Fix-thread-leaks-in-the-HBase-thread-pool-server.patch, 4863.addendum, 
> D531.1.patch, D531.2.patch, D531.3.patch, D531.4.patch
>
>
> This started as an internal hotfix where we found out that the Thrift server 
> spawned 15000 threads. To bound the thread pool size I added a custom thread 
> pool server implementation called HBaseThreadPoolServer into HBase codebase, 
> and made the following parameters configurable from both command line and as 
> config settings: minWorkerThreads, maxWorkerThreads, and maxQueuedRequests. 
> Under an increasing load, the server creates new threads for every connection 
> before the pool size reaches minWorkerThreads. After that, the server puts 
> new connections into the queue and only creates a new thread when the queue 
> is full. If an attempt to create a new thread fails, the server drops the 
> connection. The default TThreadPoolServer would crash in that case, but it 
> never happened because the thread pool was unbounded, so the server would 
> hang indefinitely, consume a lot of memory, and cause huge latency spikes on 
> the client side.
> Another part of this fix is refactoring and unit testing of the command-line 
> part of the Thrift server. The logic there is sufficiently complicated, and 
> the existing ThriftServer class does not test that part at all. The new 
> TestThriftServerCmdLine test starts the Thrift server on a random port with 
> various combinations of options and talks to it through the client API from 
> another thread.
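
As an illustration only (this is not the HBaseThreadPoolServer code from the attached patches), the queueing policy described above closely resembles what java.util.concurrent.ThreadPoolExecutor does with a bounded queue. In the sketch below, the class name BoundedPoolSketch and both helper methods are hypothetical; minWorkerThreads, maxWorkerThreads, and maxQueuedRequests are simply mapped onto the executor's core size, maximum size, and queue capacity. One difference to keep in mind: the executor rejects work once the maximum size is reached and the queue is full, whereas the description talks about dropping the connection when thread creation fails.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a bounded worker pool built from the standard executor.
class BoundedPoolSketch {

    // Assumption: the three settings map onto core size, max size, and queue capacity.
    static ThreadPoolExecutor newBoundedPool(int minWorkerThreads,
                                             int maxWorkerThreads,
                                             int maxQueuedRequests) {
        return new ThreadPoolExecutor(
                minWorkerThreads, maxWorkerThreads,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(maxQueuedRequests),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits `requests` tasks that block until `release` is counted down,
    // and returns how many of them the pool refused to accept.
    static int countRejected(ThreadPoolExecutor pool, int requests, CountDownLatch release) {
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // stands in for "drop the connection"
            }
        }
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = newBoundedPool(2, 4, 3);
        int rejected = countRejected(pool, 10, release);
        System.out.println("threads=" + pool.getPoolSize()
                + " queued=" + pool.getQueue().size()
                + " rejected=" + rejected);
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With 2 core threads, a maximum of 4, and a queue of 3, submitting 10 tasks that all block leaves 4 threads busy, 3 requests queued, and 3 rejected.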





[jira] [Commented] (HBASE-4838) Port 2856 (TestAcidGuarantee is failing) to 0.92

2011-11-25 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157362#comment-13157362
 ] 

Lars Hofhansl commented on HBASE-4838:
--

Looking pretty good...

{noformat}
Results :

Failed tests:   testClosing(org.apache.hadoop.hbase.client.TestHCM)
  testWorkerAbort(org.apache.hadoop.hbase.master.TestDistributedLogSplitting): 
expected:<1> but was:<0>

Tests run: 1065, Failures: 2, Errors: 0, Skipped: 7
{noformat}

testClosing is unrelated.

And a 2nd run for distributed log splitting:
{noformat}
Running org.apache.hadoop.hbase.master.TestDistributedLogSplitting
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 115.653 sec

Results :

Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
{noformat}


> Port 2856 (TestAcidGuarantee is failing) to 0.92
> 
>
> Key: HBASE-4838
> URL: https://issues.apache.org/jira/browse/HBASE-4838
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.92.0
>
> Attachments: 4838-v1.txt, 4838-v3.txt
>
>
> Moving the backport into a separate issue (as suggested by JonH), because this 
> is not trivial.





[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread chunhui shen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-4862:


Attachment: hbase-4862v1 for trunk.diff
hbase-4862v1 for 0.90.diff

Grant license to ASF for  the attached patch 

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for 
> trunk.diff
>
>
> Case Description:
> 1. The split-hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries to it.
> 2. The regionserver is opening region A at the same time, and in 
> replayRecoveredEditsIfAny() it deletes the file region 
> A/recovered.edits/123456.
> 3. The split-hlog thread catches the IO exception and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. If skipError = false, it checks the filesystem. The filesystem is of 
> course OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen as follows:
> 1. Move a region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that the split-hlog thread is still appending to.
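
One possible way to keep the replaying side away from a file that is still being appended to, sketched here under assumptions and not taken from the attached patches, is for the writer to use an in-progress name and rename the file only once it is complete. The class TempThenRenameSketch, the .temp suffix, and both method names are hypothetical; the sketch uses the local filesystem through java.nio.file rather than HDFS.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: publish an edits file by rename so readers never see it half-written.
class TempThenRenameSketch {

    // Assumed marker for files whose writer is still active.
    static final String IN_PROGRESS_SUFFIX = ".temp";

    // Writer side: write under the in-progress name, then rename to the final name.
    static Path writeEdits(Path dir, String name, List<String> entries) throws IOException {
        Path inProgress = dir.resolve(name + IN_PROGRESS_SUFFIX);
        Files.write(inProgress, entries, StandardCharsets.UTF_8);
        return Files.move(inProgress, dir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    // Reader side: only files without the in-progress suffix may be replayed or deleted.
    static List<Path> completedFiles(Path dir) throws IOException {
        List<Path> done = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (!p.getFileName().toString().endsWith(IN_PROGRESS_SUFFIX)) {
                    done.add(p);
                }
            }
        }
        return done;
    }
}
```

A reader that only considers completedFiles() never replays or deletes a file whose writer is still active, at the cost of one extra rename per file.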





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157367#comment-13157367
 ] 

Hadoop QA commented on HBASE-4862:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12505172/hbase-4862v1+for+trunk.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -162 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 67 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/374//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/374//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/374//console

This message is automatically generated.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for 
> trunk.diff
>
>
> Case Description:
> 1. The split-hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries to it.
> 2. The regionserver is opening region A at the same time, and in 
> replayRecoveredEditsIfAny() it deletes the file region 
> A/recovered.edits/123456.
> 3. The split-hlog thread catches the IO exception and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. If skipError = false, it checks the filesystem. The filesystem is of 
> course OK, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen as follows:
> 1. Move a region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that the split-hlog thread is still appending to.





[jira] [Created] (HBASE-4874) TestHCM#testClosing fails on Linux

2011-11-25 Thread Ted Yu (Created) (JIRA)
TestHCM#testClosing fails on Linux
--

 Key: HBASE-4874
 URL: https://issues.apache.org/jira/browse/HBASE-4874
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu


TestHCM#testClosing fails on Linux





[jira] [Updated] (HBASE-4874) TestHCM#testClosing fails if not enough entropy is available

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4874:
--

Description: TestHCM#testClosing fails if not enough entropy is available  
(was: TestHCM#testClosing fails on Linux)
Summary: TestHCM#testClosing fails if not enough entropy is available  
(was: TestHCM#testClosing fails on Linux)

> TestHCM#testClosing fails if not enough entropy is available
> 
>
> Key: HBASE-4874
> URL: https://issues.apache.org/jira/browse/HBASE-4874
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>
> TestHCM#testClosing fails if not enough entropy is available





[jira] [Created] (HBASE-4875) ZKLeaderManager.handleLeaderChange() doesn't handle KeeperException$SessionExpiredException

2011-11-25 Thread Ted Yu (Created) (JIRA)
ZKLeaderManager.handleLeaderChange() doesn't handle 
KeeperException$SessionExpiredException
---

 Key: HBASE-4875
 URL: https://issues.apache.org/jira/browse/HBASE-4875
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu


TestMasterFailover#testSimpleMasterFailover has failed twice in a row for 
builds 15 and 16.

From 
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92-security/16/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testSimpleMasterFailover/:
{code}
2011-11-26 01:34:49,218 ERROR [Thread-1-EventThread] 
zookeeper.ZooKeeperWatcher(403): master:52934-0x133dd828131 Received 
unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/tokenauth/keymaster
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:225)
at 
org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:85)
at 
org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:281)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
2011-11-26 01:34:49,216 DEBUG 
[RegionServer:2;hemera.apache.org,44702,1322271278232-EventThread] 
zookeeper.ZKUtil(230): hconnection-0x133dd828139 /hbase/master does not 
exist. Watcher is set.
2011-11-26 01:34:49,215 DEBUG [Thread-1-EventThread] zookeeper.ZKUtil(230): 
master:44883-0x133dd828132 /hbase/master does not exist. Watcher is set.
2011-11-26 01:34:49,219 DEBUG [Thread-1-EventThread] 
master.ActiveMasterManager(104): No master available. Notifying waiting threads
2011-11-26 01:34:49,215 INFO  [Master:1;hemera.apache.org,52934,1322271278115] 
master.HMaster(338): HMaster main thread exiting
{code}






[jira] [Updated] (HBASE-4875) ZKLeaderManager.handleLeaderChange() doesn't handle KeeperException$SessionExpiredException

2011-11-25 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4875:
--

Description: 
TestMasterFailover#testSimpleMasterFailover has failed twice in a row for 
builds 15 and 16.

From 
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92-security/16/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testSimpleMasterFailover/:
{code}
2011-11-26 01:34:49,217 DEBUG 
[RegionServer:0;hemera.apache.org,57516,1322271278190-EventThread] 
zookeeper.ZooKeeperWatcher(257): regionserver:57516-0x133dd828133 Received 
ZooKeeper Event, type=NodeDeleted, state=SyncConnected, 
path=/hbase/tokenauth/keymaster
2011-11-26 01:34:49,217 WARN  [Thread-1-EventThread] zookeeper.ZKUtil(234): 
master:52934-0x133dd828131 Unable to set watcher on znode 
/hbase/tokenauth/keymaster
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/tokenauth/keymaster
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:225)
at 
org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:85)
at 
org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:281)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
2011-11-26 01:34:49,218 ERROR [Thread-1-EventThread] 
zookeeper.ZooKeeperWatcher(403): master:52934-0x133dd828131 Received 
unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/tokenauth/keymaster
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:225)
at 
org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:85)
at 
org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:281)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
2011-11-26 01:34:49,216 DEBUG 
[RegionServer:2;hemera.apache.org,44702,1322271278232-EventThread] 
zookeeper.ZKUtil(230): hconnection-0x133dd828139 /hbase/master does not 
exist. Watcher is set.
2011-11-26 01:34:49,215 DEBUG [Thread-1-EventThread] zookeeper.ZKUtil(230): 
master:44883-0x133dd828132 /hbase/master does not exist. Watcher is set.
2011-11-26 01:34:49,219 DEBUG [Thread-1-EventThread] 
master.ActiveMasterManager(104): No master available. Notifying waiting threads
2011-11-26 01:34:49,215 INFO  [Master:1;hemera.apache.org,52934,1322271278115] 
master.HMaster(338): HMaster main thread exiting
{code}


  was:
TestMasterFailover#testSimpleMasterFailover has failed twice in a row for 
builds 15 and 16.

From 
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92-security/16/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testSimpleMasterFailover/:
{code}
2011-11-26 01:34:49,218 ERROR [Thread-1-EventThread] 
zookeeper.ZooKeeperWatcher(403): master:52934-0x133dd828131 Received 
unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/tokenauth/keymaster
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:225)
at 
org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:85)
at 
org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
at 
org.apache

[jira] [Commented] (HBASE-4875) ZKLeaderManager.handleLeaderChange() doesn't handle KeeperException$SessionExpiredException

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157376#comment-13157376
 ] 

Ted Yu commented on HBASE-4875:
---

I think we can follow the example from Master dealing with 
SessionExpiredException:
{code}
if (t != null && t instanceof KeeperException.SessionExpiredException) {
  try {
    LOG.info("Primary Master trying to recover from ZooKeeper session " +
        "expiry.");
    return !tryRecoveringExpiredZKSession();
{code}

> ZKLeaderManager.handleLeaderChange() doesn't handle 
> KeeperException$SessionExpiredException
> ---
>
> Key: HBASE-4875
> URL: https://issues.apache.org/jira/browse/HBASE-4875
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>
> TestMasterFailover#testSimpleMasterFailover has failed twice in a row for 
> builds 15 and 16.
> From 
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92-security/16/testReport/org.apache.hadoop.hbase.master/TestMasterFailover/testSimpleMasterFailover/:
> {code}
> 2011-11-26 01:34:49,217 DEBUG 
> [RegionServer:0;hemera.apache.org,57516,1322271278190-EventThread] 
> zookeeper.ZooKeeperWatcher(257): regionserver:57516-0x133dd828133 
> Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, 
> path=/hbase/tokenauth/keymaster
> 2011-11-26 01:34:49,217 WARN  [Thread-1-EventThread] zookeeper.ZKUtil(234): 
> master:52934-0x133dd828131 Unable to set watcher on znode 
> /hbase/tokenauth/keymaster
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase/tokenauth/keymaster
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003)
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:225)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:85)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:281)
>   at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
> 2011-11-26 01:34:49,218 ERROR [Thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(403): master:52934-0x133dd828131 Received 
> unexpected KeeperException, re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase/tokenauth/keymaster
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1003)
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:225)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:85)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:281)
>   at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
> 2011-11-26 01:34:49,216 DEBUG 
> [RegionServer:2;hemera.apache.org,44702,1322271278232-EventThread] 
> zookeeper.ZKUtil(230): hconnection-0x133dd828139 /hbase/master does not 
> exist. Watcher is set.
> 2011-11-26 01:34:49,215 DEBUG [Thread-1-EventThread] zookeeper.ZKUtil(230): 
> master:44883-0x133dd828132 /hbase/master does not exist. Watcher is set.
> 2011-11-26 01:34:49,219 DEBUG [Thread-1-EventThread] 
> master.ActiveMasterManager(104): No master available. Notifying waiting 
> threads
> 2011-11-26 01:34:49,215 INFO  
> [Master:1;hemera.apache.org,52934,1322271278115] master.HMaster(338): HMaster 
> main thread exiting
> {code}





[jira] [Commented] (HBASE-4869) Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with edits we know older than what region currently has.

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157377#comment-13157377
 ] 

Ted Yu commented on HBASE-4869:
---

bq. except testgetHDFSBlocksDistribution doesn't complete
Can you give us more details?

Thanks

> Backport to 0.92: HBASE-4797 [availability] Skip recovered.edits files with 
> edits we know older than what region currently has.
> ---
>
> Key: HBASE-4869
> URL: https://issues.apache.org/jira/browse/HBASE-4869
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Affects Versions: 0.90.2
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 
> 0001-HBASE-4869-Backport-to-0.92-HBASE-4797-availability-.patch
>
>






[jira] [Issue Comment Edited] (HBASE-4874) TestHCM#testClosing fails if not enough entropy is available

2011-11-25 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157378#comment-13157378
 ] 

Lars Hofhansl edited comment on HBASE-4874 at 11/26/11 5:53 AM:


This happens at least on Linux when SecureRandom is used.
SecureRandom uses /dev/random in Linux, which will block if not enough entropy 
has been gathered by system events (disk, network, keyboard, mouse movements, 
etc).

It's *not* only testClosing. Every time a test cluster is started it might 
hang, if the system happens to run out of entropy bits to generate secure 
random numbers. I've seen the same with TestFromClient and other tests that 
start an actual cluster for the test.

The fix I found was to add "-Djava.security.egd=file:/dev/./urandom" (note the 
extra ./ in path is needed for JDK 1.5 or newer) to the command line.


  was (Author: lhofhansl):
This happens at least on Linux when SecureRandom is used.
SecureRandom used /dev/random in Linux, which will block if not enough entropy 
has been gathered by system events (disk, network, keyboard, mouse movements, 
etc).

It's *not* only testClosing. Every time a test cluster is started it might 
hang, if the system happens to run out of entropy its to generated secure 
random numbers. I've seen the same with TestFromClient and other tests that 
start an actual cluster for the test.

The fix I found was to add "-Djava.security.egd=file:/dev/./urandom" (note the 
extra ./ in path is needed for JDK 1.5 or newer) to the command line.

  
> TestHCM#testClosing fails if not enough entropy is available
> 
>
> Key: HBASE-4874
> URL: https://issues.apache.org/jira/browse/HBASE-4874
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>
> TestHCM#testClosing fails if not enough entropy is available





[jira] [Commented] (HBASE-4874) TestHCM#testClosing fails if not enough entropy is available

2011-11-25 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157378#comment-13157378
 ] 

Lars Hofhansl commented on HBASE-4874:
--

This happens at least on Linux when SecureRandom is used.
SecureRandom uses /dev/random in Linux, which will block if not enough entropy 
has been gathered by system events (disk, network, keyboard, mouse movements, 
etc).

It's *not* only testClosing. Every time a test cluster is started it might 
hang, if the system happens to run out of entropy bits to generate secure 
random numbers. I've seen the same with TestFromClient and other tests that 
start an actual cluster for the test.

The fix I found was to add "-Djava.security.egd=file:/dev/./urandom" (note the 
extra ./ in path is needed for JDK 1.5 or newer) to the command line.
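
To check which entropy source a test JVM was actually started with, a small probe such as the following can help. It is a sketch, not part of HBase: the class name EntropySourceCheck and its helpers are hypothetical, and it relies only on the standard java.security.egd system property and java.security.SecureRandom.

```java
import java.security.SecureRandom;

// Hypothetical probe: reports the configured entropy source and draws a few random bytes.
class EntropySourceCheck {

    // Returns the value of java.security.egd, or a note when the property is unset.
    static String configuredEgd() {
        String egd = System.getProperty("java.security.egd");
        return egd == null ? "(not set; JVM default applies)" : egd;
    }

    // Draws n bytes from a default SecureRandom instance.
    static byte[] randomBytes(int n) {
        byte[] buf = new byte[n];
        new SecureRandom().nextBytes(buf);
        return buf;
    }

    public static void main(String[] args) {
        System.out.println("java.security.egd = " + configuredEgd());
        System.out.println("generated " + randomBytes(16).length + " random bytes");
    }
}
```

Run it with and without -Djava.security.egd=file:/dev/./urandom on the command line to confirm that the flag reaches the JVM.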


> TestHCM#testClosing fails if not enough entropy is available
> 
>
> Key: HBASE-4874
> URL: https://issues.apache.org/jira/browse/HBASE-4874
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>
> TestHCM#testClosing fails if not enough entropy is available





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157379#comment-13157379
 ] 

Ted Yu commented on HBASE-4862:
---

Chunhui ran the patch through the test suite.

The OS is:
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
{code}
Results :
Failed tests:   
testResetZooKeeperSession(org.apache.hadoop.hbase.replication.TestReplicationPeer):
 ReplicationPeer ZooKeeper session
was not properly expired.
  testClosing(org.apache.hadoop.hbase.client.TestHCM)
Tests run: 1173, Failures: 2, Errors: 0, Skipped: 8
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 2:02:44.930s
{code}
testClosing failure is captured in HBASE-4874.
TestReplicationPeer passed when run manually.

> Splitting hlog and opening region concurrently may cause data loss
> --
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, 
> hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for 
> trunk.diff
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A at the same time, and in 
> replayRecoveredEditsIfAny() it deletes the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IOException and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the 
> filesystem is OK, so it only prints an error log and continues assigning 
> regions. Therefore, data in other log files will also be lost!
> The case may happen as follows:
> 1. Move a region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that is still being appended to by the split hlog thread.
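
The race described above can be illustrated with a minimal sketch. It assumes one common way to avoid it: the writer works under a temporary name and renames the file when it is finished, and the reader ignores anything that still carries the temporary suffix. The class name, the ".temp" suffix, and the use of java.nio on a local directory are all assumptions for illustration; this is not HBase code and not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class RecoveredEditsSketch {
    // Hypothetical suffix marking a file that the splitter is still writing.
    static final String TMP_SUFFIX = ".temp";

    /** Splitter side: write under a temporary name, then publish by renaming. */
    static Path writeEdits(Path dir, String name, List<String> entries) throws IOException {
        Path tmp = dir.resolve(name + TMP_SUFFIX);
        Files.write(tmp, entries, StandardCharsets.UTF_8);
        // The final name only appears once the writer is done with the file.
        return Files.move(tmp, dir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    /** Region-open side: only completed files are candidates for replay and deletion. */
    static List<Path> filesSafeToReplay(Path dir) throws IOException {
        List<Path> done = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (!p.getFileName().toString().endsWith(TMP_SUFFIX)) {
                    done.add(p);
                }
            }
        }
        Collections.sort(done);
        return done;
    }
}
```

With this layout the region opener never sees, and so never deletes, a file that the splitter is still appending to.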

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-4874) TestHCM#testClosing fails if not enough entropy is available

2011-11-25 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157378#comment-13157378
 ] 

Lars Hofhansl edited comment on HBASE-4874 at 11/26/11 5:54 AM:


This happens at least on Linux when SecureRandom is used.
SecureRandom uses /dev/random on Linux, which will block if not enough entropy 
has been gathered from system events (disk, network, keyboard, mouse movements, 
etc.).

It's *not* only testClosing. Every time a test cluster is started it might 
hang, if the system happens to run out of entropy bits to generate secure 
random numbers. I've seen the same with TestFromClient and other tests that 
start an actual cluster for the test.

The fix I found was to add "-Djava.security.egd=file:/dev/./urandom" to the 
command line (note that the extra ./ in the path is needed for JDK 1.5 or 
newer). /dev/urandom never blocks, but if not enough entropy has been gathered, 
the random numbers generated won't be truly random (which is fine for the tests).
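
To check which entropy source a JVM was actually started with, something like the following sketch may help. The class and method names are hypothetical; it only reads the java.security.egd system property and the securerandom.source security property, then times one SecureRandom call. Whether that call blocks depends on the JDK version and the configured provider, so no particular timing is implied.

```java
import java.security.SecureRandom;
import java.security.Security;

class EntropySourceCheck {
    /** Returns the -Djava.security.egd override if set, else the configured default. */
    static String effectiveSource() {
        String override = System.getProperty("java.security.egd");
        if (override != null && !override.isEmpty()) {
            return override;
        }
        String configured = Security.getProperty("securerandom.source");
        return configured != null ? configured : "(not set)";
    }

    public static void main(String[] args) {
        System.out.println("Entropy source: " + effectiveSource());
        long start = System.nanoTime();
        // May stall on an entropy-starved machine, depending on the provider.
        new SecureRandom().nextBytes(new byte[16]);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("First SecureRandom call took " + millis + " ms");
    }
}
```

Running it once with and once without -Djava.security.egd=file:/dev/./urandom shows whether the override reaches the JVM.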

> TestHCM#testClosing fails if not enough entropy is available
> 
>
> Key: HBASE-4874
> URL: https://issues.apache.org/jira/browse/HBASE-4874
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: Ted Yu
>
> TestHCM#testClosing fails if not enough entropy is available





[jira] [Updated] (HBASE-4874) TestHCM#testClosing fails if not enough entropy is available

2011-11-25 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4874:
-

Attachment: 4874.txt

Mini patch... not sure how to make this Linux-specific.
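
One possible way to restrict the setting to Linux is to decide from the os.name system property. The sketch below is only an illustration of that idea in plain Java; the class and method names are hypothetical, and the attached patch may handle this differently (for example in the build configuration).

```java
import java.util.Locale;

class EgdArg {
    /** Returns the urandom override for Linux hosts and an empty string otherwise. */
    static String egdArgFor(String osName) {
        boolean linux = osName != null
                && osName.toLowerCase(Locale.ROOT).contains("linux");
        return linux ? "-Djava.security.egd=file:/dev/./urandom" : "";
    }

    public static void main(String[] args) {
        // Prints the extra JVM argument for the current platform, if any.
        System.out.println(egdArgFor(System.getProperty("os.name")));
    }
}
```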

> TestHCM#testClosing fails if not enough entropy is available
> 
>
> Key: HBASE-4874
> URL: https://issues.apache.org/jira/browse/HBASE-4874
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.0
> Reporter: Ted Yu
> Attachments: 4874.txt
>
>
> TestHCM#testClosing fails if not enough entropy is available





[jira] [Commented] (HBASE-4874) TestHCM#testClosing fails if not enough entropy is available

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157382#comment-13157382
 ] 

Ted Yu commented on HBASE-4874:
---

+1 on patch.

SecureRandom isn't used in classes under src/main - Whew.





[jira] [Commented] (HBASE-4874) TestHCM#testClosing fails if not enough entropy is available

2011-11-25 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157383#comment-13157383
 ] 

Lars Hofhansl commented on HBASE-4874:
--

But it's used by UUID.randomUUID() which is used in main classes.
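
As a small illustration: the JDK documentation describes UUID.randomUUID() as returning a type 4 UUID generated with a cryptographically strong pseudo random number generator, which is why it can be affected by the same entropy shortage. The class name below is hypothetical.

```java
import java.util.UUID;

class UuidCheck {
    /** Returns the version field of a freshly generated random UUID. */
    static int randomUuidVersion() {
        return UUID.randomUUID().version();
    }

    public static void main(String[] args) {
        System.out.println("UUID version: " + randomUuidVersion());
    }
}
```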





[jira] [Commented] (HBASE-4874) TestHCM#testClosing fails if not enough entropy is available

2011-11-25 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157384#comment-13157384
 ] 

Lars Hofhansl commented on HBASE-4874:
--

On server-type setups we should be OK, as entropy is also gathered from network 
and disk activity.





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157385#comment-13157385
 ] 

Ted Yu commented on HBASE-4862:
---

@Todd:
Do you need more details from Chunhui?

Thanks





[jira] [Commented] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-11-25 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157388#comment-13157388
 ] 

Ted Yu commented on HBASE-4862:
---

{code}
+if (fileName.endsWith(HLog.RECOVERED_LOG_TMPFILE_SUFFIX))
+  fileName = fileName.split(HLog.RECOVERED_LOG_TMPFILE_SUFFIX)[0];
{code}
Please enclose the second line above in curly braces.
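
A braced version could look like the sketch below. Note that it swaps in substring() for split(): String.split() treats its argument as a regular expression, so a suffix containing "." would match more than intended. The ".temp" value is an assumption for illustration, and the class name is hypothetical; the real constant lives in HLog.

```java
class TmpSuffix {
    // Assumed value for illustration only.
    static final String RECOVERED_LOG_TMPFILE_SUFFIX = ".temp";

    /** Strips one trailing temporary suffix, if present. */
    static String stripTmpSuffix(String fileName) {
        if (fileName.endsWith(RECOVERED_LOG_TMPFILE_SUFFIX)) {
            fileName = fileName.substring(
                    0, fileName.length() - RECOVERED_LOG_TMPFILE_SUFFIX.length());
        }
        return fileName;
    }
}
```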

Regarding the fs.rename() call, here is the javadoc from ClientProtocol.rename 
(which is called by fs.rename):
{code}
   * @return true if successful, or false if the old name does not exist
   * or if the new name already belongs to the namespace.
{code}
We should check the return value in addition to catching the exception.
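
A minimal sketch of that pattern follows. The Renamer interface is a hypothetical stand-in for a filesystem API whose rename returns a boolean and may also throw IOException; it is not the Hadoop class itself, and the helper name is made up for illustration.

```java
import java.io.IOException;

class RenameCheck {
    /** Hypothetical stand-in for a boolean-returning rename call. */
    interface Renamer {
        boolean rename(String src, String dst) throws IOException;
    }

    /** Treats a false return value the same way as a thrown exception. */
    static void renameOrFail(Renamer fs, String src, String dst) throws IOException {
        if (!fs.rename(src, dst)) {
            throw new IOException("Failed to rename " + src + " to " + dst);
        }
    }
}
```

Callers then have a single failure path to handle, whether the rename threw or simply returned false.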





[jira] [Updated] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread gaojinchao (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4868:
--

Attachment: HBASE-4868_trunkv2.patch

Addressed the review comments.


> testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails
> -
>
> Key: HBASE-4868
> URL: https://issues.apache.org/jira/browse/HBASE-4868
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 0.92.0
> Reporter: gaojinchao
> Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4868_trial.patch, HBASE-4868_trunkv2.patch
>
>
> See: 
> https://builds.apache.org/job/HBase-TRUNK-security/7/testReport/org.apache.hadoop.hbase.util.hbck/TestOfflineMetaRebuildBase/testMetaRebuild/
> Please review and see whether the approach makes sense. 
> If it does, I will check the other cases.





[jira] [Updated] (HBASE-4868) testMetaRebuild#TestOfflineMetaRebuildBase occasionally fails

2011-11-25 Thread gaojinchao (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-4868:
--

Status: Open  (was: Patch Available)




