[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141979#comment-13141979
 ] 

nkeywal commented on HBASE-4724:


I believe that this log:
{noformat}
org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, server = 29)
[...]
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1150)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1145)
at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:1821)
at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:522)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:468)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)
at java.lang.Thread.run(Thread.java:662)
{noformat}

could explain why we have this in the final thread dump:

{noformat}
Thread 149 (Master:0;localhost,36968,1320216715828):
  State: WAITING
  Blocked count: 148
  Waited count: 148
  Waiting on org.apache.hadoop.hbase.zookeeper.RootRegionTracker@1d4f0fb4
  Stack:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:131)
org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:104)
org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRoot(CatalogTracker.java:277)
org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:523)
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:468)
org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:309)
java.lang.Thread.run(Thread.java:662)
{noformat}
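
The thread dump above shows the master in an unbounded Object.wait() under blockUntilAvailable, so it never returns if the root location is never published. As a rough illustration only (this is not HBase code; BoundedWaitSketch, NodeTracker and setData are hypothetical names invented for this sketch), a wait with a deadline gives up and returns null instead of hanging:

```java
import java.util.concurrent.TimeUnit;

public class BoundedWaitSketch {

    /** Hypothetical stand-in for a tracker whose data may never arrive. */
    public static final class NodeTracker {
        private byte[] data; // guarded by "this"

        /** Publishes the data and wakes up every waiter. */
        public synchronized void setData(byte[] newData) {
            data = newData;
            notifyAll();
        }

        /**
         * Waits until data is available or timeoutMs has elapsed.
         * Returns null on timeout or interruption instead of blocking forever.
         */
        public synchronized byte[] blockUntilAvailable(long timeoutMs) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (data == null) {
                long remainingNs = deadline - System.nanoTime();
                if (remainingNs <= 0) {
                    return null; // deadline passed, give up
                }
                // Never call wait(0): a zero timeout means "wait forever".
                long remainingMs = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNs));
                try {
                    wait(remainingMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return null;
                }
            }
            return data;
        }
    }

    public static void main(String[] args) {
        NodeTracker tracker = new NodeTracker();
        // Nothing ever publishes the data here, so the call times out.
        byte[] location = tracker.blockUntilAvailable(200);
        System.out.println(location == null ? "timed out" : "available");
    }
}
```

With a bounded wait like this the caller can decide to retry the assignment or abort, rather than staying in WAITING until the test harness is killed.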

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Priority: Critical
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> It's in:
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and maybe why it does not 
> end at all. Adding a maximum retry count to Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the hang is my modification to the WAL, 
> with some threads surprised to find updates that were not written to the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING

[jira] [Commented] (HBASE-4577) Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB

2011-11-02 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142008#comment-13142008
 ] 

gaojinchao commented on HBASE-4577:
---

The test failed, but the failure does not seem to be caused by the patch.

> Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB
> -
>
> Key: HBASE-4577
> URL: https://issues.apache.org/jira/browse/HBASE-4577
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: gaojinchao
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: HBASE-4577_trial_Trunk.patch, HBASE-4577_trunk.patch
>
>
> Minor issue while looking at the RS metrics:
> bq. numberOfStorefiles=8, storefileUncompressedSizeMB=2418, 
> storefileSizeMB=2420, compressionRatio=1.0008
> I guess there's a truncation somewhere when it's adding the numbers up.
> FWIW there's no compression on that table.
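
One way such a truncation could arise, shown purely as an illustration (this is not the HBase metrics code; TruncationSketch and its method names are made up for the example): if one metric converts each store file to whole MB before summing and the other sums bytes first and converts once, the two totals can differ by up to one MB per file. The reported gap of 2 MB across 8 store files would be compatible with that kind of effect, though the description does not confirm it.

```java
public class TruncationSketch {
    static final long MB = 1024L * 1024L;

    /** Converts each file to whole MB first, then sums: loses up to 1 MB per file. */
    static long sumOfTruncatedMb(long[] fileSizesBytes) {
        long totalMb = 0;
        for (long bytes : fileSizesBytes) {
            totalMb += bytes / MB; // integer division truncates toward zero
        }
        return totalMb;
    }

    /** Sums bytes first, then converts once: loses less than 1 MB overall. */
    static long truncatedMbOfSum(long[] fileSizesBytes) {
        long totalBytes = 0;
        for (long bytes : fileSizesBytes) {
            totalBytes += bytes;
        }
        return totalBytes / MB;
    }

    public static void main(String[] args) {
        // Eight hypothetical files of 1.5 MB each.
        long[] sizes = new long[8];
        java.util.Arrays.fill(sizes, 3 * MB / 2);

        System.out.println("per-file truncation: " + sumOfTruncatedMb(sizes)); // 8
        System.out.println("single truncation:   " + truncatedMbOfSum(sizes)); // 12
    }
}
```

Keeping the running totals in bytes and converting to MB only when reporting avoids the discrepancy.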

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4577) Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142007#comment-13142007
 ] 

Ted Yu commented on HBASE-4577:
---

I don't see 'Too many open files' for 
https://builds.apache.org/job/PreCommit-HBASE-Build/133//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoadWithSplit/






[jira] [Updated] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Lucian George Iordache (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucian George Iordache updated HBASE-4713:
--

Attachment: HBASE-4713-patch.txt

> Raise debug level to warn on ExecutionException in 
> HConnectionManager$HConnectionImplementation
> ---
>
> Key: HBASE-4713
> URL: https://issues.apache.org/jira/browse/HBASE-4713
> Project: HBase
>  Issue Type: Improvement
>Reporter: Lucian George Iordache
> Attachments: HBASE-4713-patch.txt
>
>
> The ExecutionException is logged at debug level, but it should be logged at 
> warn. I ran into the problem in the following case:
> - hbase.rpc.timeout = 6
> - lease time on region server = 24
> - started a scan that takes more than 60 seconds on the region server ==> 
> SocketTimeoutException logged on debug
> With the log level at info, the exception was not observable on the client 
> side and it took me a while to figure out what was happening.
> See also:
> - https://issues.apache.org/jira/browse/HBASE-3154
> - 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201110.mbox/%3CCANH3+J0athaCjK-ahu-A=hrzoosjyh6s_mtpzm3_qqpfrcs...@mail.gmail.com%3E
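
A minimal sketch of the requested behavior, under stated assumptions: it uses java.util.logging rather than the logging library HBase itself uses, and WarnOnExecutionExceptionSketch and getOrWarn are hypothetical names, not part of HConnectionManager. The point is only that a failure logged at WARNING stays visible when the logger is configured at INFO, whereas the same failure logged at a debug-like level (FINE) would be dropped.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;

public class WarnOnExecutionExceptionSketch {
    static final Logger LOG = Logger.getLogger("WarnOnExecutionExceptionSketch");

    /** Returns the task result, or null after logging the failure at WARNING. */
    static <T> T getOrWarn(Future<T> future) {
        try {
            return future.get();
        } catch (ExecutionException e) {
            // WARNING is above the default INFO threshold, so this is visible
            // without enabling debug-level (FINE) logging on the client.
            LOG.log(Level.WARNING, "Task failed", e.getCause());
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Simulates a scan whose RPC times out on the client side.
            Callable<String> failing = () -> {
                throw new SocketTimeoutException("simulated timeout");
            };
            Future<String> pending = pool.submit(failing);
            System.out.println("result = " + getOrWarn(pending));
        } finally {
            pool.shutdown();
        }
    }
}
```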





[jira] [Updated] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4724:
---

Attachment: 2002_4724_TestAdmin.patch

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Priority: Critical
> Attachments: 2002_4724_TestAdmin.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> It's in:
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and maybe why it does not 
> end at all. Adding a maximum retry count to Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the hang is my modification to the WAL, 
> with some threads surprised to find updates that were not written to the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,39664,1320187706355):
>   State: WAITING
>   Blocked count: 217
>   Waited count: 174
>   Waiting on org.apache.hadoop.hbase.zookeeper.RootRegionTracker@6621477c
>   Stack:
> java.lang.Object.wait(Native Method)
> java.lang.Object.wait(Object.java:485)
> 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:131)
> 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:104)
> 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRoot(CatalogTrack

[jira] [Updated] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Lucian George Iordache (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucian George Iordache updated HBASE-4713:
--

Affects Version/s: 0.90.4
   Status: Patch Available  (was: Open)






[jira] [Updated] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4724:
---

Assignee: nkeywal
  Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-4577) Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB

2011-11-02 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142012#comment-13142012
 ] 

gaojinchao commented on HBASE-4577:
---

My local test result:

Running org.apache.hadoop.hbase.TestMultiVersions
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.045 sec

Results :

Failed tests:   testHBaseFsck(org.apache.hadoop.hbase.util.TestHBaseFsck): expected:<0> but was:<1>

Tests in error:
  testMasterFailoverWithMockedRITOnDeadRS(org.apache.hadoop.hbase.master.TestMasterFailover): test timed out after 18 milliseconds
  testEnableTableRoundRobinAssignment(org.apache.hadoop.hbase.client.TestAdmin): org.apache.hadoop.hbase.TableNotEnabledException: testEnableTableAssignment
  testBadOriginalRootLocation(org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster): unknown host: example.org

Tests run: 1073, Failures: 1, Errors: 3, Skipped: 9








[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142011#comment-13142011
 ] 

nkeywal commented on HBASE-4724:


I don't see the same behavior in the global build. I am going to try 
removing the modifications linked to the WAL in admin.


[jira] [Commented] (HBASE-4577) Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142015#comment-13142015
 ] 

Ted Yu commented on HBASE-4577:
---

@Jinchao:
Please start with test output of 
TestHFileOutputFormat#testMRIncrementalLoadWithSplit and see why the test 
failed.






[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142013#comment-13142013
 ] 

Hadoop QA commented on HBASE-4724:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12501929/2002_4724_TestAdmin.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/134//console

This message is automatically generated.

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Attachments: 2002_4724_TestAdmin.patch
>
>
> fom the logs in my env
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the logs finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and it may be why it does 
> not end at all. Adding a maximum retry count to Threads#threadDumpingIsAlive 
> could help.
> I also wonder whether the root cause of the hang is my modification to the 
> WAL, with some threads surprised to find updates that were not written to the 
> WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,3
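
The description above suggests adding a maximum retry to Threads#threadDumpingIsAlive so that a master thread that never exits cannot block test teardown forever. A minimal sketch of that idea follows. It is not the HBase implementation: the class name `BoundedThreadJoin`, the method `joinWithDumps`, and its parameters are hypothetical, and the dump is produced with the standard `ThreadMXBean` rather than Hadoop's ReflectionUtils.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper (not org.apache.hadoop.hbase.util.Threads). */
final class BoundedThreadJoin {
  private BoundedThreadJoin() {}

  /**
   * Waits for {@code t} to terminate, printing a thread dump after each
   * unsuccessful wait, and gives up after {@code maxAttempts} waits.
   *
   * @return true if the thread is no longer alive, false if we gave up
   */
  static boolean joinWithDumps(Thread t, int maxAttempts, long intervalMs) {
    if (intervalMs <= 0) {
      // join(0) would wait forever, which is exactly what we want to avoid.
      throw new IllegalArgumentException("intervalMs must be positive");
    }
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        t.join(intervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and stop waiting.
        Thread.currentThread().interrupt();
        return !t.isAlive();
      }
      if (!t.isAlive()) {
        return true;
      }
      System.err.println("Attempt " + attempt + "/" + maxAttempts
          + ": still waiting on " + t.getName());
      for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
        System.err.print(info);
      }
    }
    return false; // bounded: the caller decides how to fail the test
  }
}
```

A teardown could call `joinWithDumps(masterThread, 5, 60000)` and fail the test when it returns false, instead of looping on the dump indefinitely.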

[jira] [Commented] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142016#comment-13142016
 ] 

Hadoop QA commented on HBASE-4713:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12501928/HBASE-4713-patch.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/135//console

This message is automatically generated.

> Raise debug level to warn on ExecutionException in 
> HConnectionManager$HConnectionImplementation
> ---
>
> Key: HBASE-4713
> URL: https://issues.apache.org/jira/browse/HBASE-4713
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Lucian George Iordache
> Attachments: HBASE-4713-patch.txt
>
>
> The ExecutionException is logged at debug level, and it should be logged at 
> warn. I hit the problem in the following case:
> - hbase.rpc.timeout = 6
> - lease time on region server = 24
> - started a scan that takes more than 60 seconds on the region server ==> 
> SocketTimeoutException logged at debug
> With the log level at info, the exception was not observable on the client 
> side and it took me a while to figure out what was happening.
> See also:
> - https://issues.apache.org/jira/browse/HBASE-3154
> - 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201110.mbox/%3CCANH3+J0athaCjK-ahu-A=hrzoosjyh6s_mtpzm3_qqpfrcs...@mail.gmail.com%3E
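
The request in this issue can be illustrated with a small sketch: when a task's failure surfaces as an ExecutionException, logging it at warn keeps it visible to a client whose log level is info, whereas a debug message is filtered out. The code below is only an illustration under assumptions. It uses `java.util.logging` (where WARNING plays the role of warn and FINE the role of debug) to stay self-contained, while HBase uses its own logging facade, and `WarnOnFailedTask` and `getOrNull` are hypothetical names.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example; not HConnectionManager code. */
final class WarnOnFailedTask {
  static final Logger LOG = Logger.getLogger("WarnOnFailedTask");

  private WarnOnFailedTask() {}

  /** Returns the task result, or null if the task failed or we were interrupted. */
  static <T> T getOrNull(Future<T> future) {
    try {
      return future.get();
    } catch (ExecutionException e) {
      // WARNING passes the default INFO threshold; FINE (debug) would not.
      LOG.log(Level.WARNING, "Task failed: " + e.getCause(), e);
      return null;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }
}
```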

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4724:
---

Status: Open  (was: Patch Available)

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Attachments: 2002_4724_TestAdmin.patch, 
> 2002_4724_TestAdmin.v2.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and it may be why it does 
> not end at all. Adding a maximum retry count to Threads#threadDumpingIsAlive 
> could help.
> I also wonder whether the root cause of the hang is my modification to the 
> WAL, with some threads surprised to find updates that were not written to the 
> WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,39664,1320187706355):
>   State: WAITING
>   Blocked count: 217
>   Waited count: 174
>   Waiting on org.apache.hadoop.hbase.zookeeper.RootRegionTracker@6621477c
>   Stack:
> java.lang.Object.wait(Native Method)
> java.lang.Object.wait(Object.java:485)
> 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:131)
> 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:104)
> 
> org.apac

[jira] [Updated] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4724:
---

Status: Patch Available  (was: Open)


[jira] [Updated] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4724:
---

Attachment: 2002_4724_TestAdmin.v2.patch


[jira] [Commented] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142022#comment-13142022
 ] 

Ted Yu commented on HBASE-4713:
---

@Lucian:
Your attachment is a diff file, which HadoopQA does not recognize.
Can you generate a patch?

See http://wiki.apache.org/hadoop/Hbase/HowToContribute
Once you have checked out the source code and made the above modification, you 
can use 'svn diff > 4713.patch' to produce the patch.





[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142026#comment-13142026
 ] 

Ted Yu commented on HBASE-4716:
---

closeBulkRegionOperation() is at the beginning of the finally block.

For the alternate code path, we only take one lock. The lock would be released 
in closeBulkRegionOperation() accordingly.

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take the write lock.
> A read lock would suffice in this scenario.
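
The locking pattern discussed in this issue can be sketched as follows: take the read lock when a single column family is loaded, take the write lock otherwise, and release whichever one is held at the start of the finally block. This is a hedged illustration, not the HRegion code. The class `BulkLoadLockSketch`, the `familyCount > 1` condition, and the inspection helpers are assumptions made for the example; only the method names startBulkRegionOperation/closeBulkRegionOperation echo the discussion.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a region guarding bulk loads with a read/write lock. */
final class BulkLoadLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  /** Takes exactly one lock: write if requested, read otherwise. */
  void startBulkRegionOperation(boolean writeLockNeeded) {
    if (writeLockNeeded) {
      lock.writeLock().lock();
    } else {
      lock.readLock().lock();
    }
  }

  /** Releases whichever lock the current thread took in startBulkRegionOperation. */
  void closeBulkRegionOperation() {
    if (lock.isWriteLockedByCurrentThread()) {
      lock.writeLock().unlock();
    } else {
      lock.readLock().unlock();
    }
  }

  /** Assumption: only a multi-family load needs the write lock. */
  void bulkLoad(int familyCount, Runnable load) {
    startBulkRegionOperation(familyCount > 1);
    try {
      load.run();
    } finally {
      // First statement of the finally block, so the lock is always released.
      closeBulkRegionOperation();
    }
  }

  // Inspection helpers for the example only.
  int readHolds() { return lock.getReadLockCount(); }
  boolean writeHeld() { return lock.isWriteLocked(); }
}
```

With a read lock, several single-family loads (and other readers) can proceed concurrently, which is the improvement the issue title refers to.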





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-11-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142033#comment-13142033
 ] 

Hudson commented on HBASE-1744:
---

Integrated in HBase-TRUNK #2399 (See 
[https://builds.apache.org/job/HBase-TRUNK/2399/])
HBASE-1744  Thrift server to match the new java api (Tim Sell)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/bin/hbase
* /hbase/trunk/src/examples/thrift2
* /hbase/trunk/src/examples/thrift2/DemoClient.java
* /hbase/trunk/src/examples/thrift2/DemoClient.py
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/ThriftUtilities.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumn.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnIncrement.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TColumnValue.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TDelete.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TDeleteType.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TGet.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/THBaseService.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TIOError.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TIllegalArgument.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TIncrement.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TPut.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TResult.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TScan.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/generated/TTimeRange.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/thrift2/package.html
* /hbase/trunk/src/main/resources/org/apache/hadoop/hbase/thrift2
* /hbase/trunk/src/main/resources/org/apache/hadoop/hbase/thrift2/hbase.thrift
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/thrift2
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java


> Thrift server to match the new java api.
> 
>
> Key: HBASE-1744
> URL: https://issues.apache.org/jira/browse/HBASE-1744
> Project: HBase
>  Issue Type: Improvement
>  Components: thrift
>Reporter: Tim Sell
>Assignee: Tim Sell
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 
> 0001-thrift2-enable-usage-of-.deleteColumns-for-thrift.patch, 1744-trunk.10, 
> HBASE-1744.11.patch, HBASE-1744.2.patch, HBASE-1744.3.patch, 
> HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.6.patch, 
> HBASE-1744.7.patch, HBASE-1744.8.patch, HBASE-1744.9.patch, 
> HBASE-1744.preview.1.patch, thriftexperiment.patch
>
>
> This mutateRows, etc. is a little confusing compared to the new, cleaner Java 
> client.
> Thinking of ways to make a thrift client that is just as elegant. Something 
> like:
> void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
> with:
> struct TColumn {
>   1:Bytes family,
>   2:Bytes qualifier,
>   3:i64 timestamp
> }
> struct TPut {
>   1:Bytes row,
>   2:map<TColumn, Bytes> values
> }
> This creates a more verbose RPC than if the columns in TPut were just 
> map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
> still be intuitive from, say, Python.
> Presumably the goal of a thrift gateway is to be easy first.
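
To make the proposed shape concrete, here is a small Java model of the two structs. It is a sketch under assumptions, not the generated Thrift classes: the names `TColumnSketch` and `TPutSketch` are hypothetical, and it assumes the `values` field maps a column (family, qualifier, timestamp) to the cell bytes, since the type parameters are not legible in the archived text. It shows why keying on the whole column lets the timestamp travel with the cell.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of the proposed TColumn struct. */
final class TColumnSketch {
  final byte[] family;
  final byte[] qualifier;
  final long timestamp;

  TColumnSketch(String family, String qualifier, long timestamp) {
    this.family = family.getBytes(StandardCharsets.UTF_8);
    this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
    this.timestamp = timestamp;
  }

  // Content-based equality so the struct can be used as a map key.
  @Override public boolean equals(Object o) {
    if (!(o instanceof TColumnSketch)) return false;
    TColumnSketch c = (TColumnSketch) o;
    return timestamp == c.timestamp
        && Arrays.equals(family, c.family)
        && Arrays.equals(qualifier, c.qualifier);
  }

  @Override public int hashCode() {
    return 31 * (31 * Arrays.hashCode(family) + Arrays.hashCode(qualifier))
        + Long.hashCode(timestamp);
  }
}

/** Hypothetical model of the proposed TPut struct: one row, many column values. */
final class TPutSketch {
  final byte[] row;
  final Map<TColumnSketch, byte[]> values = new LinkedHashMap<>();

  TPutSketch(String row) {
    this.row = row.getBytes(StandardCharsets.UTF_8);
  }

  /** Adds or overwrites the value for (family, qualifier, timestamp). */
  TPutSketch add(String family, String qualifier, long timestamp, String value) {
    values.put(new TColumnSketch(family, qualifier, timestamp),
        value.getBytes(StandardCharsets.UTF_8));
    return this;
  }
}
```

For example, `new TPutSketch("row1").add("cf", "q", 1L, "a").add("cf", "q", 2L, "b")` holds two versions of the same cell, which a nested family-to-qualifier map has no natural slot for.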





[jira] [Commented] (HBASE-4722) TestGlobalMemStoreSize has started failing

2011-11-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142034#comment-13142034
 ] 

Hudson commented on HBASE-4722:
---

Integrated in HBase-TRUNK #2399 (See 
[https://builds.apache.org/job/HBase-TRUNK/2399/])
HBASE-4722 TestGlobalMemStoreSize has started failing; commit some extra 
logging to help debug whats going on up on jenkins

stack : 
Files : 
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestGlobalMemStoreSize.java


> TestGlobalMemStoreSize has started failing
> --
>
> Key: HBASE-4722
> URL: https://issues.apache.org/jira/browse/HBASE-4722
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Attachments: logging-v2.txt, logging.txt
>
>
> I'm digging in.  It fails occasionally for me locally too.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142035#comment-13142035
 ] 

Ted Yu commented on HBASE-4377:
---

Integrated to 0.90, 0.92 and TRUNK.

Thanks for the patch Jonathan.

> [hbck] Offline rebuild .META. from fs data only.
> 
>
> Key: HBASE-4377
> URL: https://issues.apache.org/jira/browse/HBASE-4377
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
> EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
> hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, 
> hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch
>
>
> In a worst case situation, it may be helpful to have an offline .META. 
> rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
> from scratch.  Users could move bad regions out until there is a clean 
> rebuild.  
> It would likely fill in region split holes.  Follow-on work could give 
> options to merge or select regions that overlap, or do online rebuilds.





[jira] [Updated] (HBASE-1744) Thrift server to match the new java api.

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-1744:
--

Attachment: 1744.addendum

Addendum that makes TestThriftHBaseServiceHandler immune to a hanging minicluster.

> Thrift server to match the new java api.
> 
>
> Key: HBASE-1744
> URL: https://issues.apache.org/jira/browse/HBASE-1744
> Project: HBase
>  Issue Type: Improvement
>  Components: thrift
>Reporter: Tim Sell
>Assignee: Tim Sell
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 
> 0001-thrift2-enable-usage-of-.deleteColumns-for-thrift.patch, 1744-trunk.10, 
> 1744.addendum, HBASE-1744.11.patch, HBASE-1744.2.patch, HBASE-1744.3.patch, 
> HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.6.patch, 
> HBASE-1744.7.patch, HBASE-1744.8.patch, HBASE-1744.9.patch, 
> HBASE-1744.preview.1.patch, thriftexperiment.patch
>
>
> This mutateRows, etc. is a little confusing compared to the new, cleaner Java 
> client.
> Thinking of ways to make a thrift client that is just as elegant. Something 
> like:
> void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
> with:
> struct TColumn {
>   1:Bytes family,
>   2:Bytes qualifier,
>   3:i64 timestamp
> }
> struct TPut {
>   1:Bytes row,
>   2:map<TColumn, Bytes> values
> }
> This creates a more verbose RPC than if the columns in TPut were just 
> map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
> still be intuitive from, say, Python.
> Presumably the goal of a thrift gateway is to be easy first.





[jira] [Updated] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Lucian George Iordache (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucian George Iordache updated HBASE-4713:
--

Status: Open  (was: Patch Available)

> Raise debug level to warn on ExecutionException in 
> HConnectionManager$HConnectionImplementation
> ---
>
> Key: HBASE-4713
> URL: https://issues.apache.org/jira/browse/HBASE-4713
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Lucian George Iordache
> Attachments: HBASE-4713-patch.txt
>
>
> The ExecutionException is logged at debug level, but it should be logged at 
> warn. I ran into the problem in the following case:
> - hbase.rpc.timeout = 6
> - lease time on region server = 24
> - started a scan that takes more than 60 seconds on the region server ==> 
> SocketTimeoutException logged at debug
> With the log level set to info, the exception was not visible on the client 
> side and it took me a while to figure out what was happening.
> See also:
> - https://issues.apache.org/jira/browse/HBASE-3154
> - 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201110.mbox/%3CCANH3+J0athaCjK-ahu-A=hrzoosjyh6s_mtpzm3_qqpfrcs...@mail.gmail.com%3E
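
The change requested above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: it uses java.util.logging rather than the logging API HBase uses, and the class and method names (WarnOnExecutionException, runAndWarnOnFailure) are hypothetical, not taken from HConnectionManager.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.logging.Level;
import java.util.logging.Logger;

final class WarnOnExecutionException {
    private static final Logger LOG =
        Logger.getLogger(WarnOnExecutionException.class.getName());

    /**
     * Runs the task on a worker thread. If the task fails, the cause is logged
     * at WARNING (visible with an info-level configuration) and true is returned.
     */
    static boolean runAndWarnOnFailure(Callable<?> task) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(task).get();
            return false;
        } catch (ExecutionException e) {
            // Logging at a debug-like level here would hide the failure
            // from anyone running with the log level set to info.
            LOG.log(Level.WARNING, "Call failed on worker thread", e.getCause());
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for task", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A task that throws a SocketTimeoutException, as in the scan case described above, would then show up in the client log without enabling debug.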





[jira] [Updated] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Lucian George Iordache (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucian George Iordache updated HBASE-4713:
--

Status: Patch Available  (was: Open)

Try 4713.patch

> Raise debug level to warn on ExecutionException in 
> HConnectionManager$HConnectionImplementation
> ---
>
> Key: HBASE-4713
> URL: https://issues.apache.org/jira/browse/HBASE-4713
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Lucian George Iordache
> Attachments: 4713.patch, HBASE-4713-patch.txt
>
>
> The ExecutionException is logged at debug level, but it should be logged at 
> warn. I ran into the problem in the following case:
> - hbase.rpc.timeout = 6
> - lease time on region server = 24
> - started a scan that takes more than 60 seconds on the region server ==> 
> SocketTimeoutException logged at debug
> With the log level set to info, the exception was not visible on the client 
> side and it took me a while to figure out what was happening.
> See also:
> - https://issues.apache.org/jira/browse/HBASE-3154
> - 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201110.mbox/%3CCANH3+J0athaCjK-ahu-A=hrzoosjyh6s_mtpzm3_qqpfrcs...@mail.gmail.com%3E





[jira] [Updated] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Lucian George Iordache (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucian George Iordache updated HBASE-4713:
--

Attachment: 4713.patch

> Raise debug level to warn on ExecutionException in 
> HConnectionManager$HConnectionImplementation
> ---
>
> Key: HBASE-4713
> URL: https://issues.apache.org/jira/browse/HBASE-4713
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Lucian George Iordache
> Attachments: 4713.patch, HBASE-4713-patch.txt
>
>
> The ExecutionException is logged at debug level, but it should be logged at 
> warn. I ran into the problem in the following case:
> - hbase.rpc.timeout = 6
> - lease time on region server = 24
> - started a scan that takes more than 60 seconds on the region server ==> 
> SocketTimeoutException logged at debug
> With the log level set to info, the exception was not visible on the client 
> side and it took me a while to figure out what was happening.
> See also:
> - https://issues.apache.org/jira/browse/HBASE-3154
> - 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201110.mbox/%3CCANH3+J0athaCjK-ahu-A=hrzoosjyh6s_mtpzm3_qqpfrcs...@mail.gmail.com%3E





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-11-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142071#comment-13142071
 ] 

Hudson commented on HBASE-4377:
---

Integrated in HBase-0.92 #98 (See 
[https://builds.apache.org/job/HBase-0.92/98/])
HBASE-4377  [hbck] Offline rebuild .META. from fs data only
   (Jonathan Hsieh)
HBASE-4377  [hbck] Offline rebuild .META. from fs data only
   (Jonathan Hsieh) (detail)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/hbck
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRebuildTestCore.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java


> [hbck] Offline rebuild .META. from fs data only.
> 
>
> Key: HBASE-4377
> URL: https://issues.apache.org/jira/browse/HBASE-4377
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Attachments: 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
> 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
> EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
> hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, 
> hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch
>
>
> In a worst case situation, it may be helpful to have an offline .META. 
> rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
> from scratch.  Users could move bad regions out until there is a clean 
> rebuild.  
> It would likely fill in region split holes.  Follow-on work could give 
> options to merge or select regions that overlap, or do online rebuilds.





[jira] [Updated] (HBASE-3601) TestMasterFailover broken in TRUNK

2011-11-02 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-3601:
--

Fix Version/s: 0.92.0

> TestMasterFailover broken in TRUNK
> --
>
> Key: HBASE-3601
> URL: https://issues.apache.org/jira/browse/HBASE-3601
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
>
> After HBASE-3573 went in, TestMasterFailover broke.  The change in shutdown 
> technique revealed an issue with our in-memory accounting when a master joins 
> an already running cluster; we don't add .META. and -ROOT- to our set of online 
> regions in the new master, which could make for some interesting issues as the 
> new master progressed (Previous shutdown did a count of remaining servers, 
> new shutdown process looks at in-memory state to see if only catalog carrying 
> regionservers online... this is what was going out of whack in new master).





[jira] [Updated] (HBASE-3605) Fix balancer log message

2011-11-02 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-3605:
--

Fix Version/s: 0.92.0

> Fix balancer log message
> 
>
> Key: HBASE-3605
> URL: https://issues.apache.org/jira/browse/HBASE-3605
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
> Fix For: 0.92.0
>
>
> From Gaojinchao on the user list:
> In the balanceCluster function, the last part of the message should be 
> " leastloaded=" + serversByLoad.firstKey().getLoad().getNumberOfRegions()
> {code}
> if (serversByLoad.lastKey().getLoad().getNumberOfRegions() <= max &&
>     serversByLoad.firstKey().getLoad().getNumberOfRegions() >= min) {
>   // Skipped because no server outside (min,max) range
>   LOG.info("Skipping load balancing.  servers=" + numServers + " " +
>       "regions=" + numRegions + " average=" + average + " " +
>       "mostloaded=" + serversByLoad.lastKey().getLoad().getNumberOfRegions() +
>       " leastloaded=" + serversByLoad.lastKey().getLoad().getNumberOfRegions());
>   return null;
> }
> {code}
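
For illustration, the fix being asked for amounts to reading the least-loaded value from the first key of the sorted map instead of the last. The sketch below is a simplified stand-in, not the HBase balancer: it assumes a TreeMap keyed directly by region count, and the class and method names are hypothetical.

```java
import java.util.TreeMap;

final class BalancerLogSketch {
    /**
     * Builds the "skipping" message from a map sorted by load (region count).
     * lastKey() is the most loaded entry, firstKey() the least loaded one.
     */
    static String skipMessage(TreeMap<Integer, String> serversByLoad) {
        int mostLoaded = serversByLoad.lastKey();
        int leastLoaded = serversByLoad.firstKey(); // the quoted code used lastKey() here
        return "Skipping load balancing.  mostloaded=" + mostLoaded
            + " leastloaded=" + leastLoaded;
    }
}
```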





[jira] [Commented] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142119#comment-13142119
 ] 

Hadoop QA commented on HBASE-4713:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12501940/4713.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -165 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 42 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.master.TestMasterFailover

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/138//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/138//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/138//console

This message is automatically generated.

> Raise debug level to warn on ExecutionException in 
> HConnectionManager$HConnectionImplementation
> ---
>
> Key: HBASE-4713
> URL: https://issues.apache.org/jira/browse/HBASE-4713
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Lucian George Iordache
> Attachments: 4713.patch, HBASE-4713-patch.txt
>
>
> The ExecutionException is logged at debug level, but it should be logged at 
> warn. I ran into the problem in the following case:
> - hbase.rpc.timeout = 6
> - lease time on region server = 24
> - started a scan that takes more than 60 seconds on the region server ==> 
> SocketTimeoutException logged at debug
> With the log level set to info, the exception was not visible on the client 
> side and it took me a while to figure out what was happening.
> See also:
> - https://issues.apache.org/jira/browse/HBASE-3154
> - 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201110.mbox/%3CCANH3+J0athaCjK-ahu-A=hrzoosjyh6s_mtpzm3_qqpfrcs...@mail.gmail.com%3E





[jira] [Updated] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4713:
-

   Resolution: Fixed
Fix Version/s: 0.92.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thank you for the patch, Lucian George Iordache.  I applied it to trunk and 
the 0.92 branch.

> Raise debug level to warn on ExecutionException in 
> HConnectionManager$HConnectionImplementation
> ---
>
> Key: HBASE-4713
> URL: https://issues.apache.org/jira/browse/HBASE-4713
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Lucian George Iordache
> Fix For: 0.92.0
>
> Attachments: 4713.patch, HBASE-4713-patch.txt
>
>
> The ExecutionException is logged at debug level, but it should be logged at 
> warn. I ran into the problem in the following case:
> - hbase.rpc.timeout = 6
> - lease time on region server = 24
> - started a scan that takes more than 60 seconds on the region server ==> 
> SocketTimeoutException logged at debug
> With the log level set to info, the exception was not visible on the client 
> side and it took me a while to figure out what was happening.
> See also:
> - https://issues.apache.org/jira/browse/HBASE-3154
> - 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201110.mbox/%3CCANH3+J0athaCjK-ahu-A=hrzoosjyh6s_mtpzm3_qqpfrcs...@mail.gmail.com%3E





[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142172#comment-13142172
 ] 

stack commented on HBASE-4716:
--

Thanks. +1 on commit.

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.
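
The practical difference between the two lock modes can be shown with the JDK's ReentrantReadWriteLock. This is a generic sketch, not the region locking code touched by the patch; the class and method names are hypothetical. While one thread holds the read lock, a second thread can still take the read lock but cannot take the write lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class BulkLoadLockSketch {
    /**
     * Holds the read lock on the calling thread and reports what a second
     * thread can acquire meanwhile: [0] = read lock, [1] = write lock.
     */
    static boolean[] probeWhileReadLocked() {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        final boolean[] result = new boolean[2];
        lock.readLock().lock();
        try {
            Thread other = new Thread(() -> {
                // Readers do not exclude each other.
                result[0] = lock.readLock().tryLock();
                if (result[0]) {
                    lock.readLock().unlock();
                }
                // A writer is excluded while any other thread holds the read lock.
                result[1] = lock.writeLock().tryLock();
                if (result[1]) {
                    lock.writeLock().unlock();
                }
            });
            other.start();
            other.join(); // join makes the other thread's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while probing locks", e);
        } finally {
            lock.readLock().unlock();
        }
        return result;
    }
}
```

So if the single column family bulk load only needs the read lock, it no longer blocks other operations that also take the read lock.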





[jira] [Updated] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4609:
--

Status: Open  (was: Patch Available)

> ThriftServer.getRegionInfo() is expecting old ServerName format, need to use 
> new Addressing class instead
> -
>
> Key: HBASE-4609
> URL: https://issues.apache.org/jira/browse/HBASE-4609
> Project: HBase
>  Issue Type: Bug
>  Components: thrift
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: Jonathan Gray
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: HBASE-4609-v1.patch
>
>
> ThriftServer.getRegionInfo() is expecting the old ServerName that doesn't 
> include start code.  Need to fix.
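
As an illustration of the format difference, server names elsewhere in these logs look like localhost,44046,1320187706849, that is host, port and start code separated by commas. The sketch below parses that three-part form. It is a hypothetical helper written for this note, not the HBase ServerName or Addressing class.

```java
final class ServerNameSketch {
    final String host;
    final int port;
    final long startCode;

    private ServerNameSketch(String host, int port, long startCode) {
        this.host = host;
        this.port = port;
        this.startCode = startCode;
    }

    /** Parses "host,port,startcode"; rejects the older two-part form. */
    static ServerNameSketch parse(String serverName) {
        String[] parts = serverName.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected host,port,startcode but got: " + serverName);
        }
        return new ServerNameSketch(
            parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    /** The host:port address, which is what a client needs to connect. */
    String hostAndPort() {
        return host + ":" + port;
    }
}
```

Code that assumes only host and port would mis-handle the trailing start code, which is the kind of mismatch the issue describes.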





[jira] [Commented] (HBASE-4577) Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB

2011-11-02 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142187#comment-13142187
 ] 

gaojinchao commented on HBASE-4577:
---

Sorry, I am not familiar with MR. I will continue digging into this issue tomorrow.

> Region server reports storefileSizeMB bigger than storefileUncompressedSizeMB
> -
>
> Key: HBASE-4577
> URL: https://issues.apache.org/jira/browse/HBASE-4577
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: gaojinchao
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: HBASE-4577_trial_Trunk.patch, HBASE-4577_trunk.patch
>
>
> Minor issue while looking at the RS metrics:
> bq. numberOfStorefiles=8, storefileUncompressedSizeMB=2418, 
> storefileSizeMB=2420, compressionRatio=1.0008
> I guess there's a truncation somewhere when it's adding the numbers up.
> FWIW there's no compression on that table.
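
The truncation guess above can be illustrated with a small sketch. Whether this is the actual cause is not established in this thread; the code only shows that truncating each file size to whole megabytes before summing gives a smaller total than summing bytes first. The class and method names are hypothetical.

```java
final class TruncationSketch {
    private static final long MB = 1024L * 1024L;

    /** Truncates each file size to whole MB, then adds the results. */
    static int sumOfTruncated(long[] fileSizesBytes) {
        int totalMb = 0;
        for (long size : fileSizesBytes) {
            totalMb += (int) (size / MB); // fractional MB lost per file
        }
        return totalMb;
    }

    /** Adds the sizes in bytes, then truncates once. */
    static int truncatedSum(long[] fileSizesBytes) {
        long totalBytes = 0;
        for (long size : fileSizesBytes) {
            totalBytes += size;
        }
        return (int) (totalBytes / MB);
    }
}
```

If two metrics for the same uncompressed files are accumulated in these two different ways, they can disagree by a few megabytes, as in the 2418 versus 2420 figures quoted above.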





[jira] [Updated] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4609:
--

Attachment: 4609-v2.txt

> ThriftServer.getRegionInfo() is expecting old ServerName format, need to use 
> new Addressing class instead
> -
>
> Key: HBASE-4609
> URL: https://issues.apache.org/jira/browse/HBASE-4609
> Project: HBase
>  Issue Type: Bug
>  Components: thrift
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: Jonathan Gray
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 4609-v2.txt, HBASE-4609-v1.patch
>
>
> ThriftServer.getRegionInfo() is expecting the old ServerName that doesn't 
> include start code.  Need to fix.





[jira] [Updated] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4609:
--

Status: Patch Available  (was: Open)

> ThriftServer.getRegionInfo() is expecting old ServerName format, need to use 
> new Addressing class instead
> -
>
> Key: HBASE-4609
> URL: https://issues.apache.org/jira/browse/HBASE-4609
> Project: HBase
>  Issue Type: Bug
>  Components: thrift
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: Jonathan Gray
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 4609-v2.txt, HBASE-4609-v1.patch
>
>
> ThriftServer.getRegionInfo() is expecting the old ServerName that doesn't 
> include start code.  Need to fix.





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142197#comment-13142197
 ] 

Ted Yu commented on HBASE-1744:
---

From 
https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2399/artifact/trunk/target/surefire-reports/org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler-output.txt:
{code}
2011-11-02 09:18:25,930 INFO  [main] zookeeper.MiniZooKeeperCluster(141): 
Failed binding ZK Server to client port: 21818
2011-11-02 09:18:25,958 INFO  [main] zookeeper.MiniZooKeeperCluster(164): 
Started MiniZK Cluster and connect 1 ZK server on client port: 21819
{code}
Let's see what happens in build 2400.

> Thrift server to match the new java api.
> 
>
> Key: HBASE-1744
> URL: https://issues.apache.org/jira/browse/HBASE-1744
> Project: HBase
>  Issue Type: Improvement
>  Components: thrift
>Reporter: Tim Sell
>Assignee: Tim Sell
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 
> 0001-thrift2-enable-usage-of-.deleteColumns-for-thrift.patch, 1744-trunk.10, 
> 1744.addendum, HBASE-1744.11.patch, HBASE-1744.2.patch, HBASE-1744.3.patch, 
> HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.6.patch, 
> HBASE-1744.7.patch, HBASE-1744.8.patch, HBASE-1744.9.patch, 
> HBASE-1744.preview.1.patch, thriftexperiment.patch
>
>
> This mutateRows, etc.. is a little confusing compared to the new cleaner java 
> client.
> Thinking of ways to make a thrift client that is just as elegant. something 
> like:
> void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
> with:
> struct TColumn {
>   1:Bytes family,
>   2:Bytes qualifier,
>   3:i64 timestamp
> }
> struct TPut {
>   1:Bytes row,
>   2:map<TColumn, Bytes> values
> }
> This creates more verbose rpc than if the columns in TPut were just 
> map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
> still be intuitive from say python.
> Presumably the goal of a thrift gateway is to be easy first.





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142202#comment-13142202
 ] 

Ted Yu commented on HBASE-1744:
---

The test passed in build 2400.
{code}
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.203 sec
{code}

> Thrift server to match the new java api.
> 
>
> Key: HBASE-1744
> URL: https://issues.apache.org/jira/browse/HBASE-1744
> Project: HBase
>  Issue Type: Improvement
>  Components: thrift
>Reporter: Tim Sell
>Assignee: Tim Sell
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 
> 0001-thrift2-enable-usage-of-.deleteColumns-for-thrift.patch, 1744-trunk.10, 
> 1744.addendum, HBASE-1744.11.patch, HBASE-1744.2.patch, HBASE-1744.3.patch, 
> HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.6.patch, 
> HBASE-1744.7.patch, HBASE-1744.8.patch, HBASE-1744.9.patch, 
> HBASE-1744.preview.1.patch, thriftexperiment.patch
>
>
> This mutateRows, etc.. is a little confusing compared to the new cleaner java 
> client.
> Thinking of ways to make a thrift client that is just as elegant. something 
> like:
> void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
> with:
> struct TColumn {
>   1:Bytes family,
>   2:Bytes qualifier,
>   3:i64 timestamp
> }
> struct TPut {
>   1:Bytes row,
>   2:map<TColumn, Bytes> values
> }
> This creates more verbose rpc than if the columns in TPut were just 
> map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
> still be intuitive from say python.
> Presumably the goal of a thrift gateway is to be easy first.





[jira] [Commented] (HBASE-1744) Thrift server to match the new java api.

2011-11-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142209#comment-13142209
 ] 

Hudson commented on HBASE-1744:
---

Integrated in HBase-TRUNK #2400 (See 
[https://builds.apache.org/job/HBase-TRUNK/2400/])
HBASE-1744 HBaseAdmin ctor should obtain Configuration from 
HBaseTestingUtility

tedyu : 
Files : 
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java


> Thrift server to match the new java api.
> 
>
> Key: HBASE-1744
> URL: https://issues.apache.org/jira/browse/HBASE-1744
> Project: HBase
>  Issue Type: Improvement
>  Components: thrift
>Reporter: Tim Sell
>Assignee: Tim Sell
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 
> 0001-thrift2-enable-usage-of-.deleteColumns-for-thrift.patch, 1744-trunk.10, 
> 1744.addendum, HBASE-1744.11.patch, HBASE-1744.2.patch, HBASE-1744.3.patch, 
> HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.6.patch, 
> HBASE-1744.7.patch, HBASE-1744.8.patch, HBASE-1744.9.patch, 
> HBASE-1744.preview.1.patch, thriftexperiment.patch
>
>
> This mutateRows, etc.. is a little confusing compared to the new cleaner java 
> client.
> Thinking of ways to make a thrift client that is just as elegant. something 
> like:
> void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
> with:
> struct TColumn {
>   1:Bytes family,
>   2:Bytes qualifier,
>   3:i64 timestamp
> }
> struct TPut {
>   1:Bytes row,
>   2:map<TColumn, Bytes> values
> }
> This creates more verbose rpc than if the columns in TPut were just 
> map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
> still be intuitive from say python.
> Presumably the goal of a thrift gateway is to be easy first.





[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142215#comment-13142215
 ] 

stack commented on HBASE-4724:
--

I applied your restore of WAL behavior in v2.  Let's see how it plays out on 
Jenkins.

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 2002_4724_TestAdmin.patch, 
> 2002_4724_TestAdmin.v2.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and maybe why it does not 
> end at all. Adding a maximum retry to Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the hang is my modification of the WAL, 
> with some threads surprised to have updates that were not written in the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,39664,1320187706355):
>   State: WAITING
>   Blocked count: 217
>   Waited count: 174
>   Waiting on org.apache.hadoop.hbase.zookeeper.RootRegionTracker@6621477c
>   Stack:
> java.lang.Object.wait(Native Method)
> java.lang.Object.wait(Object.java:485)
> 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.

[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142217#comment-13142217
 ] 

stack commented on HBASE-4724:
--

Oh, you need to pass --no-prefix when generating patches for patch-build.

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 0.94.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 2002_4724_TestAdmin.patch, 
> 2002_4724_TestAdmin.v2.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and it may be why the 
> test does not end at all. Adding a maximum retry count to 
> Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the hang is my modification to the WAL, 
> with some threads surprised to have updates that were not written to the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,39664,1320187706355):
>   State: WAITING
>   Blocked count: 217
>   Waited count: 174
>   Waiting on org.apache.hadoop.hbase.zookeeper.RootRegionTracker@6621477c
>   Stack:
> java.lang.Object.wait(Native Method)
> java.lang.Object.wait(Object.java:485)
> 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:131)
> 
>

[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142229#comment-13142229
 ] 

stack commented on HBASE-4724:
--

@N I don't get the version mismatch when I run TestAdmin.  It doesn't fail for 
me either.

bq. Adding a maximum retry to Threads#threadDumpingIsAlive could help.

Yes.  We should do this.  No point in going on after we've thread dumped three 
times I'd say.
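
A minimal sketch of the bounded-retry idea, in plain Java. It is not the 
actual HBase patch: BoundedThreadJoin, joinWithBoundedDumps and the dumpAction 
callback are hypothetical names used only for illustration, and the cap of 
three dumps is just the number suggested in this comment.

```java
// Hypothetical helper, not the real org.apache.hadoop.hbase.util.Threads code.
final class BoundedThreadJoin {
    private BoundedThreadJoin() {}

    /**
     * Waits for t to finish. Each time intervalMs elapses while t is still
     * alive, runs dumpAction (for example, a thread dump). After maxDumps
     * dumps it waits one more interval and then gives up.
     *
     * @return true if t terminated, false if the caller should stop waiting
     */
    public static boolean joinWithBoundedDumps(Thread t, long intervalMs,
                                               int maxDumps, Runnable dumpAction) {
        if (intervalMs <= 0 || maxDumps < 0) {
            // join(0) would block forever, which is exactly what we want to avoid.
            throw new IllegalArgumentException("intervalMs must be > 0, maxDumps >= 0");
        }
        int dumps = 0;
        while (t.isAlive()) {
            try {
                t.join(intervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and report the current state.
                Thread.currentThread().interrupt();
                return !t.isAlive();
            }
            if (!t.isAlive()) {
                break;
            }
            if (dumps == maxDumps) {
                return false; // still alive after the last allowed dump
            }
            dumpAction.run();
            dumps++;
        }
        return true;
    }
}
```

A caller such as a test tearDown could treat a false return as a failure 
instead of looping forever on a master that never exits.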


[jira] [Commented] (HBASE-4583) Integrate RWCC with Append and Increment operations

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142236#comment-13142236
 ] 

stack commented on HBASE-4583:
--

bq. There are various ways to produce serializable schedules (pessimistic 
locking, optimistic locking with rechecking of pre conditions, snapshot 
isolation, etc), all which will probably mean worse performance for both append 
and increment.

Shouldn't we do it anyways (though big yuck on your list above -- it makes my 
brain hurt just thinking on it.  Can you imagine rechecking pre-conditions and 
then replaying the failed transaction.. how much fun that'll be to code up!)?

Shouldn't we be correct first and then performant?

bq. As said above the current implementation sync's the WAL after the memstore 
is updated and the new values are visible to other threads, and after the locks 
are released. 

Sounds broke to me; sounds like big compromise for sake of better perf.  Should 
we open new issue on this?

bq. (1) and (2) together mean that the WAL needs to be sync'ed with the row 
lock held (which would be quite a performance degradation).

Shouldn't we ship with this config, with options to run HBase otherwise 
(memstore put then sync, etc.)?
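
To make the two orderings concrete, here is a toy, single-threaded sketch. 
It is not HBase code: WalOrderingSketch and its methods are hypothetical, the 
"WAL" and "memstore" are stand-ins that only record events, and the position 
of the WAL append relative to the memstore update is an assumption. The only 
point is where the sync happens relative to visibility and lock release.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Toy model: records the order of steps rather than doing real I/O.
final class WalOrderingSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final List<String> events = new ArrayList<>();
    private long memstoreValue = 0;

    private void appendToWal() { events.add("wal-append"); }
    private void syncWal()     { events.add("wal-sync"); }

    /** Sync while the row lock is held: durable before anyone can see it. */
    public long incrementSyncUnderLock(long delta) {
        rowLock.lock();
        try {
            events.add("lock");
            long next = memstoreValue + delta;
            appendToWal();
            syncWal();
            memstoreValue = next;
            events.add("memstore-visible");
            return next;
        } finally {
            events.add("unlock");
            rowLock.unlock();
        }
    }

    /** Sync after the lock is released: faster, but visible before durable. */
    public long incrementSyncAfterUnlock(long delta) {
        long next;
        rowLock.lock();
        try {
            events.add("lock");
            next = memstoreValue + delta;
            appendToWal();
            memstoreValue = next;
            events.add("memstore-visible");
        } finally {
            events.add("unlock");
            rowLock.unlock();
        }
        syncWal(); // if this fails, readers may already have seen the new value
        return next;
    }

    public List<String> events() { return new ArrayList<>(events); }
}
```

In the second variant there is a window between "memstore-visible" and 
"wal-sync" in which another thread can read a value that is not yet durable.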

bq. Now, what we could do is use rwcc to make the changes to the CFs atomic, 
and still sync the WAL after all the locks are released (as we do now). With 
this compromise everything would be correct unless the sync'ing of WAL fails

Sounds broke still?

Thanks for the write up and for digging in here fellas.



> Integrate RWCC with Append and Increment operations
> ---
>
> Key: HBASE-4583
> URL: https://issues.apache.org/jira/browse/HBASE-4583
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 4583-v2.txt, 4583-v3.txt, 4583-v4.txt, 4583.txt
>
>
> Currently Increment and Append operations do not work with RWCC and hence a 
> client could see the results of multiple such operations mixed in the same 
> Get/Scan.
> The semantics might be a bit more interesting here as upsert adds and removes 
> to and from the memstore.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142239#comment-13142239
 ] 

stack commented on HBASE-4724:
--

Before applying the patch, I got the error below on a random run:

{code}Tests in error: 
  testDisableAndEnableTable(org.apache.hadoop.hbase.client.TestAdmin): 
org.apache.hadoop.hbase.TableNotEnabledException: 
testDisableAndEnableTable{code}

Trying w/ your v2 patch now.


[jira] [Commented] (HBASE-4480) Testing script to simplify local testing

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142242#comment-13142242
 ] 

stack commented on HBASE-4480:
--

I like this script.  Should we check it in under a new 'dev-support' dir?  The 
usage is a bit off.  It says '-n=N' when I think it means to say '-n N'.

> Testing script to simplify local testing
> 
>
> Key: HBASE-4480
> URL: https://issues.apache.org/jira/browse/HBASE-4480
> Project: HBase
> Issue Type: Improvement
> Reporter: Jesse Yates
> Priority: Minor
> Labels: test
> Attachments: HBASE-4480.patch, HBASE-4480_v2.patch, 
> HBASE-4480_v3.patch, runtest-no-npe-check.sh, runtest.sh, runtest2.sh
>
>
> As mentioned by http://search-hadoop.com/m/r2Ab624ES3e and 
> http://search-hadoop.com/m/cZjDH1ykGIA it would be nice if we could have a 
> script that would handle more of the finer points of running/checking our 
> test suite.
> This script should:
> (1) Allow people to determine which tests are hanging/taking a long time to 
> run
> (2) Allow rerunning of particular tests to make sure it wasn't an artifact of 
> running the whole suite that caused the failure
> (3) Allow people to specify to run just unit tests or also integration tests 
> (essentially wrapping calls to 'maven test' and 'maven verify').
> This script should just be a convenience script - running tests directly from 
> maven should not be impacted.





[jira] [Commented] (HBASE-4518) TestServerCustomProtocol is flaky

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142244#comment-13142244
 ] 

stack commented on HBASE-4518:
--

I shouldn't have said anything.  My mention of this issue caused the test to 
start failing up on jenkins again.

> TestServerCustomProtocol is flaky
> -
>
> Key: HBASE-4518
> URL: https://issues.apache.org/jira/browse/HBASE-4518
> Project: HBase
>  Issue Type: Bug
> Components: coprocessors, test
> Affects Versions: 0.92.0
> Reporter: Gary Helmling
> Fix For: 0.92.0
>
> Attachments: 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol-output.txt, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.txt
>
>
> TestServerCustomProtocol has been showing some intermittent failures in 
> Jenkins due to what looks like region transitions.
> Here is the most recent failure:
> {noformat}
> Results :
> Failed tests:   
> testRowRange(org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol): 
> Results should contain region 
> test,bbb,1317332645939.aea9154349b9e0dc207e2e9476702763. for row 'bbb'
> {noformat}





[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142249#comment-13142249
 ] 

nkeywal commented on HBASE-4724:


I recloned the repo, and now it works all the time.

I have this in the logs; it seems new to me:
{noformat}
2011-11-02 03:36:17,271 DEBUG [main] zookeeper.ZKUtil(1034): hconnection 
Retrieved 29 byte(s) of data from znode /hbase/root-region-server and set 
watcher; localhost,39688,1320229746489
2011-11-02 03:36:17,272 DEBUG [Finalizer] 
client.HConnectionManager$HConnectionImplementation(1715): The connection to 
null has been closed.
2011-11-02 03:36:17,272 DEBUG [Finalizer] 
client.HConnectionManager$HConnectionImplementation(1734): The connection to 
null was closed by the finalize method.
2011-11-02 03:36:17,272 DEBUG [Finalizer] 
client.HConnectionManager$HConnectionImplementation(1715): The connection to 
null has been closed.
2011-11-02 03:36:17,272 DEBUG [Finalizer] 
client.HConnectionManager$HConnectionImplementation(1734): The connection to 
null was closed by the finalize method.
{noformat}

Could it have a side effect in some cases?

FWIW, I also got this once, but the test case succeeded anyway. It is something 
I have seen in the past already.
{noformat}
2011-11-02 02:08:13,311 ERROR 
[MASTER_OPEN_REGION-localhost,40499,132022456-3] 
executor.EventHandler(171): Caught throwable while processing event 
RS_ZK_REGION_OPENED
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.master.AssignmentManager.updateTimers(AssignmentManager.java:1059)
at 
org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:1033)
at 
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler.process(OpenedRegionHandler.java:105)
at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2011-11-02 02:08:13,311 INFO  [RS_OPEN_REGION-localhost,47967,132022999-2] 
regionserver.HRegion(402): Setting up tabledescriptor config now ...
{noformat}


[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142251#comment-13142251
 ] 

stack commented on HBASE-4724:
--

Let me look at the above.

I reran the TestAdmin a few times and got this again:

{code}
---
Test set: org.apache.hadoop.hbase.client.TestAdmin
---
Tests run: 33, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 509.79 sec <<< 
FAILURE!
testEnableDisableAddColumnDeleteColumn(org.apache.hadoop.hbase.client.TestAdmin)
  Time elapsed: 0.841 sec  <<< ERROR!
org.apache.hadoop.hbase.TableNotEnabledException: 
org.apache.hadoop.hbase.TableNotEnabledException: testMasterAdmin
{code}

Do you think some of the cuts in timers are still too aggressive?


[jira] [Commented] (HBASE-4609) ThriftServer.getRegionInfo() is expecting old ServerName format, need to use new Addressing class instead

2011-11-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142255#comment-13142255
 ] 

Hadoop QA commented on HBASE-4609:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12501968/4609-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -165 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 42 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.master.TestMasterFailover

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/139//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/139//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/139//console

This message is automatically generated.

> ThriftServer.getRegionInfo() is expecting old ServerName format, need to use 
> new Addressing class instead
> -
>
> Key: HBASE-4609
> URL: https://issues.apache.org/jira/browse/HBASE-4609
> Project: HBase
>  Issue Type: Bug
>  Components: thrift
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: Jonathan Gray
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: 4609-v2.txt, HBASE-4609-v1.patch
>
>
> ThriftServer.getRegionInfo() is expecting the old ServerName that doesn't 
> include start code.  Need to fix.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142257#comment-13142257
 ] 

Ted Yu commented on HBASE-4716:
---

Integrated to 0.92 and TRUNK.

Thanks for the review Todd and Stack.

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.
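
For readers following along, the locking idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patch: the class BulkLoadLockSketch and its methods are hypothetical names, and the only rule it encodes is the one from the description (a read lock when a single column family is loaded, the write lock otherwise).

```java
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch; the names here are illustrative and not taken from HRegion. */
final class BulkLoadLockSketch {
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();

    /** Runs the load under the chosen lock and reports which kind of lock was held. */
    String bulkLoad(List<String> families, Runnable load) {
        // Assumption: only a multi-family load needs to exclude all other operations.
        boolean needsWriteLock = families.size() > 1;
        Lock lock = needsWriteLock ? regionLock.writeLock() : regionLock.readLock();
        lock.lock();
        try {
            load.run();
            return needsWriteLock ? "write" : "read";
        } finally {
            // Always release, even if the load throws.
            lock.unlock();
        }
    }
}
```

With a shape like this, concurrent single-family loads and ordinary readers would not block each other, while a multi-family load would still run exclusively.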





[jira] [Updated] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4716:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Created] (HBASE-4725) NPE in AM#updateTimers

2011-11-02 Thread stack (Created) (JIRA)
NPE in AM#updateTimers
--

 Key: HBASE-4725
 URL: https://issues.apache.org/jira/browse/HBASE-4725
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.0








[jira] [Commented] (HBASE-4725) NPE in AM#updateTimers

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142267#comment-13142267
 ] 

stack commented on HBASE-4725:
--

{code}
2011-11-02 02:08:13,311 ERROR [MASTER_OPEN_REGION-localhost,40499,132022456-3] executor.EventHandler(171): Caught throwable while processing event RS_ZK_REGION_OPENED
java.lang.NullPointerException
    at org.apache.hadoop.hbase.master.AssignmentManager.updateTimers(AssignmentManager.java:1059)
    at org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:1033)
    at org.apache.hadoop.hbase.master.handler.OpenedRegionHandler.process(OpenedRegionHandler.java:105)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2011-11-02 02:08:13,311 INFO  [RS_OPEN_REGION-localho
{code}

> NPE in AM#updateTimers
> --
>
> Key: HBASE-4725
> URL: https://issues.apache.org/jira/browse/HBASE-4725
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: am.txt
>
>






[jira] [Updated] (HBASE-4725) NPE in AM#updateTimers

2011-11-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4725:
-

Status: Patch Available  (was: Open)

> NPE in AM#updateTimers
> --
>
> Key: HBASE-4725
> URL: https://issues.apache.org/jira/browse/HBASE-4725
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: am.txt
>
>






[jira] [Updated] (HBASE-4725) NPE in AM#updateTimers

2011-11-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4725:
-

Attachment: am.txt

> NPE in AM#updateTimers
> --
>
> Key: HBASE-4725
> URL: https://issues.apache.org/jira/browse/HBASE-4725
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: am.txt
>
>






[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142269#comment-13142269
 ] 

stack commented on HBASE-4724:
--

I made HBASE-4725 for the NPE.  Looking at the null connection...

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 2002_4724_TestAdmin.patch, 
> 2002_4724_TestAdmin.v2.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and may be why it does not 
> end at all. Adding a maximum retry to Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the non-ending is my modification of the WAL, 
> with some threads surprised to have updates that were not written to the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,39664,1320187706355):
>   State: WAITING
>   Blocked count: 217
>   Waited count: 174
>   Waiting on org.apache.hadoop.hbase.zookeeper.RootRegionTracker@6621477c
>   Stack:
> java.lang.Object.wait(Native Method)
> java.lang.Object.wait(Object.java:485)
> 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:131)
> 
> o

[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142272#comment-13142272
 ] 

nkeywal commented on HBASE-4724:


The timers I changed in the admin all have a condition, so instead of 
checking once per second they check 5 times per second. This should not change 
the final behaviour, and I took care not to change the final timeout. 
testEnableDisableAddColumnDeleteColumn has not been directly impacted by 4703; 
there is no timer there. 
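
To illustrate the point about the timers, here is a minimal sketch of a conditional wait whose polling interval can shrink from 1000 ms to 200 ms without moving the overall deadline. It is not the HBaseAdmin code; PollSketch and waitFor are hypothetical names chosen for this example.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a conditional wait; not taken from HBaseAdmin. */
final class PollSketch {
    /**
     * Polls the condition every intervalMs until it holds or timeoutMs has elapsed.
     * Returns true if the condition became true before the deadline.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false; // the deadline is the same whatever the interval is
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(intervalMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

Calling waitFor(cond, 60000, 200) instead of waitFor(cond, 60000, 1000) only changes how quickly a satisfied condition is noticed; the point at which the wait gives up stays the same.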

What's the line that's throwing this exception?


> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 2002_4724_TestAdmin.patch, 
> 2002_4724_TestAdmin.v2.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and may be why it does not 
> end at all. Adding a maximum retry to Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the non-ending is my modification of the WAL, 
> with some threads surprised to have updates that were not written to the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,39664,1320187706355):
>   State: WAITING
>   Blocked count: 217

[jira] [Commented] (HBASE-4415) Add configuration script for setup HBase (hbase-setup-conf.sh)

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142284#comment-13142284
 ] 

stack commented on HBASE-4415:
--

Ditto on what Ted asks above.

Why would we have this duplicated (and now behind) hbase-env.sh over in 
src/packages/templates/conf/hbase-env.sh?  Why would we not copy it from 
original location?

@Andrew I see this patch has:

{code}
+  
+hbase.master.kerberos.principal
+${HBASE_M_K_PRINCIPAL}
+
+  
{code}

We need more than that now?


> Add configuration script for setup HBase (hbase-setup-conf.sh)
> --
>
> Key: HBASE-4415
> URL: https://issues.apache.org/jira/browse/HBASE-4415
> Project: HBase
>  Issue Type: New Feature
>  Components: scripts
>Affects Versions: 0.90.4, 0.92.0
> Environment: Java 6, Linux
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: 0.90.4, 0.92.0
>
> Attachments: HBASE-4415-1.patch, HBASE-4415-2.patch, 
> HBASE-4415-3.patch, HBASE-4415-4.patch, HBASE-4415-5.patch, 
> HBASE-4415-6.patch, HBASE-4415.patch
>
>
> The goal of this jira is to provide an installation script for configuring 
> the HBase environment and configuration, using the same *-setup-conf.sh 
> pattern as all Hadoop related projects.  For HBase, the usage of the 
> script looks like this:
> {noformat}
> usage: ./hbase-setup-conf.sh 
>   Optional parameters:
>     --hadoop-conf=/etc/hadoop                Set Hadoop configuration directory location
>     --hadoop-home=/usr                       Set Hadoop directory location
>     --hadoop-namenode=localhost              Set Hadoop namenode hostname
>     --hadoop-replication=3                   Set HDFS replication
>     --hbase-home=/usr                        Set HBase directory location
>     --hbase-conf=/etc/hbase                  Set HBase configuration directory location
>     --hbase-log=/var/log/hbase               Set HBase log directory location
>     --hbase-pid=/var/run/hbase               Set HBase pid directory location
>     --hbase-user=hbase                       Set HBase user
>     --java-home=/usr/java/default            Set JAVA_HOME directory location
>     --kerberos-realm=KERBEROS.EXAMPLE.COM    Set Kerberos realm
>     --kerberos-principal-id=_HOST            Set Kerberos principal ID
>     --keytab-dir=/etc/security/keytabs       Set keytab directory
>     --regionservers=localhost                Set regionservers hostnames
>     --zookeeper-home=/usr                    Set ZooKeeper directory location
>     --zookeeper-quorum=localhost             Set ZooKeeper Quorum
>     --zookeeper-snapshot=/var/lib/zookeeper  Set ZooKeeper snapshot location
> {noformat}





[jira] [Commented] (HBASE-4523) dfs.support.append config should be present in the hadoop configs, we should remove them from hbase so the user is not confused when they see the config in 2 places

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142285#comment-13142285
 ] 

stack commented on HBASE-4523:
--

This patch looks fine.

> dfs.support.append config should be present in the hadoop configs, we should 
> remove them from hbase so the user is not confused when they see the config 
> in 2 places
> 
>
> Key: HBASE-4523
> URL: https://issues.apache.org/jira/browse/HBASE-4523
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.4, 0.92.0
>Reporter: Arpit Gupta
>Assignee: Eric Yang
> Fix For: 0.90.4, 0.92.0
>
> Attachments: HBASE-4523.patch
>
>






[jira] [Commented] (HBASE-4535) hbase-env.sh in hbase rpm does not set HBASE_CONF_DIR

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142286#comment-13142286
 ] 

stack commented on HBASE-4535:
--

Patch looks fine.  Should this be done over in original hbase-env.sh?

> hbase-env.sh in hbase rpm does not set HBASE_CONF_DIR
> -
>
> Key: HBASE-4535
> URL: https://issues.apache.org/jira/browse/HBASE-4535
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.90.3
>Reporter: Ramya Sunil
>Assignee: Eric Yang
> Attachments: HBASE-4535.patch
>
>
> After a hbase rpm install, hbase-env.sh does not define HBASE_CONF_DIR. This 
> needs to be fixed.





[jira] [Commented] (HBASE-4635) Remove dependency of java for rpm/deb packaging

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142288#comment-13142288
 ] 

stack commented on HBASE-4635:
--

This has this change:

{code}
 . /etc/default/hadoop-env.sh
-. /etc/default/zookeeper-env.sh
{code}

... which is a little unrelated.  You are trying to make this hbase-env.sh same 
as the original?

> Remove dependency of java for rpm/deb packaging
> ---
>
> Key: HBASE-4635
> URL: https://issues.apache.org/jira/browse/HBASE-4635
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.92.0
> Environment: Java, Ubuntu, RHEL
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: HBASE-4635.patch
>
>
> Comment from HBASE-3606:
> Eric, it looks like hbase rpm spec file sets dependency on jdk. Can we remove 
> the jdk dependency ? As everyone will not be installing jdk through rpm.
> There are multiple ways to install Java on Linux.  It would be better to 
> remove Java dependency declaration for packaging.





[jira] [Updated] (HBASE-4518) TestServerCustomProtocol is flaky

2011-11-02 Thread Gary Helmling (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-4518:
-

Attachment: HBASE-4518.patch

This patch cleans up the config of the PingHandler endpoint and changes the 
mini cluster to use a single region server to avoid region transition issues.  
With this patch I was able to run TestServerCustomProtocol in a batch of 50 
runs with no failures.

> TestServerCustomProtocol is flaky
> -
>
> Key: HBASE-4518
> URL: https://issues.apache.org/jira/browse/HBASE-4518
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors, test
>Affects Versions: 0.92.0
>Reporter: Gary Helmling
> Fix For: 0.92.0
>
> Attachments: HBASE-4518.patch, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol-output.txt, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.txt
>
>
> TestServerCustomProtocol has been showing some intermittent failures in 
> Jenkins due to what looks like region transitions.
> Here is the most recent failure:
> {noformat}
> Results :
> Failed tests:   
> testRowRange(org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol): 
> Results should contain region 
> test,bbb,1317332645939.aea9154349b9e0dc207e2e9476702763. for row 'bbb'
> {noformat}





[jira] [Updated] (HBASE-4518) TestServerCustomProtocol is flaky

2011-11-02 Thread Gary Helmling (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-4518:
-

Assignee: Gary Helmling
  Status: Patch Available  (was: Open)

> TestServerCustomProtocol is flaky
> -
>
> Key: HBASE-4518
> URL: https://issues.apache.org/jira/browse/HBASE-4518
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors, test
>Affects Versions: 0.92.0
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Fix For: 0.92.0
>
> Attachments: HBASE-4518.patch, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol-output.txt, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.txt
>
>
> TestServerCustomProtocol has been showing some intermittent failures in 
> Jenkins due to what looks like region transitions.
> Here is the most recent failure:
> {noformat}
> Results :
> Failed tests:   
> testRowRange(org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol): 
> Results should contain region 
> test,bbb,1317332645939.aea9154349b9e0dc207e2e9476702763. for row 'bbb'
> {noformat}





[jira] [Commented] (HBASE-4518) TestServerCustomProtocol is flaky

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142294#comment-13142294
 ] 

stack commented on HBASE-4518:
--

This line:

{code}
+util.getConfiguration().set(CoprocessorHost.REGION_COPROCESSOR_CONF_KEY,
+PingHandler.class.getName()); 
{code}

does what

{code}
-// TODO: use a test coprocessor for registration (once merged with CP code)
-// sleep here is an ugly hack to allow region transitions to finish
-Thread.sleep(5000);
-for (JVMClusterUtil.RegionServerThread t :
-  cluster.getRegionServerThreads()) {
-  for (HRegionInfo r : t.getRegionServer().getOnlineRegions()) {
-t.getRegionServer().getOnlineRegion(r.getRegionName())
-.registerProtocol(PingProtocol.class, new PingHandler());
-  }
-}  
{code}

... used to do?


If so, +1 on commit if patch-build gives back reasonable results.

> TestServerCustomProtocol is flaky
> -
>
> Key: HBASE-4518
> URL: https://issues.apache.org/jira/browse/HBASE-4518
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors, test
>Affects Versions: 0.92.0
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Fix For: 0.92.0
>
> Attachments: HBASE-4518.patch, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol-output.txt, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.txt
>
>
> TestServerCustomProtocol has been showing some intermittent failures in 
> Jenkins due to what looks like region transitions.
> Here is the most recent failure:
> {noformat}
> Results :
> Failed tests:   
> testRowRange(org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol): 
> Results should contain region 
> test,bbb,1317332645939.aea9154349b9e0dc207e2e9476702763. for row 'bbb'
> {noformat}





[jira] [Created] (HBASE-4726) RS should close region if it fails to mark it as 'OPENED'.

2011-11-02 Thread Madhuwanti Vaidya (Created) (JIRA)
RS should close region if it fails to mark it as 'OPENED'.
--

 Key: HBASE-4726
 URL: https://issues.apache.org/jira/browse/HBASE-4726
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.89.20100924
Reporter: Madhuwanti Vaidya
Assignee: Madhuwanti Vaidya
Priority: Minor


Currently if a RS fails to mark a region as 'OPENED' it only logs an error. It 
will leave the region open - this has caused duplicate region assignments in 
one of our production clusters. 
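
A rough sketch of the proposed behaviour, assuming nothing about the real region server classes: the Region and Coordinator interfaces and the finishOpen method below are hypothetical stand-ins used only to show the control flow (close the region when the 'OPENED' transition cannot be recorded, instead of only logging).

```java
/** Hypothetical sketch; Region and Coordinator are stand-ins, not HBase types. */
final class OpenRegionSketch {
    interface Region { void close(); }

    /** Stand-in for whatever records the OPENED state (for example in ZooKeeper). */
    interface Coordinator { boolean markOpened(String regionName); }

    /** Returns true if the region stays open, false if it was closed again. */
    static boolean finishOpen(String regionName, Region region, Coordinator coordinator) {
        boolean marked;
        try {
            marked = coordinator.markOpened(regionName);
        } catch (RuntimeException e) {
            marked = false; // treat a failure to record the state like a refusal
        }
        if (!marked) {
            // Do not leave the region open if nobody else knows it is open here;
            // otherwise it may be assigned a second time elsewhere.
            region.close();
        }
        return marked;
    }
}
```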





[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142299#comment-13142299
 ] 

stack commented on HBASE-4724:
--

@N Trying to reproduce, but I am testing the NPE fix at the same time and it's 
not failing for me now. I'll let it run longer.

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 2002_4724_TestAdmin.patch, 
> 2002_4724_TestAdmin.v2.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and may be why it does not 
> end at all. Adding a maximum retry to Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the non-ending is my modification of the WAL, 
> with some threads surprised to have updates that were not written to the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,39664,1320187706355):
>   State: WAITING
>   Blocked count: 217
>   Waited count: 174
>   Waiting on org.apache.hadoop.hbase.zookeeper.RootRegionTracker@6621477c
>   Stack:
> java.lang.Object.wait(Native Method)
> java.lang.Object.wait(Object.java:485)
> 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUn

[jira] [Updated] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename

2011-11-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4553:
-

Attachment: 4553-v11.txt

Same as v10.

> The update of .tableinfo is not atomic; we remove then rename
> -
>
> Key: HBASE-4553
> URL: https://issues.apache.org/jira/browse/HBASE-4553
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 3446-v8.txt, 4553-v10.txt, 4553-v11.txt, 4553-v5.txt, 
> 4553-v9.txt, HBase-4553-TestAvroServer.patch
>
>
> This comes out of HBASE-4547.  The rename in 0.20 HDFS fails if the file 
> already exists.  In 0.20+ it's better, but there are still 'some' issues if 
> there is an existing reader when the file is renamed.  This issue is about 
> fixing this (though we depend on the fix first being in HDFS).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Work started] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename

2011-11-02 Thread stack (Work started) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-4553 started by stack.

> The update of .tableinfo is not atomic; we remove then rename
> -
>
> Key: HBASE-4553
> URL: https://issues.apache.org/jira/browse/HBASE-4553
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 3446-v8.txt, 4553-v10.txt, 4553-v11.txt, 4553-v5.txt, 
> 4553-v9.txt, HBase-4553-TestAvroServer.patch
>
>
> This comes out of HBASE-4547.  The rename in 0.20 HDFS fails if the file 
> already exists.  In 0.20+ it's better, but there are still 'some' issues if 
> there is an existing reader when the file is renamed.  This issue is about 
> fixing this (though we depend on the fix first being in HDFS).





[jira] [Updated] (HBASE-4553) The update of .tableinfo is not atomic; we remove then rename

2011-11-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4553:
-

Status: Open  (was: Patch Available)

> The update of .tableinfo is not atomic; we remove then rename
> -
>
> Key: HBASE-4553
> URL: https://issues.apache.org/jira/browse/HBASE-4553
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 3446-v8.txt, 4553-v10.txt, 4553-v11.txt, 4553-v5.txt, 
> 4553-v9.txt, HBase-4553-TestAvroServer.patch
>
>
> This comes out of HBASE-4547.  The rename in 0.20 HDFS fails if the file 
> already exists.  In 0.20+ it's better, but there are still 'some' issues if 
> there is an existing reader when the file is renamed.  This issue is about 
> fixing this (though we depend on the fix first being in HDFS).





[jira] [Commented] (HBASE-4518) TestServerCustomProtocol is flaky

2011-11-02 Thread Gary Helmling (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142307#comment-13142307
 ] 

Gary Helmling commented on HBASE-4518:
--

@Stack,

Yes, for some reason in the original code I was being a bit too "clever" and 
registering the protocol handler directly.  Maybe it was before all the 
connecting bits had been filled in yet...  In any case, the manual registration 
in the original code would mean that the PingHandler would not get 
re-registered if a region closed on one RS and was reopened on another.  So 
that is a flaw.  And the test code should really be doing what we tell people 
to do with endpoints, which is to configure them as coprocessors.

> TestServerCustomProtocol is flaky
> -
>
> Key: HBASE-4518
> URL: https://issues.apache.org/jira/browse/HBASE-4518
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors, test
>Affects Versions: 0.92.0
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Fix For: 0.92.0
>
> Attachments: HBASE-4518.patch, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol-output.txt, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.txt
>
>
> TestServerCustomProtocol has been showing some intermittent failures in 
> Jenkins due to what looks like region transitions.
> Here is the most recent failure:
> {noformat}
> Results :
> Failed tests:   
> testRowRange(org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol): 
> Results should contain region 
> test,bbb,1317332645939.aea9154349b9e0dc207e2e9476702763. for row 'bbb'
> {noformat}





[jira] [Updated] (HBASE-4719) HBase script assumes pre-Hadoop 0.21 layout of jar files

2011-11-02 Thread Roman Shaposhnik (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HBASE-4719:


Status: Patch Available  (was: Open)

> HBase script assumes pre-Hadoop 0.21 layout of jar files
> 
>
> Key: HBASE-4719
> URL: https://issues.apache.org/jira/browse/HBASE-4719
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.0
>Reporter: Roman Shaposhnik
> Attachments: HBASE-4719.patch.txt
>
>
> The following in the bin/hbase:
> {noformat}
> HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" `ls 
> ${HADOOP_HOME}/hadoop-core*.jar`)
> {noformat}
> assumes a pre-21 Hadoop layout. It'll be better to dynamically account for 
> either hadoop-core* or hadoop-common*, hadoop-hdfs*, hadoop-mapreduce*





[jira] [Created] (HBASE-4727) Don't unconditionally delete UNASSIGNED ZNode for a region.

2011-11-02 Thread Madhuwanti Vaidya (Created) (JIRA)
Don't unconditionally delete UNASSIGNED ZNode for a region.
---

 Key: HBASE-4727
 URL: https://issues.apache.org/jira/browse/HBASE-4727
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.89.20100924
Reporter: Madhuwanti Vaidya
Assignee: Madhuwanti Vaidya
Priority: Minor


Unconditionally deleting an UNASSIGNED ZNode when master processes 
RS2ZK_REGION_OPENED (from the toDo queue) for a region has caused multiply 
assigned regions or unassigned regions. One proposed fix is to check whether 
the ZNode is actually in the state RS2ZK_REGION_OPENED before deleting it. 
Another fix is to not delete the ZNode at all.
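
The first proposed fix amounts to a compare-and-delete. Below is a minimal sketch of that idea, assuming an in-memory map stands in for the unassigned znodes; the class `UnassignedNodes` and its methods are hypothetical and are not HBase or ZooKeeper APIs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory stand-in for the UNASSIGNED znodes (not an HBase class). */
final class UnassignedNodes {
    // region name -> current state string, e.g. "RS2ZK_REGION_OPENED"
    private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

    void setState(String region, String state) {
        nodes.put(region, state);
    }

    String getState(String region) {
        return nodes.get(region);
    }

    /**
     * Deletes the node only if it is still in the expected state.
     * Returns true if the node was removed, false if it was absent
     * or had already moved to another state.
     */
    boolean deleteIfInState(String region, String expectedState) {
        // ConcurrentMap.remove(key, value) is atomic: the entry is removed
        // only when the key is currently mapped to the given value.
        return nodes.remove(region, expectedState);
    }
}
```

With this shape, a stale RS2ZK_REGION_OPENED event from the toDo queue cannot remove a node that has since been reused for a different transition. Against a real ZooKeeper, a similar guard would presumably read the node's data and version and then pass that version to the delete call, but that is an assumption about how the fix would be written, not something stated in this issue.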





[jira] [Updated] (HBASE-4719) HBase script assumes pre-Hadoop 0.21 layout of jar files

2011-11-02 Thread Roman Shaposhnik (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HBASE-4719:


Attachment: HBASE-4719.patch.txt

> HBase script assumes pre-Hadoop 0.21 layout of jar files
> 
>
> Key: HBASE-4719
> URL: https://issues.apache.org/jira/browse/HBASE-4719
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.0
>Reporter: Roman Shaposhnik
> Attachments: HBASE-4719.patch.txt
>
>
> The following in the bin/hbase:
> {noformat}
> HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" `ls 
> ${HADOOP_HOME}/hadoop-core*.jar`)
> {noformat}
> assumes a pre-21 Hadoop layout. It'll be better to dynamically account for 
> either hadoop-core* or hadoop-common*, hadoop-hdfs*, hadoop-mapreduce*





[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142308#comment-13142308
 ] 

nkeywal commented on HBASE-4724:


It happened once out of maybe 20 tries.

For the null connection, there is at least a leak in 
{noformat}
  @Before
  public void setUp() throws Exception {
    this.admin = new HBaseAdmin(TEST_UTIL.getConfiguration());
  }
{noformat}

But I guess it's not the only one, because I saw quite a lot of lines in the 
logs from Jenkins.
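
One way to plug this particular leak is to pair the per-test creation with a per-test close. The sketch below shows the lifecycle only: `FakeAdmin` is a hypothetical stand-in for HBaseAdmin so the example runs without a cluster, and the JUnit annotations are mentioned in comments rather than used.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for HBaseAdmin; counts instances that are still open. */
final class FakeAdmin implements Closeable {
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed;

    FakeAdmin() {
        OPEN.incrementAndGet();
    }

    @Override
    public void close() {
        // Closing twice is harmless: only the first call releases the resource.
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

/** Mirrors the per-test lifecycle: setUp() creates, tearDown() releases. */
final class AdminLifecycle {
    private FakeAdmin admin;

    // In the real test this method would carry JUnit's @Before.
    void setUp() {
        this.admin = new FakeAdmin();
    }

    // In the real test this method would carry JUnit's @After.
    void tearDown() {
        if (this.admin != null) {
            this.admin.close();
            this.admin = null;
        }
    }
}
```

Without the tearDown() step, every test method leaves one more instance open, which would be consistent with the many close messages seen in the logs. Whether the real fix belongs in TestAdmin or elsewhere is not settled here.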

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 2002_4724_TestAdmin.patch, 
> 2002_4724_TestAdmin.v2.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and maybe why it does not 
> end at all. Adding a maximum retry count to Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the hang is my modification to the WAL, 
> with some threads surprised to have updates that were not written to the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited count: 0
>   Stack:
> sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
> Thread 152 (Master:0;localhost,39664,1320187706355):
>   State: WAITING
>   Blocked count: 217
>   Waited count: 174
>   Waiting on org.apache.ha

[jira] [Updated] (HBASE-3716) Intermittent TestRegionRebalancing failure

2011-11-02 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-3716:
--

Fix Version/s: 0.92.0

> Intermittent TestRegionRebalancing failure
> --
>
> Key: HBASE-3716
> URL: https://issues.apache.org/jira/browse/HBASE-3716
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.92.0
>
> Attachments: 3716-addendum.txt, 3716.txt
>
>
> See HBase-TRUNK build #1820
> This could be due to HBASE-3681
> In trunk, default value of "hbase.regions.slop" is 20%. It is possible for 
> load balancer to see region distribution which falls within 20% of optimal 
> distribution.
> However, assertRegionsAreBalanced() uses 10% slop.
> One solution is to align the slop in assertRegionsAreBalanced() with 
> "hbase.regions.slop" value.
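
The mismatch can be illustrated with a small check. This is a sketch under a stated assumption: a cluster counts as balanced when every server's region count lies between floor(avg * (1 - slop)) and ceil(avg * (1 + slop)). The class `SlopCheck` is hypothetical; the real balancer and assertRegionsAreBalanced() may compute their bounds differently.

```java
/** Hypothetical helper; not the HBase balancer or the test's assertion code. */
final class SlopCheck {
    private SlopCheck() {}

    /** Returns true if every count lies within avg * (1 +/- slop), rounded outward. */
    static boolean isBalanced(int[] regionsPerServer, double slop) {
        if (regionsPerServer.length == 0) {
            return true;
        }
        long total = 0;
        for (int n : regionsPerServer) {
            total += n;
        }
        double avg = (double) total / regionsPerServer.length;
        int floor = (int) Math.floor(avg * (1 - slop));
        int ceiling = (int) Math.ceil(avg * (1 + slop));
        for (int n : regionsPerServer) {
            if (n < floor || n > ceiling) {
                return false;
            }
        }
        return true;
    }
}
```

For region counts of 115, 100 and 85 (average 100), `isBalanced(counts, 0.2)` accepts the distribution, so a balancer using 20% slop would leave it alone, while `isBalanced(counts, 0.1)` rejects it, which is the kind of disagreement described above.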





[jira] [Created] (HBASE-4728) Clean up noisy HBaseAdmin#close messages

2011-11-02 Thread stack (Created) (JIRA)
Clean up noisy HBaseAdmin#close messages


 Key: HBASE-4728
 URL: https://issues.apache.org/jira/browse/HBASE-4728
 Project: HBase
  Issue Type: Bug
Reporter: stack


See tail of HBASE-4724





[jira] [Commented] (HBASE-4724) TestAdmin hangs randomly in trunk

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142320#comment-13142320
 ] 

stack commented on HBASE-4724:
--

OK.  Want to make a new issue to fix the leak?

It just failed for me with this:

{code}
Tests in error: 
  testHundredsOfTable(org.apache.hadoop.hbase.client.TestAdmin): Call to 
sv4r11s38/10.4.11.38:46152 failed on socket timeout exception: 
java.net.SocketTimeoutException: 1500 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/10.4.11.38:52051 remote=sv4r11s38/10.4.11.38:46152]
{code}

Is 1500ms not enough? Did you change that?

On the message:

{code}

2011-11-02 03:36:17,272 DEBUG [Finalizer] 
client.HConnectionManager$HConnectionImplementation(1715): The connection to 
null has been closed.

{code}

... yeah, it looks like this test is making lots of instances of HBaseAdmin, 
ones it should be cleaning up, but it does look like the message is harmless: 
a close of an already closed connection.  I made HBASE-4728 for this.

> TestAdmin hangs randomly in trunk
> -
>
> Key: HBASE-4724
> URL: https://issues.apache.org/jira/browse/HBASE-4724
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 2002_4724_TestAdmin.patch, 
> 2002_4724_TestAdmin.v2.patch
>
>
> From the logs in my env:
> {noformat}
> 2011-11-01 15:48:40,744 WARN  [Master:0;localhost,39664,1320187706355] 
> master.AssignmentManager(1471): Failed assignment of -ROOT-,,0.70236052 to 
> localhost,44046,1320187706849, trying to assign elsewhere instead; retry=1
> org.apache.hadoop.hbase.ipc.HBaseRPC$VersionMismatch: Protocol 
> org.apache.hadoop.hbase.ipc.HRegionInterface version mismatch. (client = 28, 
> server = 29)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:185)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:300)
> {noformat}
> Anyway, after this the log finishes with:
> {noformat}
> 2011-11-01 15:54:35,132 INFO  
> [Master:0;localhost,39664,1320187706355.oldLogCleaner] hbase.Chore(80): 
> Master:0;localhost,39664,1320187706355.oldLogCleaner exiting
> Process Thread Dump: Automatic Stack Trace every 60 seconds waiting on 
> Master:0;localhost,39664,1320187706355
> {noformat}
> it's in
> {noformat}
> sun.management.ThreadImpl.getThreadInfo1(Native Method)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:156)
> sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:121)
> 
> org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
> 
> org.apache.hadoop.hbase.util.Threads.threadDumpingIsAlive(Threads.java:113)
> org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:405)
> org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:408)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:616)
> 
> org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:590)
> 
> org.apache.hadoop.hbase.client.TestAdmin.tearDownAfterClass(TestAdmin.java:89)
> {noformat}
> So that's at least why adding a timeout won't help, and maybe why it does not 
> end at all. Adding a maximum retry count to Threads#threadDumpingIsAlive could help.
> I also wonder if the root cause of the hang is my modification to the WAL, 
> with some threads surprised to have updates that were not written to the WAL. 
> Here is the full stack dump:
> {noformat}
> Thread 354 (IPC Client (47) connection to localhost/127.0.0.1:52227 from 
> nkeywal):
>   State: TIMED_WAITING
>   Blocked count: 360
>   Waited count: 359
>   Stack:
> java.lang.Object.wait(Native Method)
> org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:702)
> org.apache.hadoop.ipc.Client$Connection.run(Client.java:744)
> Thread 272 (Master:0;localhost,39664,1320187706355-EventThread):
>   State: WAITING
>   Blocked count: 0
>   Waited count: 4
>   Waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@107b954b
>   Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
> Thread 271 
> (Master:0;localhost,39664,1320187706355-SendThread(localhost:21819)):
>   State: RUNNABLE
>   Blocked count: 2
>   Waited coun

[jira] [Commented] (HBASE-4719) HBase script assumes pre-Hadoop 0.21 layout of jar files

2011-11-02 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142322#comment-13142322
 ] 

stack commented on HBASE-4719:
--

Have you tried it on hbase trunk and then on an hbase with 0.21+ hadoop plugged 
in, Roman?

> HBase script assumes pre-Hadoop 0.21 layout of jar files
> 
>
> Key: HBASE-4719
> URL: https://issues.apache.org/jira/browse/HBASE-4719
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.0
>Reporter: Roman Shaposhnik
> Attachments: HBASE-4719.patch.txt
>
>
> The following in the bin/hbase:
> {noformat}
> HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" `ls 
> ${HADOOP_HOME}/hadoop-core*.jar`)
> {noformat}
> assumes a pre-21 Hadoop layout. It'll be better to dynamically account for 
> either hadoop-core* or hadoop-common*, hadoop-hdfs*, hadoop-mapreduce*





[jira] [Commented] (HBASE-4713) Raise debug level to warn on ExecutionException in HConnectionManager$HConnectionImplementation

2011-11-02 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142323#comment-13142323
 ] 

Hudson commented on HBASE-4713:
---

Integrated in HBase-0.92 #100 (See 
[https://builds.apache.org/job/HBase-0.92/100/])
HBASE-4713 Raise debug level to warn on ExecutionException in 
HConnectionManager

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java


> Raise debug level to warn on ExecutionException in 
> HConnectionManager$HConnectionImplementation
> ---
>
> Key: HBASE-4713
> URL: https://issues.apache.org/jira/browse/HBASE-4713
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Lucian George Iordache
> Fix For: 0.92.0
>
> Attachments: 4713.patch, HBASE-4713-patch.txt
>
>
> The ExecutionException is logged at debug level, and it should be logged at 
> warn. I ran into the problem in the following case:
> - hbase.rpc.timeout = 6
> - lease time on region server = 24
> - started a scan that takes more than 60 seconds on the region server ==> 
> SocketTimeoutException logged on debug
> With the log level at info, the exception was not observable on the client 
> side and it took me a while to figure out what was happening.
> See also:
> - https://issues.apache.org/jira/browse/HBASE-3154
> - 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201110.mbox/%3CCANH3+J0athaCjK-ahu-A=hrzoosjyh6s_mtpzm3_qqpfrcs...@mail.gmail.com%3E





[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142326#comment-13142326
 ] 

Todd Lipcon commented on HBASE-4716:


This was already committed, but I just got to my email for the day:
- Why does getColumnFamilyType return an enum when all we care about is 
hasMultipleColumnFamilies() (a boolean)? This makes it harder to understand. 
(This function doesn't return a type of column family.)
- Even if you use an enum, our style is not ALL_CAPS for enum class names, only 
ALL_CAPS for the values.
- Why is the new function public? It should be private.



> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Created] (HBASE-4729) Race between online altering and splitting kills the master

2011-11-02 Thread Jean-Daniel Cryans (Created) (JIRA)
Race between online altering and splitting kills the master
---

 Key: HBASE-4729
 URL: https://issues.apache.org/jira/browse/HBASE-4729
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
 Fix For: 0.92.0, 0.94.0


I was running an online alter while regions were splitting, and suddenly the 
master died and left my table half-altered (haven't restarted the master yet).

What killed the master:

{quote}
2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: 
Unexpected ZK exception creating node CLOSING
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
at 
org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
at 
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
at 
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{quote}

A znode was created because the region server was splitting the region 4 
seconds before:

{quote}
2011-11-02 17:06:40,704 INFO 
org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region 
TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
2011-11-02 17:06:40,704 DEBUG 
org.apache.hadoop.hbase.regionserver.SplitTransaction: 
regionserver:62023-0x132f043bbde0710 Creating ephemeral node for 
f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:62023-0x132f043bbde0710 Attempting to transition node 
f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
RS_ZK_REGION_SPLITTING
...
2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
regionserver:62023-0x132f043bbde0710 Successfully transitioned node 
f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
RS_ZK_REGION_SPLIT
2011-11-02 17:06:44,061 INFO 
org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
master to process the split for f7e1783e65ea8d621a4bc96ad310f101
{quote}

Now that the master is dead the region server is spewing those last two lines 
like mad.
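
The collision itself can be modelled in a few lines. This is only an illustration of the race, not a fix proposed in this report: `ZNodeRace` is a hypothetical class in which a map stands in for /hbase/unassigned, and the strict create throws where the master hit NodeExistsException.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical model of two actors creating the same unassigned node. */
final class ZNodeRace {
    // encoded region name -> state of its node under /hbase/unassigned
    private final ConcurrentMap<String, String> unassigned = new ConcurrentHashMap<>();

    /** Strict create: fails if the node already exists, as in the FATAL above. */
    void createStrict(String region, String state) {
        String existing = unassigned.putIfAbsent(region, state);
        if (existing != null) {
            throw new IllegalStateException(
                "NodeExists for " + region + " (already in state " + existing + ")");
        }
    }

    /**
     * Tolerant create: returns null if the node was created, or the state
     * already present so the caller can decide how to proceed.
     */
    String createOrGetExisting(String region, String state) {
        return unassigned.putIfAbsent(region, state);
    }
}
```

In this model the region server's SPLITTING node is created first, so the master's later attempt to create a CLOSING node for the same region hits the strict path and fails. A tolerant create would let the caller see RS_ZK_REGION_SPLITTING and skip or defer the reopen, though whether that is the right behaviour for the master is left open here.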





[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142339#comment-13142339
 ] 

Ted Yu commented on HBASE-4716:
---

When creating a method for hasMultipleColumnFamilies(), I found that I should 
deal with the possibility of familyPaths being null. So I created an enum that 
can represent three states.
I thought getColumnFamilyType() might be useful on other occasions.
For now, I can change it to private.

I will change spelling for the enum type as well.
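
A sketch of the tri-state approach with the naming style requested in the review. All names here (`BulkLoadFamilies`, `FamilyCount`, `classify`, `readLockSuffices`) are hypothetical and are not taken from the committed patch; family names are plain strings for brevity, and treating an empty list like null is an assumption.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not the code committed for HBASE-4716. */
final class BulkLoadFamilies {
    /** CamelCase type name, ALL_CAPS values. */
    enum FamilyCount { UNKNOWN, SINGLE, MULTIPLE }

    private BulkLoadFamilies() {}

    /**
     * Classifies the families named in a bulk load request.
     * A null or empty list yields UNKNOWN, the case a plain boolean cannot express.
     * Package-private here only so the example can be exercised from outside;
     * in production code it could be private, as the review suggests.
     */
    static FamilyCount classify(List<String> familyNames) {
        if (familyNames == null || familyNames.isEmpty()) {
            return FamilyCount.UNKNOWN;
        }
        Set<String> distinct = new HashSet<>(familyNames);
        return distinct.size() == 1 ? FamilyCount.SINGLE : FamilyCount.MULTIPLE;
    }

    /** Only a load known to touch exactly one family may use the read lock. */
    static boolean readLockSuffices(List<String> familyNames) {
        return classify(familyNames) == FamilyCount.SINGLE;
    }
}
```

Falling back to the write lock for UNKNOWN keeps the conservative behaviour when the input cannot be inspected, which seems to be the motivation for a third state rather than a boolean.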

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Updated] (HBASE-3898) TestSplitTransactionOnCluster broke in TRUNK

2011-11-02 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-3898:
--

Fix Version/s: 0.92.0

> TestSplitTransactionOnCluster broke in TRUNK
> 
>
> Key: HBASE-3898
> URL: https://issues.apache.org/jira/browse/HBASE-3898
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: 3898.txt
>
>
> It hangs for 15 minutes.  I see an NPE trying to split a region.  The splitKey 
> passed is null.  Looks to be a by-product of recent compaction refactorings.





[jira] [Updated] (HBASE-4719) HBase script assumes pre-Hadoop 0.21 layout of jar files

2011-11-02 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4719:
-

Fix Version/s: 0.92.0
 Assignee: Roman Shaposhnik

> HBase script assumes pre-Hadoop 0.21 layout of jar files
> 
>
> Key: HBASE-4719
> URL: https://issues.apache.org/jira/browse/HBASE-4719
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.92.0
>Reporter: Roman Shaposhnik
>Assignee: Roman Shaposhnik
> Fix For: 0.92.0
>
> Attachments: HBASE-4719.patch.txt
>
>
> The following in the bin/hbase:
> {noformat}
> HADOOPCPPATH=$(append_path "${HADOOPCPPATH}" `ls 
> ${HADOOP_HOME}/hadoop-core*.jar`)
> {noformat}
> assumes a pre-0.21 Hadoop layout. It would be better to dynamically account for 
> either hadoop-core* or hadoop-common*, hadoop-hdfs*, hadoop-mapreduce*.





[jira] [Updated] (HBASE-3903) A successful write to client write-buffer may be lost or not visible

2011-11-02 Thread Jonathan Hsieh (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-3903:
--

Fix Version/s: 0.92.0

> A successful write to client write-buffer may be lost or not visible
> 
>
> Key: HBASE-3903
> URL: https://issues.apache.org/jira/browse/HBASE-3903
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
> Environment: Any.
>Reporter: Tallat
>Assignee: Doug Meil
>Priority: Minor
>  Labels: documentation
> Fix For: 0.92.0
>
> Attachments: acid-semantics_HBASE_3903.xml.patch, 
> book_HBASE_3903.xml.patch
>
>
> A client can do a write to a client side 'write buffer' if enabled via 
> hTable.setAutoFlush(false). Now, assume a client puts value v under key k. 
> Two wrong things can happen, violating the ACID semantics of HBase given 
> at: http://hbase.apache.org/acid-semantics.html
> 1) Say the client fails immediately after the put succeeds. In this case, the 
> put will be lost, violating the durability property:
>  Any operation that returns a "success" code (eg does not throw an 
> exception) will be made durable. 
>  
> 2) Say the client issues a read for k immediately after writing k. The put 
> will be stored in the client side write buffer, while the read will go to the 
> region server, returning an older value, instead of v, violating the 
> visibility property:
> 
> When a client receives a "success" response for any mutation, that mutation
> is immediately visible to both that client and any client with whom it later
> communicates through side channels.
> 
> Thanks,
> Tallat
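
The visibility problem described in this report can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not the HBase client: WriteBufferSketch is a hypothetical class, a HashMap stands in for the region server, and the method names only mirror setAutoFlush and flushCommits for readability.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a client with a client-side write buffer (hypothetical class).
final class WriteBufferSketch {
    // Stand-in for the state held by the region server.
    private final Map<String, String> server = new HashMap<>();
    // Puts that have "succeeded" locally but have not been sent yet.
    private final List<String[]> writeBuffer = new ArrayList<>();
    private boolean autoFlush = true;

    void setAutoFlush(boolean autoFlush) {
        this.autoFlush = autoFlush;
    }

    void put(String key, String value) {
        writeBuffer.add(new String[] {key, value});
        if (autoFlush) {
            flushCommits();
        }
    }

    void flushCommits() {
        for (String[] kv : writeBuffer) {
            server.put(kv[0], kv[1]);
        }
        writeBuffer.clear();
    }

    // Reads bypass the buffer and go straight to the "server".
    String get(String key) {
        return server.get(key);
    }

    public static void main(String[] args) {
        WriteBufferSketch table = new WriteBufferSketch();
        table.setAutoFlush(false);
        table.put("k", "v");
        System.out.println("before flush: " + table.get("k"));
        table.flushCommits();
        System.out.println("after flush: " + table.get("k"));
    }
}
```

In this model a client that crashes before flushCommits() loses the buffered put (case 1), and a get() issued before the flush does not see it (case 2).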





[jira] [Updated] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4716:
--

Attachment: 4716.addendum

Addendum that addresses Todd's comments.

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.addendum, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142355#comment-13142355
 ] 

Todd Lipcon commented on HBASE-4716:


Until we have some use for the tri-state, I'd prefer that it just be a boolean. 
Tri-state makes the code harder to follow for no immediate purpose. "Because we 
might need it some day" is not a purpose IMHO.

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.addendum, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142361#comment-13142361
 ] 

Ted Yu commented on HBASE-4716:
---

When familyPaths is null, hasMultipleColumnFamilies() returns false.
Just want to confirm.

Personally I think patch v1 looks cleaner.

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.addendum, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Updated] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4716:
--

Attachment: (was: 4716.addendum)

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Updated] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4716:
--

Attachment: 4716.addendum

Removed enum in addendum.
TestLoadIncrementalHFilesSplitRecovery passes.

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.addendum, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Commented] (HBASE-4729) Race between online altering and splitting kills the master

2011-11-02 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142378#comment-13142378
 ] 

Jean-Daniel Cryans commented on HBASE-4729:
---

After the master got restarted the split was processed correctly but the table 
is still half-altered (need to see if this is going to work).

> Race between online altering and splitting kills the master
> ---
>
> Key: HBASE-4729
> URL: https://issues.apache.org/jira/browse/HBASE-4729
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
> Fix For: 0.92.0, 0.94.0
>
>
> I was running an online alter while regions were splitting, and suddenly the 
> master died and left my table half-altered (haven't restarted the master yet).
> What killed the master:
> {quote}
> 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unexpected ZK exception creating node CLOSING
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
> NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
> at 
> org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
> at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {quote}
> A znode was created because the region server was splitting the region 4 
> seconds before:
> {quote}
> 2011-11-02 17:06:40,704 INFO 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of 
> region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
> 2011-11-02 17:06:40,704 DEBUG 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: 
> regionserver:62023-0x132f043bbde0710 Creating ephemeral node for 
> f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
> 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:62023-0x132f043bbde0710 Attempting to transition node 
> f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
> RS_ZK_REGION_SPLITTING
> ...
> 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:62023-0x132f043bbde0710 Successfully transitioned node 
> f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to 
> RS_ZK_REGION_SPLIT
> 2011-11-02 17:06:44,061 INFO 
> org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the 
> master to process the split for f7e1783e65ea8d621a4bc96ad310f101
> {quote}
> Now that the master is dead the region server is spewing those last two lines 
> like mad.
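
The race in this report can be modelled in a few lines. The following is a sketch under stated assumptions, not the eventual fix: UnassignedNodeSketch is hypothetical, a ConcurrentMap stands in for the /hbase/unassigned znodes, and the "skip the region if a node already exists" policy is only one possible way for the master to react instead of aborting.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory model of the unassigned znodes; not ZooKeeper itself.
final class UnassignedNodeSketch {
    enum Outcome { CREATED_CLOSING, SKIPPED_REGION_BUSY }

    private final ConcurrentMap<String, String> unassigned = new ConcurrentHashMap<>();

    // Region server side: create the node in SPLITTING state if none exists.
    boolean startSplit(String region) {
        return unassigned.putIfAbsent(region, "RS_ZK_REGION_SPLITTING") == null;
    }

    // Master side: try to create the CLOSING node for an online alter.
    // An existing node is reported to the caller rather than treated as fatal.
    Outcome tryUnassign(String region) {
        String existing = unassigned.putIfAbsent(region, "CLOSING");
        return existing == null ? Outcome.CREATED_CLOSING : Outcome.SKIPPED_REGION_BUSY;
    }

    String stateOf(String region) {
        return unassigned.get(region);
    }

    public static void main(String[] args) {
        UnassignedNodeSketch zk = new UnassignedNodeSketch();
        String region = "f7e1783e65ea8d621a4bc96ad310f101";
        zk.startSplit(region);
        System.out.println("master unassign: " + zk.tryUnassign(region));
        System.out.println("node state: " + zk.stateOf(region));
    }
}
```

The point of the sketch is only that the second creator observes the existing node; what the master should then do with a splitting region (retry, skip, or reopen the daughters) is left open here.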





[jira] [Updated] (HBASE-1744) Thrift server to match the new java api.

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-1744:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Thrift server to match the new java api.
> 
>
> Key: HBASE-1744
> URL: https://issues.apache.org/jira/browse/HBASE-1744
> Project: HBase
>  Issue Type: Improvement
>  Components: thrift
>Reporter: Tim Sell
>Assignee: Tim Sell
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 
> 0001-thrift2-enable-usage-of-.deleteColumns-for-thrift.patch, 1744-trunk.10, 
> 1744.addendum, HBASE-1744.11.patch, HBASE-1744.2.patch, HBASE-1744.3.patch, 
> HBASE-1744.4.patch, HBASE-1744.5.patch, HBASE-1744.6.patch, 
> HBASE-1744.7.patch, HBASE-1744.8.patch, HBASE-1744.9.patch, 
> HBASE-1744.preview.1.patch, thriftexperiment.patch
>
>
> This mutateRows, etc. is a little confusing compared to the new, cleaner Java 
> client.
> Thinking of ways to make a Thrift client that is just as elegant. Something 
> like:
> void put(1:Bytes table, 2:TPut put) throws (1:IOError io)
> with:
> struct TColumn {
>   1:Bytes family,
>   2:Bytes qualifier,
>   3:i64 timestamp
> }
> struct TPut {
>   1:Bytes row,
>   2:map<TColumn, Bytes> values
> }
> This creates more verbose rpc than if the columns in TPut were just 
> map<Bytes, map<Bytes, Bytes>>, but that is harder to fit timestamps into and 
> still be intuitive from say python.
> Presumably the goal of a thrift gateway is to be easy first.





[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142382#comment-13142382
 ] 

Todd Lipcon commented on HBASE-4716:


bq. When familyPaths is null, hasMultipleColumnFamilies() returns false.

familyPaths should never be null. It is an error for a user to pass a null 
value to this RPC - the iteration later in bulkLoadHFiles (line 2831) would 
throw an NPE. So there is no need to concern ourselves with what 
hasMultipleColumnFamilies would return when passed null. Indeed we could add a 
Preconditions.checkNotNull at the top of bulkLoadHFiles.
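
A minimal sketch of that suggestion, under stated assumptions: FamilyPathsSketch and its methods are hypothetical, Map.Entry<byte[], String> stands in for the family/path pairs, and java.util.Objects.requireNonNull is used in place of Guava's Preconditions.checkNotNull so the example has no external dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper; not the code from the attached addendum.
final class FamilyPathsSketch {
    // Returns true as soon as two entries name different column families.
    // The caller guarantees familyPaths is non-null, so no null handling here.
    static boolean hasMultipleColumnFamilies(List<Map.Entry<byte[], String>> familyPaths) {
        byte[] first = null;
        for (Map.Entry<byte[], String> entry : familyPaths) {
            if (first == null) {
                first = entry.getKey();
            } else if (!Arrays.equals(first, entry.getKey())) {
                return true;
            }
        }
        return false;
    }

    // Fail fast on null at the entry point, then decide with a plain boolean.
    static boolean bulkLoadNeedsWriteLock(List<Map.Entry<byte[], String>> familyPaths) {
        Objects.requireNonNull(familyPaths, "familyPaths must not be null");
        return hasMultipleColumnFamilies(familyPaths);
    }

    // Convenience factory for the examples.
    static Map.Entry<byte[], String> entry(String family, String path) {
        return new SimpleEntry<>(family.getBytes(StandardCharsets.UTF_8), path);
    }

    public static void main(String[] args) {
        System.out.println(bulkLoadNeedsWriteLock(
                Arrays.asList(entry("cf1", "/a"), entry("cf2", "/b"))));
    }
}
```

Checking for null once at the top keeps the helper a two-state boolean, which is the simplification being argued for in this thread.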

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.addendum, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Commented] (HBASE-4725) NPE in AM#updateTimers

2011-11-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142393#comment-13142393
 ] 

Hadoop QA commented on HBASE-4725:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12501982/am.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -165 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 42 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/140//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/140//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/140//console

This message is automatically generated.

> NPE in AM#updateTimers
> --
>
> Key: HBASE-4725
> URL: https://issues.apache.org/jira/browse/HBASE-4725
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: am.txt
>
>






[jira] [Updated] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4716:
--

Attachment: (was: 4716.addendum)

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Updated] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4716:
--

Attachment: 4716.addendum

Added Preconditions and removed checking for null in 
hasMultipleColumnFamilies().

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.addendum, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Commented] (HBASE-4583) Integrate RWCC with Append and Increment operations

2011-11-02 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142398#comment-13142398
 ] 

Lars Hofhansl commented on HBASE-4583:
--

You make a good point. If people want performance they'd pass false as 
writeToWAL. Otherwise they will get correct and "slow" behavior. 

> Integrate RWCC with Append and Increment operations
> ---
>
> Key: HBASE-4583
> URL: https://issues.apache.org/jira/browse/HBASE-4583
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 4583-v2.txt, 4583-v3.txt, 4583-v4.txt, 4583.txt
>
>
> Currently Increment and Append operations do not work with RWCC and hence a 
> client could see the results of multiple such operations mixed in the same 
> Get/Scan.
> The semantics might be a bit more interesting here as upsert adds and removes 
> to and from the memstore.





[jira] [Commented] (HBASE-4716) Improve locking for single column family bulk load

2011-11-02 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142399#comment-13142399
 ] 

Todd Lipcon commented on HBASE-4716:


+1 on addendum, thanks Ted

> Improve locking for single column family bulk load
> --
>
> Key: HBASE-4716
> URL: https://issues.apache.org/jira/browse/HBASE-4716
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4716-v2.txt, 4716.addendum, 4716.txt
>
>
> HBASE-4552 changed the locking behavior for single column family bulk load, 
> namely we don't need to take write lock.
> A read lock would suffice in this scenario.





[jira] [Commented] (HBASE-4518) TestServerCustomProtocol is flaky

2011-11-02 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142400#comment-13142400
 ] 

Hadoop QA commented on HBASE-4518:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12501985/HBASE-4518.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -165 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 42 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.TestGlobalMemStoreSize
  org.apache.hadoop.hbase.master.TestMasterFailover

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/141//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/141//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/141//console

This message is automatically generated.

> TestServerCustomProtocol is flaky
> -
>
> Key: HBASE-4518
> URL: https://issues.apache.org/jira/browse/HBASE-4518
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors, test
>Affects Versions: 0.92.0
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Fix For: 0.92.0
>
> Attachments: HBASE-4518.patch, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol-output.txt, 
> org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.txt
>
>
> TestServerCustomProtocol has been showing some intermittent failures in 
> Jenkins due to what looks like region transitions.
> Here is the most recent failure:
> {noformat}
> Results :
> Failed tests:   
> testRowRange(org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol): 
> Results should contain region 
> test,bbb,1317332645939.aea9154349b9e0dc207e2e9476702763. for row 'bbb'
> {noformat}





[jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows

2011-11-02 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142405#comment-13142405
 ] 

jirapos...@reviews.apache.org commented on HBASE-4536:
--



bq.  On 2011-11-02 06:44:54, Prakash Khemani wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java,
 lines 210-212
bq.  > 
bq.  >
bq.  > Hi Lars, Isn't this early-out problematic? It doesn't take into 
account min-versions. It doesn't take into account the newly introduced 
keepDeletedCells mode.

The early out only happens when min-versions is not set.  Check out *ColumnTracker.


- Lars


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review3009
---


On 2011-10-18 21:43:38, Lars Hofhansl wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2178/
bq.  ---
bq.  
bq.  (Updated 2011-10-18 21:43:38)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  HBase timerange Gets and Scans allow you to do "timetravel" in HBase, i.e. 
look at the state of the data at any point in the past, provided the data is 
still around.
bq.  This did not work for deletes, however. Deletes would always mask all puts 
in the past.
bq.  This change adds a flag that can be set on HColumnDescriptor to enable 
retention of deleted rows.
bq.  These rows are still subject to TTL and/or VERSIONS.
bq.  
bq.  This changes the following:
bq.  1. There is a new flag on HColumnDescriptor enabling that behavior.
bq.  2. Allow gets/scans with a timerange to retrieve rows hidden by a delete 
marker, if the timerange does not include the delete marker.
bq.  3. Do not unconditionally collect all deleted rows during a compaction.
bq.  4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
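
Point 2 of the list above can be illustrated with a simplified model. This is a sketch, not the ScanQueryMatcher logic from the patch: KeepDeletedCellsSketch is hypothetical, it considers a single column with one delete marker, and it assumes the time range is half-open, [minTs, maxTs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of one column with several puts and one delete marker.
final class KeepDeletedCellsSketch {
    // Returns the put timestamps a time-range read would see.
    static List<Long> visiblePuts(List<Long> putTimestamps, long deleteTs,
                                  long minTs, long maxTs, boolean keepDeletedCells) {
        boolean deleteInRange = minTs <= deleteTs && deleteTs < maxTs;
        // Without the flag, the delete masks older puts regardless of the range.
        boolean deleteApplies = deleteInRange || !keepDeletedCells;
        List<Long> visible = new ArrayList<>();
        for (long ts : putTimestamps) {
            boolean inRange = minTs <= ts && ts < maxTs;
            boolean masked = deleteApplies && ts <= deleteTs;
            if (inRange && !masked) {
                visible.add(ts);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Long> puts = Arrays.asList(10L, 20L, 30L);
        // Delete marker at 25; the range [0, 22) ends before the marker.
        System.out.println(visiblePuts(puts, 25L, 0L, 22L, true));
        System.out.println(visiblePuts(puts, 25L, 0L, 22L, false));
    }
}
```

The model ignores TTL, VERSIONS and compactions, all of which still apply to retained cells according to the summary above.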
bq.  
bq.  The change is small'ish, but the logic is intricate, so please review 
carefully.
bq.  
bq.  
bq.  This addresses bug HBASE-4536.
bq.  https://issues.apache.org/jira/browse/HBASE-4536
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
 1185362 
bq.http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 
1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java
 PRE-CREATION 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
 1185362 
bq.
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
 1

[jira] [Updated] (HBASE-4213) Support for fault tolerant, instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) through ZK.

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4213:
--

Status: Patch Available  (was: Open)

> Support for fault tolerant, instant schema updates with out master's 
> intervention (i.e with out enable/disable and bulk assign/unassign) through 
> ZK.
> 
>
> Key: HBASE-4213
> URL: https://issues.apache.org/jira/browse/HBASE-4213
> Project: HBase
> Issue Type: Improvement
> Reporter: Subbu M Iyer
> Assignee: Subbu M Iyer
> Fix For: 0.92.0
>
> Attachments: 
> 4213-101211-Support_instant_schema_changes_through_ZK.patch, 
> 4213-102511.patch, 4213-Instant_Schema_change_through_ZK.patch, 
> 4213-Nov-2-2011_patch_.patch, 
> 4213-V10-Support_instant_schema_changes_through_ZK.patch, 
> 4213-V5-Support_instant_schema_changes_through_ZK.patch, 
> 4213-V7-Support_instant_schema_changes_through_ZK.patch, 
> 4213-V8-Support_instant_schema_changes_through_ZK.patch, 
> 4213-V9-Support_instant_schema_changes_through_ZK.patch, 4213-v9.txt, 
> 4213.v6, HBASE-4213-Instant_schema_change.patch, 
> HBASE-4213_Instant_schema_change_-Version_2_.patch, 
> HBASE_Instant_schema_change-version_3_.patch
>
>
> This Jira is a slight variation on the approach being taken in 
> https://issues.apache.org/jira/browse/HBASE-1730
> Support instant schema updates such as Modify Table, Add Column, and Modify 
> Column operations:
> 1. Without enabling/disabling the table.
> 2. Without bulk unassign/assign of regions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4213) Support for fault tolerant, instant schema updates with out master's intervention (i.e with out enable/disable and bulk assign/unassign) through ZK.

2011-11-02 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4213:
--

Attachment: 4213-Nov-2-2011_patch_.patch

Patch from Subbu for TRUNK.

> Support for fault tolerant, instant schema updates with out master's 
> intervention (i.e with out enable/disable and bulk assign/unassign) through 
> ZK.
> 
>
> Key: HBASE-4213
> URL: https://issues.apache.org/jira/browse/HBASE-4213
> Project: HBase
> Issue Type: Improvement
> Reporter: Subbu M Iyer
> Assignee: Subbu M Iyer
> Fix For: 0.92.0
>
> Attachments: 
> 4213-101211-Support_instant_schema_changes_through_ZK.patch, 
> 4213-102511.patch, 4213-Instant_Schema_change_through_ZK.patch, 
> 4213-Nov-2-2011_patch_.patch, 
> 4213-V10-Support_instant_schema_changes_through_ZK.patch, 
> 4213-V5-Support_instant_schema_changes_through_ZK.patch, 
> 4213-V7-Support_instant_schema_changes_through_ZK.patch, 
> 4213-V8-Support_instant_schema_changes_through_ZK.patch, 
> 4213-V9-Support_instant_schema_changes_through_ZK.patch, 4213-v9.txt, 
> 4213.v6, HBASE-4213-Instant_schema_change.patch, 
> HBASE-4213_Instant_schema_change_-Version_2_.patch, 
> HBASE_Instant_schema_change-version_3_.patch
>
>
> This Jira is a slight variation on the approach being taken in 
> https://issues.apache.org/jira/browse/HBASE-1730
> Support instant schema updates such as Modify Table, Add Column, and Modify 
> Column operations:
> 1. Without enabling/disabling the table.
> 2. Without bulk unassign/assign of regions.
