[jira] [Updated] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-9941:
--

   Resolution: Fixed
Fix Version/s: 0.99.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk and 0.98. Thanks Gary and Lars for the reviews and advice.

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0, 0.99.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch, 
> 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.
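For readers unfamiliar with the pattern, here is a minimal sketch of what setting the context ClassLoader around a coprocessor call looks like. This is illustrative only, not the committed patch; the class and method names below are hypothetical, not HBase APIs.

{code:java}
// Illustrative sketch only; class and method names are hypothetical, not HBase APIs.
public final class ContextClassLoaderSwitch {

  /** A stand-in for a single coprocessor hook invocation. */
  public interface Hook {
    void call() throws Exception;
  }

  /**
   * Runs a hook with the thread's context ClassLoader temporarily set to the
   * loader that loaded the coprocessor, restoring the previous loader afterwards.
   */
  public static void callWithCoprocessorLoader(ClassLoader coprocessorLoader, Hook hook)
      throws Exception {
    Thread current = Thread.currentThread();
    ClassLoader saved = current.getContextClassLoader();
    current.setContextClassLoader(coprocessorLoader);
    try {
      // Any Class.forName(name, true, Thread.currentThread().getContextClassLoader())
      // inside the hook now resolves against the coprocessor's own classpath.
      hook.call();
    } finally {
      current.setContextClassLoader(saved);  // never leak the coprocessor loader to other callers
    }
  }
}
{code}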



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862232#comment-13862232
 ] 

Andrew Purtell commented on HBASE-9941:
---

The apparent 'release audit' warnings are the issues below, which are unrelated to 
this patch; the patch only modifies existing files that already have license headers:
{noformat}
[WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
no dependency information available
[WARNING] Failed to retrieve plugin descriptor for 
org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not be 
resolved: Failed to read artifact descriptor for 
org.eclipse.m2e:lifecycle-mapping:jar:1.0.0
[WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
no dependency information available
[WARNING] Failed to retrieve plugin descriptor for 
org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not be 
resolved: Failed to read artifact descriptor for 
org.eclipse.m2e:lifecycle-mapping:jar:1.0.0
{noformat}

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch, 
> 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862226#comment-13862226
 ] 

Hudson commented on HBASE-10279:


SUCCESS: Integrated in HBase-0.94-security #378 (See 
[https://builds.apache.org/job/HBase-0.94-security/378/])
HBASE-10279 TestStore.testDeleteExpiredStoreFiles is flaky (larsh: rev 1555321)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java


> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.16
>
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862224#comment-13862224
 ] 

Hudson commented on HBASE-10279:


ABORTED: Integrated in HBase-0.94 #1250 (See 
[https://builds.apache.org/job/HBase-0.94/1250/])
HBASE-10279 TestStore.testDeleteExpiredStoreFiles is flaky (larsh: rev 1555321)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java


> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.16
>
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862223#comment-13862223
 ] 

Hudson commented on HBASE-10272:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #52 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/52/])
HBASE-10272 Cluster becomes nonoperational if the node hosting the active 
Master AND ROOT/META table goes offline (Tedyu: rev 1555313)
* 
/hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java


> Cluster becomes nonoperational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, the HBase client caches a connection failure to a server, and 
> any subsequent attempt to connect to that server throws a 
> {{FailedServerException}}.
> Now if a node which hosted the active Master AND the ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server fails with a {{NoRouteToHostException}}, which it handles, but the 
> second attempt crashes with a {{FailedServerException}}.
> Here is the log from one such occurrence:
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup masters will crash with the same error, and restarting them 
> will have the same effect. Once this happens, the cluster will remain 
> nonoperational until the node with the region server is brought back online (or 
> the ZooKeeper node containing the root region server location and/or the META 
> entry from the ROOT table is deleted).
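As a rough illustration of the caching behavior described above (a hypothetical sketch, not HBase's actual FailedServers implementation): a failed-server cache fails fast for a bounded window, so a caller that treats the fast failure as fatal, rather than retryable, aborts exactly as in the log above.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a failed-server cache; not the real org.apache.hadoop.hbase.ipc classes.
public class FailedServerCache {
  private final Map<String, Long> failures = new ConcurrentHashMap<>();
  private final long expiryMillis;

  public FailedServerCache(long expiryMillis) {
    this.expiryMillis = expiryMillis;
  }

  /** Record a connection failure (e.g. a NoRouteToHostException) for this server. */
  public void addFailure(String hostAndPort) {
    failures.put(hostAndPort, System.currentTimeMillis());
  }

  /** True if a failure was recorded less than expiryMillis ago; such calls fail fast. */
  public boolean isFailed(String hostAndPort) {
    Long failedAt = failures.get(hostAndPort);
    if (failedAt == null) {
      return false;
    }
    if (System.currentTimeMillis() - failedAt > expiryMillis) {
      failures.remove(hostAndPort);  // entry expired, allow a real connection attempt again
      return false;
    }
    return true;  // a caller that treats this as fatal, instead of retrying later, aborts
  }
}
{code}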



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1386#comment-1386
 ] 

Hudson commented on HBASE-10279:


ABORTED: Integrated in HBase-0.94-JDK7 #15 (See 
[https://builds.apache.org/job/HBase-0.94-JDK7/15/])
HBASE-10279 TestStore.testDeleteExpiredStoreFiles is flaky (larsh: rev 1555321)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java


> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.16
>
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862219#comment-13862219
 ] 

Hudson commented on HBASE-10272:


FAILURE: Integrated in HBase-0.98 #56 (See 
[https://builds.apache.org/job/HBase-0.98/56/])
HBASE-10272 Cluster becomes nonoperational if the node hosting the active 
Master AND ROOT/META table goes offline (Tedyu: rev 1555313)
* 
/hbase/branches/0.98/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java


> Cluster becomes nonoperational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, the HBase client caches a connection failure to a server, and 
> any subsequent attempt to connect to that server throws a 
> {{FailedServerException}}.
> Now if a node which hosted the active Master AND the ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server fails with a {{NoRouteToHostException}}, which it handles, but the 
> second attempt crashes with a {{FailedServerException}}.
> Here is the log from one such occurrence:
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup masters will crash with the same error, and restarting them 
> will have the same effect. Once this happens, the cluster will remain 
> nonoperational until the node with the region server is brought back online (or 
> the ZooKeeper node containing the root region server location and/or the META 
> entry from the ROOT table is deleted).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8741) Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs)

2014-01-03 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862218#comment-13862218
 ] 

Liyin Tang commented on HBASE-8741:
---

Good to know :) Not sure whether it is worth documenting that implication in 
the javadoc. Basically, latestSequenceNums might contain duplicated keys for 
the same region. In 89-fb we just use a ConcurrentSkipListMap, in case this map 
is reused for some other purpose.

Anyway, thanks for the explanation! Nice feature indeed!
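For what it's worth, here is a minimal sketch of per-region sequence ID bookkeeping with a ConcurrentSkipListMap keyed by encoded region name, which keeps one entry per region and so avoids the duplicated-keys concern above. Names are illustrative, not the patch's actual fields.

{code:java}
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: tracks the latest sequence number per region, one entry per encoded region name.
public class RegionSequenceIds {
  private final ConcurrentSkipListMap<String, AtomicLong> latestSequenceNums =
      new ConcurrentSkipListMap<>();

  /** Returns the next sequence id for the region, strictly increasing per region. */
  public long nextSequenceId(String encodedRegionName) {
    return latestSequenceNums
        .computeIfAbsent(encodedRegionName, k -> new AtomicLong(0))
        .incrementAndGet();
  }

  /** On region open, bump the counter to at least the max sequence id found in its HFiles. */
  public void adoptAtLeast(String encodedRegionName, long maxSeqIdFromStoreFiles) {
    latestSequenceNums
        .computeIfAbsent(encodedRegionName, k -> new AtomicLong(0))
        .accumulateAndGet(maxSeqIdFromStoreFiles, Math::max);
  }
}
{code}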


> Scope sequenceid to the region rather than regionserver (WAS: Mutations on 
> Regions in recovery mode might have same sequenceIDs)
> 
>
> Key: HBASE-8741
> URL: https://issues.apache.org/jira/browse/HBASE-8741
> Project: HBase
>  Issue Type: Bug
>  Components: MTTR
>Affects Versions: 0.95.1
>Reporter: Himanshu Vashishtha
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0
>
> Attachments: HBASE-8741-trunk-v6.1-rebased.patch, 
> HBASE-8741-trunk-v6.2.1.patch, HBASE-8741-trunk-v6.2.2.patch, 
> HBASE-8741-trunk-v6.2.2.patch, HBASE-8741-trunk-v6.3.patch, 
> HBASE-8741-trunk-v6.4.patch, HBASE-8741-trunk-v6.patch, HBASE-8741-v0.patch, 
> HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch, 
> HBASE-8741-v4-again.patch, HBASE-8741-v4.patch, HBASE-8741-v5-again.patch, 
> HBASE-8741-v5.patch
>
>
> Currently, when opening a region, we find the maximum sequence ID from all 
> its HFiles and then set the LogSequenceId of the log (in case the latter is at 
> a smaller value). This works well in the recovered.edits case, as we are not 
> writing to the region until we have replayed all of its previous edits. 
> With distributed log replay, if we want to enable writes while a region is 
> under recovery, we need to make sure that the logSequenceId > maximum 
> logSequenceId of the old regionserver. Otherwise, we might have a situation 
> where new edits have the same (or smaller) sequenceIds. 
> If we store region-level information in the WALTrailer, this scenario 
> could be avoided by:
> a) reading the trailer of the "last completed" file, i.e., the last wal file 
> which has a trailer, and
> b) completely reading the last wal file (this file would not have the 
> trailer, so it needs to be read in full).
> In the future, if we switch to multiple wal files, we could read the trailers 
> of all completed WAL files and fully read the remaining incomplete files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862216#comment-13862216
 ] 

Hudson commented on HBASE-10272:


SUCCESS: Integrated in HBase-TRUNK #4787 (See 
[https://builds.apache.org/job/HBase-TRUNK/4787/])
HBASE-10272 Cluster becomes nonoperational if the node hosting the active 
Master AND ROOT/META table goes offline (Tedyu: rev 1555312)
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java


> Cluster becomes nonoperational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, the HBase client caches a connection failure to a server, and 
> any subsequent attempt to connect to that server throws a 
> {{FailedServerException}}.
> Now if a node which hosted the active Master AND the ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server fails with a {{NoRouteToHostException}}, which it handles, but the 
> second attempt crashes with a {{FailedServerException}}.
> Here is the log from one such occurrence:
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup masters will crash with the same error, and restarting them 
> will have the same effect. Once this happens, the cluster will remain 
> nonoperational until the node with the region server is brought back online (or 
> the ZooKeeper node containing the root region server location and/or the META 
> entry from the ROOT table is deleted).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862211#comment-13862211
 ] 

Hadoop QA commented on HBASE-9941:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621452/9941.patch
  against trunk revision .
  ATTACHMENT ID: 12621452

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 4 release 
audit warnings (more than the trunk's current 0 warnings).

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8339//console

This message is automatically generated.

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch, 
> 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10275) [89-fb] Guarantee the sequenceID in each Region is strictly monotonic increasing

2014-01-03 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862208#comment-13862208
 ] 

Liyin Tang commented on HBASE-10275:


The problem you have described is exactly what we want to resolve. Basically, if 
the sequenceID for each region is strictly monotonically increasing, then in the 
case of a region moving from A to B, the replication stream on B would know the 
gap/lag for that region in the previous replication stream on A. 

As you mentioned, but slightly different: the fix is to guarantee that the old 
hlog entries of a region from the previous region server have been fully 
replicated before starting to replicate this region from the new region server.
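A small sketch of why a strictly monotonically increasing per-region sequence ID helps the receiving side: given the last sequence ID seen per region, any jump larger than one reveals the gap/lag left behind by the previous replication stream. Purely illustrative; the names are hypothetical.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative gap detector for a replication sink, assuming per-region sequence ids
// increase by exactly 1 for consecutive edits of the same region.
public class ReplicationGapDetector {
  private final Map<String, Long> lastSeenSeqId = new HashMap<>();

  /**
   * Records an incoming edit and returns the number of edits still missing for
   * that region (0 when the stream is contiguous).
   */
  public long onEdit(String encodedRegionName, long seqId) {
    Long last = lastSeenSeqId.get(encodedRegionName);
    long gap = (last == null || seqId <= last + 1) ? 0 : seqId - last - 1;
    lastSeenSeqId.put(encodedRegionName, last == null ? seqId : Math.max(last, seqId));
    return gap;  // edits still owed by the previous source regionserver's stream
  }
}
{code}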

> [89-fb] Guarantee the sequenceID in each Region is strictly monotonic 
> increasing
> 
>
> Key: HBASE-10275
> URL: https://issues.apache.org/jira/browse/HBASE-10275
> Project: HBase
>  Issue Type: New Feature
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> [HBASE-8741] has implemented the per-region sequence ID. It would be even 
> better to guarantee that the sequencing is strictly monotonic increasing so 
> that HLog-Based Async Replication is able to delivery transactions in order 
> in the case of region movements.  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-10279.
---

Resolution: Fixed

Committed to 0.94 only (a similar fix was added in 0.95, so all other branches 
already have it).

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10279:
--

Fix Version/s: 0.94.16

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.16
>
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862201#comment-13862201
 ] 

Lars Hofhansl commented on HBASE-10279:
---

This was fixed a while back with HBASE-6832; IncrementingEnvironmentEdge is a 
bit different in trunk.
I'll commit my patch to 0.94. Thanks for looking, [~apurtell].
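The general technique behind that trunk fix, sketched with a hypothetical injectable clock (HBase's real utilities are the EnvironmentEdge classes mentioned above; the names below are illustrative): code under test reads time from the injected clock, so a test can advance time deterministically instead of sleeping against the wall clock.

{code:java}
// Illustrative sketch of an injectable clock; not HBase's EnvironmentEdge API.
public class ManualClock {
  private long now;

  public ManualClock(long startMillis) {
    this.now = startMillis;
  }

  public synchronized long currentTimeMillis() {
    return now;
  }

  /** Tests advance time explicitly, so TTL expiry no longer depends on real sleeps. */
  public synchronized void advance(long millis) {
    now += millis;
  }

  /** Example: a TTL check that consults the injected clock instead of System.currentTimeMillis(). */
  public static boolean isExpired(ManualClock clock, long fileTimestamp, long ttlMillis) {
    return clock.currentTimeMillis() - fileTimestamp > ttlMillis;
  }
}
{code}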

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862198#comment-13862198
 ] 

Hadoop QA commented on HBASE-10263:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12621447/HBASE-10263-trunk_v1.patch
  against trunk revision .
  ATTACHMENT ID: 12621447

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 4 release 
audit warnings (more than the trunk's current 0 warnings).

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8338//console

This message is automatically generated.

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch
>
>
> Currently the single/multi/in-memory ratio in LruBlockCache is hardcoded at 
> 1:2:1, which can lead to somewhat counter-intuitive behavior in user scenarios 
> where an in-memory table's read performance is much worse than an ordinary 
> table's when the two tables' data sizes are almost equal and larger than the 
> regionserver's cache size (we did such an experiment and verified that the 
> in-memory table's random read performance is two times worse than the ordinary 
> table's).
> This patch fixes the above issue and provides:
> 1. makes the single/multi/in-memory ratio user-configurable;
> 2. provides a configurable switch which can make in-memory blocks preemptive. 
> Preemptive means that when this switch is on, an in-memory block can kick out 
> any ordinary block to make room until no ordinary block remains; when this 
> switch is off (the default), the behavior is the same as before, using the 
> single/multi/in-memory ratio to determine eviction.
> By default, the above two changes are both off and the behavior stays the same 
> as before applying this patch. It is the client's/user's choice to decide which 
> behavior to use by enabling one of these two enhancements.

[jira] [Commented] (HBASE-10275) [89-fb] Guarantee the sequenceID in each Region is strictly monotonic increasing

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862193#comment-13862193
 ] 

Feng Honghua commented on HBASE-10275:
--

For your reference, 
[HBASE-9465|https://issues.apache.org/jira/browse/HBASE-9465] describes the 
problem that there is no guarantee of serial transaction delivery to the peer in 
case of failover or region move. 
In essence, it's hard to fix unless we synchronize, on hlog push, the previous 
regionserver (or the worker regionserver that takes over hlog pushing for the 
failed regionserver) and the current hosting regionserver. Without 
synchronization, two different regionservers can push hlog entries of the same 
region at different paces. An alternative fix is to guarantee that the old hlog 
entries of a region have all been pushed to the peer before the region can be 
opened by a new regionserver.

> [89-fb] Guarantee the sequenceID in each Region is strictly monotonic 
> increasing
> 
>
> Key: HBASE-10275
> URL: https://issues.apache.org/jira/browse/HBASE-10275
> Project: HBase
>  Issue Type: New Feature
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> [HBASE-8741] has implemented the per-region sequence ID. It would be even 
> better to guarantee that the sequencing is strictly monotonic increasing so 
> that HLog-Based Async Replication is able to delivery transactions in order 
> in the case of region movements.  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10275) [89-fb] Guarantee the sequenceID in each Region is strictly monotonic increasing

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862190#comment-13862190
 ] 

Feng Honghua commented on HBASE-10275:
--

To achieve the goal of in-order (hlog) transaction delivery, we also need to 
guarantee that all the older (smaller) hlog entries on the previous regionserver 
have been successfully pushed (replicated) to the peer before the region is 
served by the new regionserver, right? Otherwise it is still possible for hlog 
entries with smaller sequenceids to be pushed to the peer by the previous 
hosting regionserver *after* the ones with greater sequenceids from the 
new/current hosting regionserver.

For region movement in the case of regionserver failover (if we deem that 
another kind of region movement, though a passive one), the hlog files 
containing un-pushed entries for the region will be handled by a regionserver 
other than the region's new hosting regionserver. In this situation, 
communication/synchronization between these two regionservers is needed to 
achieve the region's in-order transaction delivery from an overall perspective.
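A conceptual, single-process sketch of the "push everything old before the region is served" alternative; the real problem spans two regionservers and would need coordination through ZK or the master, and all names here are hypothetical:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Illustrative gate: the new host may serve a region only after the old host's hlog
// entries for it (up to lastSeqIdOnOldHost) have been pushed to the peer.
public class RegionOpenGate {
  private final Map<String, Long> pushedUpTo = new ConcurrentHashMap<>();

  /** Called by whoever drains the old regionserver's hlogs for this region. */
  public void markPushed(String encodedRegionName, long seqId) {
    pushedUpTo.merge(encodedRegionName, seqId, Long::max);
  }

  /** Polls until replication has caught up for the region, or the timeout elapses. */
  public boolean awaitCaughtUp(String encodedRegionName, long lastSeqIdOnOldHost,
      long timeoutMs) throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (System.nanoTime() < deadline) {
      Long pushed = pushedUpTo.get(encodedRegionName);
      if (pushed != null && pushed >= lastSeqIdOnOldHost) {
        return true;  // safe to open: no smaller sequence id can reach the peer later
      }
      Thread.sleep(100);
    }
    return false;
  }
}
{code}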

> [89-fb] Guarantee the sequenceID in each Region is strictly monotonic 
> increasing
> 
>
> Key: HBASE-10275
> URL: https://issues.apache.org/jira/browse/HBASE-10275
> Project: HBase
>  Issue Type: New Feature
>Reporter: Liyin Tang
>Assignee: Liyin Tang
>
> [HBASE-8741] has implemented the per-region sequence ID. It would be even 
> better to guarantee that the sequencing is strictly monotonic increasing so 
> that HLog-Based Async Replication is able to delivery transactions in order 
> in the case of region movements.  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862189#comment-13862189
 ] 

Hudson commented on HBASE-10210:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #51 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/51/])
HBASE-10210 during master startup, RS can be you-are-dead-ed by master in error 
(sershe: rev 1555302)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java


> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet; I am at the "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't see it in 0.96.0 plus some patches.
> It looks like RS information arriving from two sources - ZK and the server 
> itself - can conflict. The Master doesn't handle such cases (timestamp match), 
> and technically timestamps can collide for two separate servers anyway.
> So the master YouAreDead-s the already-recorded reporting RS, and adds it too. 
> Then it discovers that the new server has died with a fatal error!
> Note the threads.
> Addition is called from master initialization and from RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b
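A simplified illustration of the comparison at issue (hypothetical types, not the master's actual ServerManager code): two registrations for the same host:port are the same process only if their start timestamps also match, and treating that case as a conflict expires the very server that just reported in.

{code:java}
// Hypothetical sketch; not HBase's ServerManager logic.
public class ServerRegistration {
  final String host;
  final int port;
  final long startCode;  // the regionserver's start timestamp

  public ServerRegistration(String host, int port, long startCode) {
    this.host = host;
    this.port = port;
    this.startCode = startCode;
  }

  /** Same process if host, port and start timestamp all match (e.g. seen via ZK and via RPC). */
  public boolean sameProcessAs(ServerRegistration other) {
    return host.equals(other.host) && port == other.port && startCode == other.startCode;
  }

  /** A registration is stale only if a strictly newer start of the same host:port exists. */
  public boolean isStaleComparedTo(ServerRegistration other) {
    return host.equals(other.host) && port == other.port && startCode < other.startCode;
  }
}
{code}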



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-9941:
--

Attachment: 9941.patch

Latest patch removes the microbenchmark and POM changes. This is what I will 
commit soon.

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch, 
> 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862186#comment-13862186
 ] 

Ted Yu commented on HBASE-10263:


[~apurtell]:
Do you want this in 0.98 ?

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch
>
>
> Currently the single/multi/in-memory ratio in LruBlockCache is hardcoded at 
> 1:2:1, which can lead to somewhat counter-intuitive behavior in user scenarios 
> where an in-memory table's read performance is much worse than an ordinary 
> table's when the two tables' data sizes are almost equal and larger than the 
> regionserver's cache size (we did such an experiment and verified that the 
> in-memory table's random read performance is two times worse than the ordinary 
> table's).
> This patch fixes the above issue and provides:
> 1. makes the single/multi/in-memory ratio user-configurable;
> 2. provides a configurable switch which can make in-memory blocks preemptive. 
> Preemptive means that when this switch is on, an in-memory block can kick out 
> any ordinary block to make room until no ordinary block remains; when this 
> switch is off (the default), the behavior is the same as before, using the 
> single/multi/in-memory ratio to determine eviction.
> By default, the above two changes are both off and the behavior stays the same 
> as before applying this patch. It is the client's/user's choice to decide which 
> behavior to use by enabling one of these two enhancements.
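A rough sketch of the eviction-priority policy described above, with the single/multi/in-memory split made configurable and an optional force mode for in-memory blocks. Class names, parameters, and defaults are made up for illustration and are not the patch's actual configuration keys.

{code:java}
// Illustrative sketch only; class names, parameters and defaults are hypothetical.
public class LruPriorityPolicy {
  public enum Priority { SINGLE, MULTI, IN_MEMORY }

  private final float singlePct;     // 0.25 corresponds to the old hardcoded 1:2:1 split
  private final float multiPct;      // 0.50
  private final float inMemoryPct;   // 0.25
  private final boolean inMemoryForceMode;

  public LruPriorityPolicy(float singlePct, float multiPct, float inMemoryPct,
      boolean inMemoryForceMode) {
    if (Math.abs(singlePct + multiPct + inMemoryPct - 1.0f) > 0.001f) {
      throw new IllegalArgumentException("single/multi/in-memory fractions must sum to 1.0");
    }
    this.singlePct = singlePct;
    this.multiPct = multiPct;
    this.inMemoryPct = inMemoryPct;
    this.inMemoryForceMode = inMemoryForceMode;
  }

  /** Share of the total cache each priority bucket is allowed to occupy. */
  public long bucketLimit(Priority p, long totalCacheBytes) {
    switch (p) {
      case SINGLE: return (long) (totalCacheBytes * singlePct);
      case MULTI:  return (long) (totalCacheBytes * multiPct);
      default:     return (long) (totalCacheBytes * inMemoryPct);
    }
  }

  /** In force mode an in-memory block may displace ordinary blocks beyond its bucket limit. */
  public boolean mayDisplaceOrdinaryBlocks(Priority incoming, long ordinaryBytesCached) {
    return inMemoryForceMode && incoming == Priority.IN_MEMORY && ordinaryBytesCached > 0;
  }
}
{code}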



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862184#comment-13862184
 ] 

Hudson commented on HBASE-10210:


SUCCESS: Integrated in HBase-0.98 #55 (See 
[https://builds.apache.org/job/HBase-0.98/55/])
HBASE-10210 during master startup, RS can be you-are-dead-ed by master in error 
(sershe: rev 1555302)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java


> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet; I am at the "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't see it in 0.96.0 plus some patches.
> It looks like RS information arriving from two sources - ZK and the server 
> itself - can conflict. The Master doesn't handle such cases (timestamp match), 
> and technically timestamps can collide for two separate servers anyway.
> So the master YouAreDead-s the already-recorded reporting RS, and adds it too. 
> Then it discovers that the new server has died with a fatal error!
> Note the threads.
> Addition is called from master initialization and from RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862183#comment-13862183
 ] 

Andrew Purtell commented on HBASE-9941:
---

I can see caliper at test scope in the classpath when I run 'mvn 
dependency:build-classpath' on the command line, but it is not showing up in the 
generated file, which appears to be produced by running the exact same plugin 
and goal. Beats me. Maven "documentation" offers no clue. If I leave it at test 
scope, the microbenchmark will compile but cannot be run in-tree or from an 
untarred assembly. 

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10280) Make inMemoryForceMode of LruBlockCache configurable per column-family

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862181#comment-13862181
 ] 

Feng Honghua commented on HBASE-10280:
--

Thanks for the correction of its parent jira number, [~yuzhih...@gmail.com] :-)

> Make inMemoryForceMode of LruBlockCache configurable per column-family
> --
>
> Key: HBASE-10280
> URL: https://issues.apache.org/jira/browse/HBASE-10280
> Project: HBase
>  Issue Type: Improvement
>  Components: io, regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
>
> An extension of 
> [HBASE-10263|https://issues.apache.org/jira/browse/HBASE-10263] per 
> [~yuzhih...@gmail.com]'s suggestion.
> A brief description of this extension is below:
> 1. if the global inMemoryForceMode is on, all in-memory blocks are preemptive, 
> no matter what their per-column-family inMemoryForceMode is;
> 2. if the global inMemoryForceMode is off, only in-memory blocks of column 
> families with inMemoryForceMode on are preemptive; non-preemptive in-memory 
> blocks respect the single/multi/in-memory ratio;
> In short, the global flag dominates, and the per-column-family flag can only 
> control its own blocks.
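The two rules above reduce to a small predicate; here is a sketch with illustrative method and parameter names:

{code:java}
// Illustrative only: encodes the two rules from the description.
public final class InMemoryForcePolicy {
  private InMemoryForcePolicy() {}

  /**
   * @param globalForceMode  cluster-wide inMemoryForceMode switch
   * @param cfForceMode      the block's column family inMemoryForceMode setting
   * @param blockIsInMemory  whether the block belongs to an IN_MEMORY column family
   */
  public static boolean isPreemptive(boolean globalForceMode, boolean cfForceMode,
      boolean blockIsInMemory) {
    if (!blockIsInMemory) {
      return false;                        // only in-memory blocks can ever be preemptive
    }
    return globalForceMode || cfForceMode; // rule 1: global flag dominates; rule 2: per-CF opt-in
  }
}
{code}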



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8751) Enable peer cluster to choose/change the ColumnFamilies/Tables it really want to replicate from a source cluster

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862180#comment-13862180
 ] 

Feng Honghua commented on HBASE-8751:
-

The newly added tableCFs config is a per-peer attribute, much like the peer 
state which indicates whether replication for the peer is currently ON or OFF. 
The latter is one of the peer's attributes controlling the peer's behavior, and 
the newly added tableCFs is another peer attribute, controlling which data will 
be pushed (replicated) to the peer at a user-defined (finer and more accurate, 
hence more flexible) granularity. It sounds natural for this attribute to keep 
the same implementation theme as the peer state, right?

Currently, in both 0.94.x and trunk, the peer state has a permanent node in zk, 
and for the above reason I do want to align the implementation of tableCFs with 
that of the peer state. It would look weird if we implemented these two per-peer 
attributes differently. :-)

> Enable peer cluster to choose/change the ColumnFamilies/Tables it really want 
> to replicate from a source cluster
> 
>
> Key: HBASE-8751
> URL: https://issues.apache.org/jira/browse/HBASE-8751
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-8751-0.94-V0.patch, HBASE-8751-0.94-v1.patch
>
>
> Consider these scenarios (all cfs have replication-scope=1):
> 1) cluster S has 3 tables: table A has cfA,cfB, table B has cfX,cfY, table C 
> has cf1,cf2.
> 2) cluster X wants to replicate table A : cfA, table B : cfX and table C from 
> cluster S.
> 3) cluster Y wants to replicate table B : cfY, table C : cf2 from cluster S.
> The current replication implementation can't achieve this, since it pushes the 
> data of all replicatable column families from cluster S to all its peers, 
> X/Y in this scenario.
> This improvement provides a fine-grained replication scheme which enables a 
> peer cluster to choose the column families/tables it really wants from the 
> source cluster:
> A). Set the table:cf-list for a peer when addPeer:
>   hbase-shell> add_peer '3', "zk:1100:/hbase", "table1; table2:cf1,cf2; 
> table3:cf2"
> B). View the table:cf-list config for a peer using show_peer_tableCFs:
>   hbase-shell> show_peer_tableCFs "1"
> C). Change/set the table:cf-list for a peer using set_peer_tableCFs:
>   hbase-shell> set_peer_tableCFs '2', "table1:cfX; table2:cf1; table3:cf1,cf2"
> In this scheme, replication-scope=1 only means a column family CAN be 
> replicated to other clusters, but the 'table:cf-list' alone determines 
> WHICH cfs/tables will actually be replicated to a specific peer.
> For backward compatibility, an empty 'table:cf-list' will replicate all 
> replicatable cfs/tables. (This means we don't allow a peer which replicates 
> nothing from a source cluster; we think that's reasonable: if replicating 
> nothing, why bother adding a peer?)
> This improvement addresses the exact problem raised by the first FAQ in 
> "http://hbase.apache.org/replication.html":
>   "GLOBAL means replicate? Any provision to replicate only to cluster X and 
> not to cluster Y? or is that for later?
>   Yes, this is for much later."
> I also noticed somebody mentioned that making "replication-scope" an integer 
> rather than a boolean could serve such a fine-grained replication purpose, but 
> I think extending "replication-scope" can't achieve the same granularity and 
> flexibility as the per-peer replication configuration above.
> This improvement has been running smoothly in our production clusters 
> (Xiaomi) for several months.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10280) Make inMemoryForceMode of LruBlockCache configurable per column-family

2014-01-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862175#comment-13862175
 ] 

Ted Yu commented on HBASE-10280:


The above description is good.

> Make inMemoryForceMode of LruBlockCache configurable per column-family
> --
>
> Key: HBASE-10280
> URL: https://issues.apache.org/jira/browse/HBASE-10280
> Project: HBase
>  Issue Type: Improvement
>  Components: io, regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
>
> An extension of 
> [HBASE-10263|https://issues.apache.org/jira/browse/HBASE-10263] per 
> [~yuzhih...@gmail.com]'s suggestion.
> brief description of this extension is as below:
> 1. if the global inMemoryForceMode is on, all in-memory blocks are preemptive 
> no matter what their per-column-family inMemoryForceMode are;
> 2. if the global inMemoryForceMode is off, only in-memory blocks of 
> column-family with inMemoryForceMode on are preemptive; non-preemptive 
> inMemory blocks respect the single/multi/memory ratio;
> In short, global flag dominates, and per-column-family flag can only control 
> its own blocks.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10280) Make inMemoryForceMode of LruBlockCache configurable per column-family

2014-01-03 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10280:
---

Description: 
An extension of [HBASE-10263|https://issues.apache.org/jira/browse/HBASE-10263] 
per [~yuzhih...@gmail.com]'s suggestion.

brief description of this extension is as below:
1. if the global inMemoryForceMode is on, all in-memory blocks are preemptive 
no matter what their per-column-family inMemoryForceMode are;
2. if the global inMemoryForceMode is off, only in-memory blocks of 
column-family with inMemoryForceMode on are preemptive; non-preemptive inMemory 
blocks respect the single/multi/memory ratio;

In short, global flag dominates, and per-column-family flag can only control 
its own blocks.

  was:
An extension of [HBASE-10273|https://issues.apache.org/jira/browse/HBASE-10263] 
per [~yuzhih...@gmail.com]'s suggestion.

brief description of this extension is as below:
1. if the global inMemoryForceMode is on, all in-memory blocks are preemptive 
no matter what their per-column-family inMemoryForceMode are;
2. if the global inMemoryForceMode is off, only in-memory blocks of 
column-family with inMemoryForceMode on are preemptive; non-preemptive inMemory 
blocks respect the single/multi/memory ratio;

In short, global flag dominates, and per-column-family flag can only control 
its own blocks.


> Make inMemoryForceMode of LruBlockCache configurable per column-family
> --
>
> Key: HBASE-10280
> URL: https://issues.apache.org/jira/browse/HBASE-10280
> Project: HBase
>  Issue Type: Improvement
>  Components: io, regionserver
>Reporter: Feng Honghua
>Assignee: Feng Honghua
>
> An extension of 
> [HBASE-10263|https://issues.apache.org/jira/browse/HBASE-10263] per 
> [~yuzhih...@gmail.com]'s suggestion.
> brief description of this extension is as below:
> 1. if the global inMemoryForceMode is on, all in-memory blocks are preemptive 
> no matter what their per-column-family inMemoryForceMode are;
> 2. if the global inMemoryForceMode is off, only in-memory blocks of 
> column-family with inMemoryForceMode on are preemptive; non-preemptive 
> inMemory blocks respect the single/multi/memory ratio;
> In short, global flag dominates, and per-column-family flag can only control 
> its own blocks.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10272:
---

Fix Version/s: 0.99.0
   0.98.0
 Hadoop Flags: Reviewed

Integrated to 0.98 and trunk.

Thanks for the reviews.
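
For readers skimming the thread: the failure mode in the quoted description 
below is that the new master's second connection attempt hits the client-side 
failed-servers cache and the master aborts on {{FailedServerException}}. A 
minimal sketch of the general defensive pattern (retry a connection attempt 
instead of treating the cached failure as fatal); this is only an illustration 
of the idea with made-up names, not the committed patch:
{code:java}
import java.util.concurrent.Callable;

public class RetryOnFailedServer {
  /**
   * Retries the given action (assumes maxAttempts >= 1), sleeping between
   * attempts, when the failure looks like a cached "failed server" error.
   * The string matching here is deliberately simplistic; the real fix is in
   * the patch attached to this issue.
   */
  public static <T> T callWithRetries(Callable<T> action, int maxAttempts,
      long sleepMillis) throws Exception {
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.call();
      } catch (Exception e) {
        last = e;
        String msg = String.valueOf(e.getMessage());
        // Only a cached connection failure is considered retryable here.
        if (!msg.contains("failed servers list")) {
          throw e;
        }
        Thread.sleep(sleepMillis);
      }
    }
    throw last;
  }
}
{code}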

> Cluster becomes nonoperational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, HBase client caches a connection failure to a server and 
> any subsequent attempt to connect to the server throws a 
> {{FailedServerException}}
> Now if a node which hosted the active Master AND ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server will fail with {{NoRouteToHostException}} which it handles but 
> since on second attempt crashes with {{FailedServerException}}
> Here is the log from one such occurance
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup master will crash with same error and restarting them will 
> have the same effect. Once this happens, the cluster will remain 
> in-operational until the node with region server is brought online (or the 
> Zookeeper node containing the root region server and/or META entry from the 
> ROOT table is deleted).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10272) Cluster becomes nonoperational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10272:
---

Summary: Cluster becomes nonoperational if the node hosting the active 
Master AND ROOT/META table goes offline  (was: Cluster becomes in-operational 
if the node hosting the active Master AND ROOT/META table goes offline)

> Cluster becomes nonoperational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, HBase client caches a connection failure to a server and 
> any subsequent attempt to connect to the server throws a 
> {{FailedServerException}}
> Now if a node which hosted the active Master AND ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server will fail with {{NoRouteToHostException}} which it handles but 
> since on second attempt crashes with {{FailedServerException}}
> Here is the log from one such occurance
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup master will crash with same error and restarting them will 
> have the same effect. Once this happens, the cluster will remain 
> in-operational until the node with region server is brought online (or the 
> Zookeeper node containing the root region server and/or META entry from the 
> ROOT table is deleted).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862170#comment-13862170
 ] 

Hadoop QA commented on HBASE-9941:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621442/9941.patch
  against trunk revision .
  ATTACHMENT ID: 12621442

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 4 release 
audit warnings (more than the trunk's current 0 warnings).

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8337//console

This message is automatically generated.

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862169#comment-13862169
 ] 

Andrew Purtell commented on HBASE-9941:
---

Actually, I agree caliper shouldn't come in except when running tests. Let me 
figure out some Maven way to do what I want while having caliper at test scope.

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862165#comment-13862165
 ] 

Hudson commented on HBASE-10210:


SUCCESS: Integrated in HBase-TRUNK #4786 (See 
[https://builds.apache.org/job/HBase-TRUNK/4786/])
HBASE-10210 during master startup, RS can be you-are-dead-ed by master in error 
(sershe: rev 1555275)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java


> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet, I am at "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't in 0.96.0 + some patches.
> It looks like RS information arriving from 2 sources - ZK and server itself, 
> can conflict. Master doesn't handle such cases (timestamp match), and anyway 
> technically timestamps can collide for two separate servers.
> So, master YouAreDead-s the already-recorded reporting RS, and adds it too. 
> Then it discovers that the new server has died with fatal error!
> Note the threads.
> Addition is called from master initialization and from RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9280) Integration tests should use compression.

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862166#comment-13862166
 ] 

Andrew Purtell commented on HBASE-9280:
---

Wait, we have 
http://svn.apache.org/repos/asf/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/ChangeCompressionAction.java.
 Close this issue?

> Integration tests should use compression.
> -
>
> Key: HBASE-9280
> URL: https://issues.apache.org/jira/browse/HBASE-9280
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.98.0
>Reporter: Elliott Clark
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862163#comment-13862163
 ] 

Feng Honghua commented on HBASE-10263:
--

[~yuzhih...@gmail.com], 
[HBASE-10280|https://issues.apache.org/jira/browse/HBASE-10280] has been created 
per your suggestion. Please check its description to see if the 
per-column-family behavior matches your expectation, thanks again :-)
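
To make the proposed per-column-family semantics concrete (the global flag 
dominates; the per-CF flag only governs its own blocks), here is a tiny 
hypothetical sketch of the decision. The class and parameter names are made up 
for illustration; the real logic would live in LruBlockCache:
{code:java}
public class InMemoryForcePolicy {
  /**
   * Decides whether an in-memory block may preempt (evict) ordinary blocks.
   *
   * @param globalForceMode  cluster-wide inMemoryForceMode switch
   * @param familyForceMode  per-column-family inMemoryForceMode flag
   *                         (hypothetical; this is what HBASE-10280 proposes)
   * @param blockIsInMemory  whether the block belongs to an IN_MEMORY family
   */
  public static boolean isPreemptive(boolean globalForceMode,
      boolean familyForceMode, boolean blockIsInMemory) {
    if (!blockIsInMemory) {
      // Ordinary blocks always respect the single/multi/memory ratio.
      return false;
    }
    // Global flag dominates: when it is on, every in-memory block is preemptive.
    // When it is off, only blocks of families that opted in are preemptive.
    return globalForceMode || familyForceMode;
  }
}
{code}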

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch
>
>
> currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 
> 1:2:1, which can lead to somewhat counter-intuition behavior for some user 
> scenario where in-memory table's read performance is much worse than ordinary 
> table when two tables' data size is almost equal and larger than 
> regionserver's cache size (we ever did some such experiment and verified that 
> in-memory table random read performance is two times worse than ordinary 
> table).
> this patch fixes above issue and provides:
> 1. make single/multi/in-memory ratio user-configurable
> 2. provide a configurable switch which can make in-memory block preemptive, 
> by preemptive means when this switch is on in-memory block can kick out any 
> ordinary block to make room until no ordinary block, when this switch is off 
> (by default) the behavior is the same as previous, using 
> single/multi/in-memory ratio to determine evicting.
> by default, above two changes are both off and the behavior keeps the same as 
> before applying this patch. it's client/user's choice to determine whether or 
> which behavior to use by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Feng Honghua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua updated HBASE-10263:
-

Component/s: io

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>  Components: io
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch
>
>
> currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 
> 1:2:1, which can lead to somewhat counter-intuition behavior for some user 
> scenario where in-memory table's read performance is much worse than ordinary 
> table when two tables' data size is almost equal and larger than 
> regionserver's cache size (we ever did some such experiment and verified that 
> in-memory table random read performance is two times worse than ordinary 
> table).
> this patch fixes above issue and provides:
> 1. make single/multi/in-memory ratio user-configurable
> 2. provide a configurable switch which can make in-memory block preemptive, 
> by preemptive means when this switch is on in-memory block can kick out any 
> ordinary block to make room until no ordinary block, when this switch is off 
> (by default) the behavior is the same as previous, using 
> single/multi/in-memory ratio to determine evicting.
> by default, above two changes are both off and the behavior keeps the same as 
> before applying this patch. it's client/user's choice to determine whether or 
> which behavior to use by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862162#comment-13862162
 ] 

Andrew Purtell commented on HBASE-9941:
---

bq. Only comment is can the caliper dependency be {{test}}?  
Don't want to ship it as a dependency if it's only for test code.  

If I make it a test-only dependency then this happens:
{noformat}
./bin/hbase org.apache.hadoop.hbase.CoprocessorInvocationEvaluation --trials 10
Exception in thread "main" java.lang.NoClassDefFoundError: 
com/google/caliper/SimpleBenchmark
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Caused by: java.lang.ClassNotFoundException: com.google.caliper.SimpleBenchmark
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247
{noformat}

I will commit this shortly to trunk and 0.98 if no objections.
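
Since the thread has drifted into Maven scopes, a reminder of what the patch 
itself is about: the coprocessor host needs to save and restore the thread 
context ClassLoader around every call into a coprocessor, not just around 
{{start}}. A minimal illustration of that save/restore pattern (hypothetical 
helper, not the committed code):
{code:java}
public class ContextClassLoaderSwitch {
  /**
   * Runs the given action with the supplied ClassLoader installed as the
   * thread context ClassLoader, restoring the previous one afterwards.
   * In HBase the loader passed in would be the coprocessor's
   * CoprocessorClassLoader; here it is just any ClassLoader.
   */
  public static void runWithClassLoader(ClassLoader cl, Runnable action) {
    Thread current = Thread.currentThread();
    ClassLoader previous = current.getContextClassLoader();
    current.setContextClassLoader(cl);
    try {
      action.run();
    } finally {
      // Always restore, even if the coprocessor call throws.
      current.setContextClassLoader(previous);
    }
  }
}
{code}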

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10280) Make inMemoryForceMode of LruBlockCache configurable per column-family

2014-01-03 Thread Feng Honghua (JIRA)
Feng Honghua created HBASE-10280:


 Summary: Make inMemoryForceMode of LruBlockCache configurable per 
column-family
 Key: HBASE-10280
 URL: https://issues.apache.org/jira/browse/HBASE-10280
 Project: HBase
  Issue Type: Improvement
  Components: io, regionserver
Reporter: Feng Honghua
Assignee: Feng Honghua


An extension of [HBASE-10273|https://issues.apache.org/jira/browse/HBASE-10263] 
per [~yuzhih...@gmail.com]'s suggestion.

brief description of this extension is as below:
1. if the global inMemoryForceMode is on, all in-memory blocks are preemptive 
no matter what their per-column-family inMemoryForceMode are;
2. if the global inMemoryForceMode is off, only in-memory blocks of 
column-family with inMemoryForceMode on are preemptive; non-preemptive inMemory 
blocks respect the single/multi/memory ratio;

In short, global flag dominates, and per-column-family flag can only control 
its own blocks.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862157#comment-13862157
 ] 

Feng Honghua commented on HBASE-10263:
--

[~yuzhih...@gmail.com]
bq.Is there plan to make inMemoryForceMode column-family config ?
==> hmmm...sounds reasonable and feasible, but I'm not sure providing such 
finer-grained control for this flag is desirable for users. Let me create a new 
jira for it, and I will implement it if there is demand or someone asks for it, 
thanks for the suggestion :-)

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch
>
>
> currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 
> 1:2:1, which can lead to somewhat counter-intuition behavior for some user 
> scenario where in-memory table's read performance is much worse than ordinary 
> table when two tables' data size is almost equal and larger than 
> regionserver's cache size (we ever did some such experiment and verified that 
> in-memory table random read performance is two times worse than ordinary 
> table).
> this patch fixes above issue and provides:
> 1. make single/multi/in-memory ratio user-configurable
> 2. provide a configurable switch which can make in-memory block preemptive, 
> by preemptive means when this switch is on in-memory block can kick out any 
> ordinary block to make room until no ordinary block, when this switch is off 
> (by default) the behavior is the same as previous, using 
> single/multi/in-memory ratio to determine evicting.
> by default, above two changes are both off and the behavior keeps the same as 
> before applying this patch. it's client/user's choice to determine whether or 
> which behavior to use by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862153#comment-13862153
 ] 

Feng Honghua commented on HBASE-10263:
--

[~stack]
bq.I would suggest that this behavior be ON by default in a major release of 
hbase (0.98 if @apurtell is amenable or 1.0.0 if not); to me, the way this 
patch is more the 'expected' behavior.
==> the single/multi/memory ratio is by default the same as before (without any 
tweak): 25%:50%:25%, but users can change it by setting the new 
configurations. The 'inMemoryForceMode' (preemptive mode for in-memory blocks) 
is OFF by default. You want 'inMemoryForceMode' ON by default? hmmm. How about 
we first stay conservative by keeping it OFF by default, and turn it on later if 
we find that most of our users end up enabling it for their real workloads :-)
At least we now give users a new option to control what 'in-memory' 
cached blocks mean and how they behave, and when it's off users can configure 
the single/multi/memory ratios.
Opinion?
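
To illustrate the two knobs being discussed (a user-configurable 
single/multi/in-memory ratio, plus an optional preemptive mode for in-memory 
blocks), here is a small hypothetical sketch of how such configuration could be 
read and exposed. The configuration key names and the class are made up for 
illustration and are not taken from the attached patch:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class CachePriorityConfig {
  private final float singleFactor;
  private final float multiFactor;
  private final float memoryFactor;
  private final boolean inMemoryForceMode;

  public CachePriorityConfig(Configuration conf) {
    // Hypothetical keys; the defaults mirror the hardcoded 1:2:1 split (25%/50%/25%).
    this.singleFactor = conf.getFloat("hbase.lru.blockcache.single.percentage", 0.25f);
    this.multiFactor = conf.getFloat("hbase.lru.blockcache.multi.percentage", 0.50f);
    this.memoryFactor = conf.getFloat("hbase.lru.blockcache.memory.percentage", 0.25f);
    this.inMemoryForceMode = conf.getBoolean("hbase.lru.rs.inmemoryforcemode", false);
  }

  /** Bytes reserved for each bucket, given the cache's total capacity. */
  public long singleBytes(long capacity) { return (long) (capacity * singleFactor); }
  public long multiBytes(long capacity)  { return (long) (capacity * multiFactor); }
  public long memoryBytes(long capacity) { return (long) (capacity * memoryFactor); }

  /**
   * When force mode is on, in-memory blocks may evict ordinary (single/multi)
   * blocks until none remain; otherwise all three buckets respect their ratios.
   */
  public boolean inMemoryBlocksArePreemptive() {
    return inMemoryForceMode;
  }
}
{code}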

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch
>
>
> currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 
> 1:2:1, which can lead to somewhat counter-intuition behavior for some user 
> scenario where in-memory table's read performance is much worse than ordinary 
> table when two tables' data size is almost equal and larger than 
> regionserver's cache size (we ever did some such experiment and verified that 
> in-memory table random read performance is two times worse than ordinary 
> table).
> this patch fixes above issue and provides:
> 1. make single/multi/in-memory ratio user-configurable
> 2. provide a configurable switch which can make in-memory block preemptive, 
> by preemptive means when this switch is on in-memory block can kick out any 
> ordinary block to make room until no ordinary block, when this switch is off 
> (by default) the behavior is the same as previous, using 
> single/multi/in-memory ratio to determine evicting.
> by default, above two changes are both off and the behavior keeps the same as 
> before applying this patch. it's client/user's choice to determine whether or 
> which behavior to use by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862152#comment-13862152
 ] 

Gary Helmling commented on HBASE-9941:
--

+1 on the latest patch.

Only comment is can the caliper dependency be {{test}}?  Don't 
want to ship it as a dependency if it's only for test code.  Assuming that 
works, I'm fine to fix that bit on commit.

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862151#comment-13862151
 ] 

Lars Hofhansl commented on HBASE-10279:
---

Arghh. Should've looked there first.

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time, if there is 
> a blip on the machine running the test, first compaction might be delayed 
> enough in order to compact away multiple of the files, and have the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862149#comment-13862149
 ] 

Andrew Purtell commented on HBASE-10279:


Looks like trunk already has a change like the patch on this issue.

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time, if there is 
> a blip on the machine running the test, first compaction might be delayed 
> enough in order to compact away multiple of the files, and have the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Feng Honghua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua updated HBASE-10263:
-

Attachment: HBASE-10263-trunk_v1.patch

New patch removing the unused variables in CacheConfig.java per 
[~yuzhih...@gmail.com]'s review, thanks [~yuzhih...@gmail.com] :-)

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch
>
>
> currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 
> 1:2:1, which can lead to somewhat counter-intuition behavior for some user 
> scenario where in-memory table's read performance is much worse than ordinary 
> table when two tables' data size is almost equal and larger than 
> regionserver's cache size (we ever did some such experiment and verified that 
> in-memory table random read performance is two times worse than ordinary 
> table).
> this patch fixes above issue and provides:
> 1. make single/multi/in-memory ratio user-configurable
> 2. provide a configurable switch which can make in-memory block preemptive, 
> by preemptive means when this switch is on in-memory block can kick out any 
> ordinary block to make room until no ordinary block, when this switch is off 
> (by default) the behavior is the same as previous, using 
> single/multi/in-memory ratio to determine evicting.
> by default, above two changes are both off and the behavior keeps the same as 
> before applying this patch. it's client/user's choice to determine whether or 
> which behavior to use by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862146#comment-13862146
 ] 

Feng Honghua commented on HBASE-10263:
--

They are not used in CacheConfig.java; they are read from the conf in the 
constructor and are indeed used in LruBlockCache.java.

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch
>
>
> currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 
> 1:2:1, which can lead to somewhat counter-intuition behavior for some user 
> scenario where in-memory table's read performance is much worse than ordinary 
> table when two tables' data size is almost equal and larger than 
> regionserver's cache size (we ever did some such experiment and verified that 
> in-memory table random read performance is two times worse than ordinary 
> table).
> this patch fixes above issue and provides:
> 1. make single/multi/in-memory ratio user-configurable
> 2. provide a configurable switch which can make in-memory block preemptive, 
> by preemptive means when this switch is on in-memory block can kick out any 
> ordinary block to make room until no ordinary block, when this switch is off 
> (by default) the behavior is the same as previous, using 
> single/multi/in-memory ratio to determine evicting.
> by default, above two changes are both off and the behavior keeps the same as 
> before applying this patch. it's client/user's choice to determine whether or 
> which behavior to use by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862144#comment-13862144
 ] 

Feng Honghua commented on HBASE-10263:
--

[~yuzhih...@gmail.com]:
bq.Is the above variable used(inMemoryForceMode) ?
==> No, they (together with the single/multi/memory factors) are not used. There 
is a historical reason for these variables: this flag (and the other 3 factors) 
are read from the *conf* passed as a parameter to the LruBlockCache constructor. 
In 0.94.3 (our internal branch) there is an INFO log for max-size before 
constructing the LruBlockCache, and I added the 
'forceMode/single/multi/memory' info to that INFO log as well; they are used 
just for info purposes. But that INFO log in CacheConfig.java doesn't exist in 
trunk (it was removed), and I forgot to remove these four just-for-info 
variables accordingly. *It won't affect correctness*. Thanks for pointing this 
out :-)

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch
>
>
> currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 
> 1:2:1, which can lead to somewhat counter-intuition behavior for some user 
> scenario where in-memory table's read performance is much worse than ordinary 
> table when two tables' data size is almost equal and larger than 
> regionserver's cache size (we ever did some such experiment and verified that 
> in-memory table random read performance is two times worse than ordinary 
> table).
> this patch fixes above issue and provides:
> 1. make single/multi/in-memory ratio user-configurable
> 2. provide a configurable switch which can make in-memory block preemptive, 
> by preemptive means when this switch is on in-memory block can kick out any 
> ordinary block to make room until no ordinary block, when this switch is off 
> (by default) the behavior is the same as previous, using 
> single/multi/in-memory ratio to determine evicting.
> by default, above two changes are both off and the behavior keeps the same as 
> before applying this patch. it's client/user's choice to determine whether or 
> which behavior to use by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862141#comment-13862141
 ] 

Andrew Purtell commented on HBASE-10279:


I assume you pinged me to pick this up for trunk [~lhofhansl], so that's what I 
will do. :-)

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time, if there is 
> a blip on the machine running the test, first compaction might be delayed 
> enough in order to compact away multiple of the files, and have the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10272) Cluster becomes in-operational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862139#comment-13862139
 ] 

Andrew Purtell commented on HBASE-10272:


+1

> Cluster becomes in-operational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, HBase client caches a connection failure to a server and 
> any subsequent attempt to connect to the server throws a 
> {{FailedServerException}}
> Now if a node which hosted the active Master AND ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server will fail with {{NoRouteToHostException}} which it handles but 
> since on second attempt crashes with {{FailedServerException}}
> Here is the log from one such occurance
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup master will crash with same error and restarting them will 
> have the same effect. Once this happens, the cluster will remain 
> in-operational until the node with region server is brought online (or the 
> Zookeeper node containing the root region server and/or META entry from the 
> ROOT table is deleted).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10272) Cluster becomes in-operational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862137#comment-13862137
 ] 

Ted Yu commented on HBASE-10272:


+1

[~apurtell]:
Do you want this in 0.98 ?

> Cluster becomes in-operational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, HBase client caches a connection failure to a server and 
> any subsequent attempt to connect to the server throws a 
> {{FailedServerException}}
> Now if a node which hosted the active Master AND ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server will fail with {{NoRouteToHostException}} which it handles but 
> since on second attempt crashes with {{FailedServerException}}
> Here is the log from one such occurance
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup master will crash with same error and restarting them will 
> have the same effect. Once this happens, the cluster will remain 
> in-operational until the node with region server is brought online (or the 
> Zookeeper node containing the root region server and/or META entry from the 
> ROOT table is deleted).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reassigned HBASE-10279:
-

Assignee: Lars Hofhansl

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862133#comment-13862133
 ] 

Lars Hofhansl commented on HBASE-10279:
---

[~apurtell], FYI

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10279:
--

Attachment: 10279-0.94.txt

Patch for 0.94. Uses EnvironmentEdge instead.
The change to Store.java is not needed, but good to have.
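
As a rough illustration of the approach (not the attached patch): the test can 
inject a manually controlled clock via HBase's EnvironmentEdgeManager and 
ManualEnvironmentEdge test utilities, so file expiry no longer depends on real 
elapsed time. The surrounding test logic below is only a sketch, and the exact 
method names may vary slightly between branches.
{code}
import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
import org.apache.hadoop.hbase.util.ManualEnvironmentEdge;

public class ManualClockSketch {
  public static void main(String[] args) {
    // Drive "time" from the test instead of sleeping on the wall clock.
    ManualEnvironmentEdge clock = new ManualEnvironmentEdge();
    clock.setValue(System.currentTimeMillis());
    EnvironmentEdgeManager.injectEdge(clock);
    try {
      // ... create store files here, then age them past their TTL deterministically:
      clock.incValue(2000); // advance the injected clock by 2 seconds per file
      // ... trigger a compaction and assert that the expired files were removed ...
    } finally {
      EnvironmentEdgeManager.reset(); // restore the default edge for other tests
    }
  }
}
{code}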

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10279:
--

Attachment: 10279-0.94.txt

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic

2014-01-03 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862131#comment-13862131
 ] 

Feng Honghua commented on HBASE-5923:
-

[~lhofhansl]: Sounds good

> Cleanup checkAndXXX logic
> -
>
> Key: HBASE-5923
> URL: https://issues.apache.org/jira/browse/HBASE-5923
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, regionserver
>Reporter: Lars Hofhansl
>  Labels: noob
> Attachments: 5923-0.94.txt, 5923-trunk.txt, HBASE-10262-trunk_v0.patch
>
>
> 1. The checkAnd{Put|Delete} method that takes a CompareOp is not exposed via 
> HTable[Interface].
> 2. There is unnecessary duplicate code in the check{Put|Delete} paths in 
> HRegionServer.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10279:
--

Attachment: (was: 10279-0.94.txt)

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 10279-0.94.txt
>
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862129#comment-13862129
 ] 

Lars Hofhansl commented on HBASE-10279:
---

Or better, use the EnvironmentEdge correctly.

> TestStore.testDeleteExpiredStoreFiles is flaky
> --
>
> Key: HBASE-10279
> URL: https://issues.apache.org/jira/browse/HBASE-10279
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is 
> a blip on the machine running the test, the first compaction might be delayed 
> enough to compact away several of the files and make the test fail.
> The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10273) AssignmentManager.regions(region to regionserver assignment map) and AssignmentManager.servers(regionserver to regions assignment map) are not always updated in tandem w

2014-01-03 Thread Feng Honghua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua updated HBASE-10273:
-

Attachment: HBASE-10273-0.94_v1.patch

New patch per [~lhofhansl]'s feedback is attached; thanks, [~lhofhansl]

> AssignmentManager.regions(region to regionserver assignment map) and 
> AssignmentManager.servers(regionserver to regions assignment map) are not 
> always updated in tandem with each other
> ---
>
> Key: HBASE-10273
> URL: https://issues.apache.org/jira/browse/HBASE-10273
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.16
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Fix For: 0.94.16
>
> Attachments: HBASE-10273-0.94_v0.patch, HBASE-10273-0.94_v1.patch
>
>
> By definition, AssignmentManager.servers and AssignmentManager.regions are 
> tied and should be updated in tandem with each other under a lock on 
> AssignmentManager.regions, but there are two places where this protocol is 
> broken.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-9941:
--

Attachment: 9941.patch

Updated patch fixes the part Gary pointed out, adds handling of throwables to 
WALObserver upcalls, adds classloader context setup for 
RegionServerCoprocessorHost (missed it previously), and fixes a javadoc nit.
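
The general pattern being applied is to swap in the coprocessor's class loader 
around each upcall and restore the previous loader afterwards. The wrapper below 
is only a sketch with hypothetical names; the actual patch modifies the 
coprocessor host classes themselves.
{code}
// Sketch of the save/set/restore pattern around a coprocessor upcall.
public final class ContextClassLoaderSketch {
  public static void invokeWithClassLoader(ClassLoader cpClassLoader, Runnable upcall) {
    Thread current = Thread.currentThread();
    ClassLoader previous = current.getContextClassLoader();
    try {
      current.setContextClassLoader(cpClassLoader); // coprocessor code sees its own loader
      upcall.run();                                 // the actual pre/post hook invocation
    } finally {
      current.setContextClassLoader(previous);      // always restore the caller's loader
    }
  }
}
{code}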

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch, 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10279) TestStore.testDeleteExpiredStoreFiles is flaky

2014-01-03 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-10279:
-

 Summary: TestStore.testDeleteExpiredStoreFiles is flaky
 Key: HBASE-10279
 URL: https://issues.apache.org/jira/browse/HBASE-10279
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


TestStore.testDeleteExpiredStoreFiles relies on wall clock time. If there is a 
blip on the machine running the test, the first compaction might be delayed 
enough to compact away several of the files and make the test fail.

The simplest fix is to just double the time given from 1s/file to 2s/file.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9977) Define C interface of HBase Client Asynchronous APIs

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862100#comment-13862100
 ] 

Hudson commented on HBASE-9977:
---

SUCCESS: Integrated in HBase-TRUNK #4785 (See 
[https://builds.apache.org/job/HBase-TRUNK/4785/])
HBASE-9977 Define C interface of HBase Client Asynchronous APIs (eclark: rev 
1555272)
* /hbase/trunk/hbase-native-client
* /hbase/trunk/hbase-native-client/.gitignore
* /hbase/trunk/hbase-native-client/CMakeLists.txt
* /hbase/trunk/hbase-native-client/README.md
* /hbase/trunk/hbase-native-client/bin
* /hbase/trunk/hbase-native-client/bin/build-all.sh
* /hbase/trunk/hbase-native-client/bin/build-thirdparty.sh
* /hbase/trunk/hbase-native-client/bin/download-thirdparty.sh
* /hbase/trunk/hbase-native-client/bin/hbase-client-env.sh
* /hbase/trunk/hbase-native-client/cmake_modules
* /hbase/trunk/hbase-native-client/cmake_modules/FindGTest.cmake
* /hbase/trunk/hbase-native-client/cmake_modules/FindLibEv.cmake
* /hbase/trunk/hbase-native-client/src
* /hbase/trunk/hbase-native-client/src/async
* /hbase/trunk/hbase-native-client/src/async/CMakeLists.txt
* /hbase/trunk/hbase-native-client/src/async/get-test.cc
* /hbase/trunk/hbase-native-client/src/async/hbase_admin.cc
* /hbase/trunk/hbase-native-client/src/async/hbase_admin.h
* /hbase/trunk/hbase-native-client/src/async/hbase_client.cc
* /hbase/trunk/hbase-native-client/src/async/hbase_client.h
* /hbase/trunk/hbase-native-client/src/async/hbase_connection.cc
* /hbase/trunk/hbase-native-client/src/async/hbase_connection.h
* /hbase/trunk/hbase-native-client/src/async/hbase_errno.h
* /hbase/trunk/hbase-native-client/src/async/hbase_get.cc
* /hbase/trunk/hbase-native-client/src/async/hbase_get.h
* /hbase/trunk/hbase-native-client/src/async/hbase_mutations.cc
* /hbase/trunk/hbase-native-client/src/async/hbase_mutations.h
* /hbase/trunk/hbase-native-client/src/async/hbase_result.cc
* /hbase/trunk/hbase-native-client/src/async/hbase_result.h
* /hbase/trunk/hbase-native-client/src/async/hbase_scanner.cc
* /hbase/trunk/hbase-native-client/src/async/hbase_scanner.h
* /hbase/trunk/hbase-native-client/src/async/mutations-test.cc
* /hbase/trunk/hbase-native-client/src/core
* /hbase/trunk/hbase-native-client/src/core/CMakeLists.txt
* /hbase/trunk/hbase-native-client/src/core/admin.cc
* /hbase/trunk/hbase-native-client/src/core/admin.h
* /hbase/trunk/hbase-native-client/src/core/client.cc
* /hbase/trunk/hbase-native-client/src/core/client.h
* /hbase/trunk/hbase-native-client/src/core/connection.cc
* /hbase/trunk/hbase-native-client/src/core/connection.h
* /hbase/trunk/hbase-native-client/src/core/connection_attr.h
* /hbase/trunk/hbase-native-client/src/core/delete.cc
* /hbase/trunk/hbase-native-client/src/core/delete.h
* /hbase/trunk/hbase-native-client/src/core/get.cc
* /hbase/trunk/hbase-native-client/src/core/get.h
* /hbase/trunk/hbase-native-client/src/core/hbase_connection_attr.cc
* /hbase/trunk/hbase-native-client/src/core/hbase_connection_attr.h
* /hbase/trunk/hbase-native-client/src/core/hbase_macros.h
* /hbase/trunk/hbase-native-client/src/core/hbase_types.h
* /hbase/trunk/hbase-native-client/src/core/mutation.cc
* /hbase/trunk/hbase-native-client/src/core/mutation.h
* /hbase/trunk/hbase-native-client/src/core/put.cc
* /hbase/trunk/hbase-native-client/src/core/put.h
* /hbase/trunk/hbase-native-client/src/core/scanner.cc
* /hbase/trunk/hbase-native-client/src/core/scanner.h
* /hbase/trunk/hbase-native-client/src/rpc
* /hbase/trunk/hbase-native-client/src/rpc/CMakeLists.txt
* /hbase/trunk/hbase-native-client/src/sync
* /hbase/trunk/hbase-native-client/src/sync/CMakeLists.txt
* /hbase/trunk/hbase-native-client/src/sync/hbase_admin.cc
* /hbase/trunk/hbase-native-client/src/sync/hbase_admin.h
* /hbase/trunk/hbase-native-client/src/sync/hbase_connection.cc
* /hbase/trunk/hbase-native-client/src/sync/hbase_connection.h


> Define C interface of HBase Client Asynchronous APIs
> 
>
> Key: HBASE-9977
> URL: https://issues.apache.org/jira/browse/HBASE-9977
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 0.99.0
>
> Attachments: HBASE-9977-0.patch, HBASE-9977-1.patch, 
> HBASE-9977-2.patch, HBASE-9977-3.patch, HBASE-9977-4.patch, HBASE-9977-5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10278) Provide better write predictability

2014-01-03 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-10278:


Attachment: Multiwaldesigndoc.pdf

> Provide better write predictability
> ---
>
> Key: HBASE-10278
> URL: https://issues.apache.org/jira/browse/HBASE-10278
> Project: HBase
>  Issue Type: New Feature
>Reporter: Himanshu Vashishtha
> Attachments: Multiwaldesigndoc.pdf
>
>
> Currently, HBase has one WAL per region server. 
> Whenever there is any latency in the write pipeline (for whatever reason, such 
> as a n/w blip, a node in the pipeline having a bad disk, etc.), the overall 
> write latency suffers. 
> Jonathan Hsieh and I analyzed various approaches to tackle this issue. We 
> also looked at HBASE-5699, which talks about adding concurrent multi WALs. 
> Along with performance numbers, we also focused on design simplicity, 
> minimum impact on MTTR & Replication, and compatibility with 0.96 and 0.98. 
> Considering all these parameters, we propose a new HLog implementation with 
> WAL Switching functionality.
> Please find the design doc attached. It introduces the WAL Switching feature 
> and the experiments/results of a prototype implementation showing the benefits 
> of this feature.
> The second goal of this work is to serve as a building block for the 
> concurrent multiple WALs feature.
> Please review the doc.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10278) Provide better write predictability

2014-01-03 Thread Himanshu Vashishtha (JIRA)
Himanshu Vashishtha created HBASE-10278:
---

 Summary: Provide better write predictability
 Key: HBASE-10278
 URL: https://issues.apache.org/jira/browse/HBASE-10278
 Project: HBase
  Issue Type: New Feature
Reporter: Himanshu Vashishtha


Currently, HBase has one WAL per region server. 
Whenever there is any latency in the write pipeline (for whatever reason, such 
as a n/w blip, a node in the pipeline having a bad disk, etc.), the overall 
write latency suffers. 

Jonathan Hsieh and I analyzed various approaches to tackle this issue. We also 
looked at HBASE-5699, which talks about adding concurrent multi WALs. Along 
with performance numbers, we also focused on design simplicity, minimum impact 
on MTTR & Replication, and compatibility with 0.96 and 0.98. Considering all 
these parameters, we propose a new HLog implementation with WAL Switching 
functionality.

Please find the design doc attached. It introduces the WAL Switching feature 
and the experiments/results of a prototype implementation showing the benefits 
of this feature.
The second goal of this work is to serve as a building block for the concurrent 
multiple WALs feature.

Please review the doc.
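
Not taken from the design doc, but as a rough sketch of what "WAL switching" 
means in practice: keep a standby writer and route appends to it when the active 
pipeline's sync latency crosses a threshold. All names and the threshold below 
are hypothetical.
{code}
// Hypothetical sketch of switching between two log writers on high sync latency.
public class SwitchingWalSketch {
  interface LogWriter {
    void append(byte[] entry) throws Exception;
    long lastSyncLatencyMs();
  }

  private final LogWriter[] writers;
  private final long latencyThresholdMs;
  private volatile int active = 0;

  public SwitchingWalSketch(LogWriter primary, LogWriter standby, long latencyThresholdMs) {
    this.writers = new LogWriter[] { primary, standby };
    this.latencyThresholdMs = latencyThresholdMs;
  }

  public void append(byte[] entry) throws Exception {
    LogWriter w = writers[active];
    if (w.lastSyncLatencyMs() > latencyThresholdMs) {
      active = 1 - active; // switch to the other pipeline while the slow one recovers
      w = writers[active];
    }
    w.append(entry);
  }
}
{code}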



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10272) Cluster becomes in-operational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862095#comment-13862095
 ] 

Hadoop QA commented on HBASE-10272:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621405/HBASE-10272.patch
  against trunk revision .
  ATTACHMENT ID: 12621405

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 4 release 
audit warnings (more than the trunk's current 0 warnings).

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8336//console

This message is automatically generated.

> Cluster becomes in-operational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, the HBase client caches a connection failure to a server, and 
> any subsequent attempt to connect to that server throws a 
> {{FailedServerException}}.
> Now if a node which hosted the active Master AND the ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server fails with a {{NoRouteToHostException}}, which it handles, but 
> the second attempt crashes with a {{FailedServerException}}.
> Here is the log from one such occurrence:
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop

[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862091#comment-13862091
 ] 

Lars Hofhansl commented on HBASE-5923:
--

Patch looks good. Unless somebody desperately wants this in 0.94/0.96/0.98, 
let's just change this in trunk. 
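
As a sketch of what exposing point 1 of the description below on 
HTable[Interface] could look like (the method shape is illustrative only, not 
the committed API):
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;

// Illustrative interface fragment; parameter names and order are assumptions.
public interface CheckAndMutateSketch {
  // Atomically checks row/family/qualifier against compareOp/value, then applies the put.
  boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier,
      CompareOp compareOp, byte[] value, Put put) throws IOException;
}
{code}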

> Cleanup checkAndXXX logic
> -
>
> Key: HBASE-5923
> URL: https://issues.apache.org/jira/browse/HBASE-5923
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, regionserver
>Reporter: Lars Hofhansl
>  Labels: noob
> Attachments: 5923-0.94.txt, 5923-trunk.txt, HBASE-10262-trunk_v0.patch
>
>
> 1. The checkAnd{Put|Delete} method that takes a CompareOp is not exposed via 
> HTable[Interface].
> 2. There is unnecessary duplicate code in the check{Put|Delete} paths in 
> HRegionServer.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8751) Enable peer cluster to choose/change the ColumnFamilies/Tables it really want to replicate from a source cluster

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862089#comment-13862089
 ] 

Lars Hofhansl commented on HBASE-8751:
--

Yeah, no persistent state in ZK... Although we already broke that anyway: we 
store the state of the replication queue there, and if we blow that away we 
will lose (not replicate) data in the slave cluster.
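
To make the per-peer table:cf config format described below concrete, here is a 
small parser sketch for strings like "table1; table2:cf1,cf2; table3:cf2". The 
helper is hypothetical and not taken from the attached patch.
{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the "table1; table2:cf1,cf2" per-peer config format.
public class TableCFsParserSketch {
  // Returns table -> list of column families; an empty list means "all replicatable CFs".
  public static Map<String, List<String>> parse(String spec) {
    Map<String, List<String>> tableCFs = new HashMap<String, List<String>>();
    if (spec == null || spec.trim().isEmpty()) {
      return tableCFs; // empty spec: replicate everything that is replicatable
    }
    for (String entry : spec.split(";")) {
      entry = entry.trim();
      if (entry.isEmpty()) {
        continue;
      }
      String[] parts = entry.split(":");
      List<String> cfs = new ArrayList<String>();
      if (parts.length > 1) {
        for (String cf : parts[1].split(",")) {
          cfs.add(cf.trim());
        }
      }
      tableCFs.put(parts[0].trim(), cfs);
    }
    return tableCFs;
  }
}
{code}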

> Enable peer cluster to choose/change the ColumnFamilies/Tables it really want 
> to replicate from a source cluster
> 
>
> Key: HBASE-8751
> URL: https://issues.apache.org/jira/browse/HBASE-8751
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-8751-0.94-V0.patch, HBASE-8751-0.94-v1.patch
>
>
> Consider scenarios (all cf are with replication-scope=1):
> 1) cluster S has 3 tables, table A has cfA,cfB, table B has cfX,cfY, table C 
> has cf1,cf2.
> 2) cluster X wants to replicate table A : cfA, table B : cfX and table C from 
> cluster S.
> 3) cluster Y wants to replicate table B : cfY, table C : cf2 from cluster S.
> The current replication implementation can't achieve this, since it pushes the 
> data of all the replicatable column-families from cluster S to all its peers, 
> X/Y in this scenario.
> This improvement provides a fine-grained replication scheme which enables a 
> peer cluster to choose the column-families/tables it really wants from the 
> source cluster:
> A). Set the table:cf-list for a peer when addPeer:
>   hbase-shell> add_peer '3', "zk:1100:/hbase", "table1; table2:cf1,cf2; 
> table3:cf2"
> B). View the table:cf-list config for a peer using show_peer_tableCFs:
>   hbase-shell> show_peer_tableCFs "1"
> C). Change/set the table:cf-list for a peer using set_peer_tableCFs:
>   hbase-shell> set_peer_tableCFs '2', "table1:cfX; table2:cf1; table3:cf1,cf2"
> In this scheme, replication-scope=1 only means a column-family CAN be 
> replicated to other clusters, but only the 'table:cf-list' determines 
> WHICH cf/table will actually be replicated to a specific peer.
> To provide backward compatibility, an empty 'table:cf-list' will replicate all 
> replicatable cfs/tables. (This means we don't allow a peer which replicates 
> nothing from a source cluster; we think that's reasonable: if it replicates 
> nothing, why bother adding a peer?)
> This improvement addresses the exact problem raised  by the first FAQ in 
> "http://hbase.apache.org/replication.html":
>   "GLOBAL means replicate? Any provision to replicate only to cluster X and 
> not to cluster Y? or is that for later?
>   Yes, this is for much later."
> I also noticed somebody mentioned making "replication-scope" an integer rather 
> than a boolean for such fine-grained replication purposes, but I think 
> extending "replication-scope" can't achieve the same replication granularity 
> and flexibility as the per-peer replication configuration above.
> This improvement has been running smoothly in our production clusters 
> (Xiaomi) for several months.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10249) Intermittent TestReplicationSyncUpTool failure

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862082#comment-13862082
 ] 

Lars Hofhansl commented on HBASE-10249:
---

bq. sorry about all the troubles

That's what the tests are for :)
Replication is in principle asynchronous; it might still just be an issue with 
the test.
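
Since replication is asynchronous, one common way to harden such a test is to 
poll with a deadline instead of asserting after a fixed sleep. This is a generic 
sketch, not the test's actual code.
{code}
import java.util.concurrent.Callable;

// Generic poll-until-true helper for asserting on asynchronous replication state.
public class WaitForSketch {
  public static void waitFor(long timeoutMs, long intervalMs, Callable<Boolean> condition)
      throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (Boolean.TRUE.equals(condition.call())) {
        return; // e.g. the row count on the slave finally matches the source
      }
      Thread.sleep(intervalMs);
    }
    throw new AssertionError("condition not met within " + timeoutMs + " ms");
  }
}
{code}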

> Intermittent TestReplicationSyncUpTool failure
> --
>
> Key: HBASE-10249
> URL: https://issues.apache.org/jira/browse/HBASE-10249
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Demai Ni
> Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0
>
> Attachments: HBASE-10249-trunk-v0.patch
>
>
> New issue to keep track of this.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10264) [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862081#comment-13862081
 ] 

Hudson commented on HBASE-10264:


SUCCESS: Integrated in hbase-0.96-hadoop2 #169 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/169/])
HBASE-10264 CompactionTool in mapred mode is missing classes in its classpath 
(Himanshu Vashishtha) (ndimiduk: rev 1555183)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java


> [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath
> --
>
> Key: HBASE-10264
> URL: https://issues.apache.org/jira/browse/HBASE-10264
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, mapreduce
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Aleksandr Shulman
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBase-10264.patch
>
>
> Calling o.a.h.h.regionserver.CompactionTool fails due to classpath-related 
> issues in both MRv1 and MRv2.
> {code}hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
> -major hdfs://`hostname`:8020/hbase/data/default/orig_1388179858868{code}
> Results:
> {code}2013-12-27 13:31:49,478 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1388179525649_0011_m_00_2, Status : FAILED
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.TableInfoMissingException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:115)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:231)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:207)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160){code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10277) refactor AsyncProcess

2014-01-03 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-10277:


 Summary: refactor AsyncProcess
 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


AsyncProcess currently has two patterns of usage: one from HTable flush, w/o 
callback and with reuse, and one from the HCM/HTable batch call, with callback 
and w/o reuse. In the former case (but not the latter), it also does some 
throttling of actions on the initial submit call, limiting the number of 
outstanding actions per server.
The latter case is relatively straightforward. The former appears to be 
error-prone due to reuse: if, as the javadoc claims should be safe, multiple 
submit calls are performed without waiting for the async part of the previous 
call to finish, fields like hasError become ambiguous and can be used for the 
wrong call; the callback for success/failure is called based on the "original 
index" of an action in the submitted list, but with only one callback supplied 
to AP in the ctor it's not clear which submit call the index belongs to, if 
several are outstanding.

I was going to add support for HBASE-10070 to AP, and found that it might be 
difficult to do cleanly.

It would be nice to normalize AP usage patterns; in particular, to separate the 
"global" part (load tracking) from the per-submit-call part.
The per-submit part can more conveniently track things like initialActions, the 
mapping of indexes, and retry information that is currently passed around the 
method calls.
I am not sure yet, but maybe sending the original index to the server in 
"ClientProtos.MultiAction" can also be avoided.




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Reopened] (HBASE-10249) Intermittent TestReplicationSyncUpTool failure

2014-01-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reopened HBASE-10249:



Let's reopen the issue.

> Intermittent TestReplicationSyncUpTool failure
> --
>
> Key: HBASE-10249
> URL: https://issues.apache.org/jira/browse/HBASE-10249
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Demai Ni
> Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0
>
> Attachments: HBASE-10249-trunk-v0.patch
>
>
> New issue to keep track of this.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862015#comment-13862015
 ] 

Andrew Purtell commented on HBASE-10210:


+1 for 0.98

> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet; I am at the "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't in 0.96.0 + some patches.
> It looks like RS information arriving from 2 sources - ZK and the server 
> itself - can conflict. The Master doesn't handle such cases (timestamp match), 
> and anyway technically timestamps can collide for two separate servers.
> So, the master YouAreDead-s the already-recorded reporting RS, and adds it too. 
> Then it discovers that the new server has died with a fatal error!
> Note the threads.
> Addition is called from master initialization and from RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10272) Cluster becomes in-operational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-10272:
---

Affects Version/s: 0.96.1
   Status: Patch Available  (was: Open)

Submitting to HadoopQA.

> Cluster becomes in-operational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.94.15, 0.96.1
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, the HBase client caches a connection failure to a server, and 
> any subsequent attempt to connect to that server throws a 
> {{FailedServerException}}.
> Now if a node which hosted the active Master AND the ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server fails with a {{NoRouteToHostException}}, which it handles, but 
> the second attempt crashes with a {{FailedServerException}}.
> Here is the log from one such occurrence:
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup masters will crash with the same error, and restarting them 
> will have the same effect. Once this happens, the cluster will remain 
> in-operational until the node with the region server is brought online (or the 
> Zookeeper node containing the root region server and/or the META entry from 
> the ROOT table is deleted).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10249) Intermittent TestReplicationSyncUpTool failure

2014-01-03 Thread Demai Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862012#comment-13862012
 ] 

Demai Ni commented on HBASE-10249:
--

Too bad this is a tough one. The debug info shows that the data at the 
source is correct. I need to re-examine the logic of both the test case and the 
syncUp tool. 

sorry about all the troubles. 

> Intermittent TestReplicationSyncUpTool failure
> --
>
> Key: HBASE-10249
> URL: https://issues.apache.org/jira/browse/HBASE-10249
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Demai Ni
> Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0
>
> Attachments: HBASE-10249-trunk-v0.patch
>
>
> New issue to keep track of this.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10272) Cluster becomes in-operational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-10272:
---

Attachment: HBASE-10272.patch

Patch for trunk

> Cluster becomes in-operational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.96.1, 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Attachments: HBASE-10272.patch, HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, the HBase client caches a connection failure to a server, and 
> any subsequent attempt to connect to that server throws a 
> {{FailedServerException}}.
> Now if a node which hosted the active Master AND the ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server fails with a {{NoRouteToHostException}}, which it handles, but 
> the second attempt crashes with a {{FailedServerException}}.
> Here is the log from one such occurrence:
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup masters will crash with the same error, and restarting them 
> will have the same effect. Once this happens, the cluster will remain 
> in-operational until the node with the region server is brought online (or the 
> Zookeeper node containing the root region server and/or the META entry from 
> the ROOT table is deleted).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862007#comment-13862007
 ] 

Sergey Shelukhin commented on HBASE-10210:
--

[~stack] should this also be in 96 branch?

> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet; I am at the "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't in 0.96.0 + some patches.
> It looks like RS information arriving from 2 sources - ZK and the server 
> itself - can conflict. The Master doesn't handle such cases (timestamp match), 
> and anyway technically timestamps can collide for two separate servers.
> So, the master YouAreDead-s the already-recorded reporting RS, and adds it too. 
> Then it discovers that the new server has died with a fatal error!
> Note the threads.
> Addition is called from master initialization and from RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10276) Hook hbase-native-client up to Native profile.

2014-01-03 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-10276:
--

Description: 
{code}mvn clean package -Pnative{code}
 should build the new hbase-native-client module.

  was:
mvn clean package -Pnative
 should build the new hbase-native-client module.


> Hook hbase-native-client up to Native profile.
> --
>
> Key: HBASE-10276
> URL: https://issues.apache.org/jira/browse/HBASE-10276
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, Client
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>
> {code}mvn clean package -Pnative{code}
>  should build the new hbase-native-client module.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8912) [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861999#comment-13861999
 ] 

Lars Hofhansl commented on HBASE-8912:
--

Yeah, it would be nice if the AM could retain a history of assignments of a 
region and avoid retrying the same RS over and over; it should also do 
per-region rate limiting. Too risky to add this to 0.94, though.

As for the warning... You are probably right. The warning might still be an 
indication of double assignments (i.e. the region was OPEN already as far as 
the AM was concerned and yet it got another OPENED message from ZK).
I think in 0.94 we should leave the warning in, in case we see more issues here 
in the future. In 0.96+ it's not an issue anyway.
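
A rough sketch of the kind of per-region assignment history described above 
(hypothetical names, and not something proposed for 0.94):
{code}
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

// Hypothetical per-region history of recent assignment attempts, usable to skip
// a server that was just tried and to rate-limit reassignments of one region.
public class AssignmentHistorySketch {
  private static final int MAX_HISTORY = 5;
  private final Map<String, LinkedList<String>> recentServersByRegion =
      new HashMap<String, LinkedList<String>>();

  public synchronized void recordAttempt(String regionName, String serverName) {
    LinkedList<String> history = recentServersByRegion.get(regionName);
    if (history == null) {
      history = new LinkedList<String>();
      recentServersByRegion.put(regionName, history);
    }
    history.addFirst(serverName);
    if (history.size() > MAX_HISTORY) {
      history.removeLast();
    }
  }

  // True if this server was among the last few attempts for the region.
  public synchronized boolean triedRecently(String regionName, String serverName) {
    LinkedList<String> history = recentServersByRegion.get(regionName);
    return history != null && history.contains(serverName);
  }
}
{code}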


> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to 
> OFFLINE
> --
>
> Key: HBASE-8912
> URL: https://issues.apache.org/jira/browse/HBASE-8912
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.16
>
> Attachments: 8912-0.94-alt2.txt, 8912-0.94.txt, 8912-fix-race.txt, 
> HBASE-8912.patch, HBase-0.94 #1036 test - testRetrying [Jenkins].html, 
> log.txt, org.apache.hadoop.hbase.catalog.TestMetaReaderEditor-output.txt
>
>
> AM throws this exception which subsequently causes the master to abort: 
> {code}
> java.lang.IllegalStateException: Unexpected state : 
> testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. 
> state=PENDING_OPEN, ts=1372891751912, 
> server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>   at 
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>   at java.lang.Thread.run(Thread.java:662)
> {code}
> This exception trace is from the failing test TestMetaReaderEditor which is 
> failing pretty frequently, but looking at the test code, I think this is not 
> a test-only issue, but affects the main code path. 
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10273) AssignmentManager.regions(region to regionserver assignment map) and AssignmentManager.servers(regionserver to regions assignment map) are not always updated in tandem

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861987#comment-13861987
 ] 

Lars Hofhansl commented on HBASE-10273:
---

Do we need the warning here:
{code}
+  if (sn == null) {
+LOG.warn("No region server for " + region);
{code}

Could serverRegions be null?
{code}
+Set<HRegionInfo> serverRegions = this.servers.get(sn);
+if (!serverRegions.remove(region)) {
+  LOG.warn("No " + region + " on " + sn);
{code}

I guess, I'd prefer:
{code}
-  this.regions.remove(region);
+  ServerName sn = this.regions.remove(region);
+  if (sn != null) {
+Set<HRegionInfo> serverRegions = this.servers.get(sn);
+if (serverRegions == null || !serverRegions.remove(region)) {
+  LOG.warn("No " + region + " on " + sn);
+}
+  }
{code}


> AssignmentManager.regions(region to regionserver assignment map) and 
> AssignmentManager.servers(regionserver to regions assignment map) are not 
> always updated in tandem with each other
> ---
>
> Key: HBASE-10273
> URL: https://issues.apache.org/jira/browse/HBASE-10273
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.16
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Fix For: 0.94.16
>
> Attachments: HBASE-10273-0.94_v0.patch
>
>
> By definition, AssignmentManager.servers and AssignmentManager.regions are 
> tied and should be updated in tandem with each other under a lock on 
> AssignmentManager.regions, but there are two places where this protocol is 
> broken.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (HBASE-8912) [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE

2014-01-03 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861925#comment-13861925
 ] 

Jean-Marc Spaggiari edited comment on HBASE-8912 at 1/3/14 10:54 PM:
-

After the first restart, 36 regions are stuck in transition :( But no server 
crashed.

What I did:
- Restored the default balancer to make sure as many regions as possible would move.
- Stop/start HBase
- Run balancer from shell.

Everything is back up after a 2nd restart.

I got many errors like this one:
{code}
2014-01-03 16:03:03,958 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received FAILED_OPEN for region b75cb9067c3c4456d6198c9237c143b3 from server 
node4.domain.com,60020,1388782921790 but region was in  the state 
page,rf.idua.www\x1Fhttp\x1F-1\x1F/fr/brand/fr/audi_fleet_solutions/contact/contact_transport_personnes.html\x1Fnull,1379103792232.b75cb9067c3c4456d6198c9237c143b3.
 state=CLOSED, ts=1388782983373, server=node4.domain.com,60020,1388782921790 
and not in OFFLINE, PENDING_OPEN or OPENING
{code}

After investigating, I figured out that Snappy was missing on a server. I fixed 
that and restarted: all seems to be fine. So I restored my customized balancer, 
restarted, and balanced.

Still some warnings in the logs:
{code}
2014-01-03 16:21:52,864 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region db8e67acde26bf340da481d3c1b934cd from server 
node4.domain.com,60020,1388784051197 but region was in  the state 
page,moc.tenretnigruoboc.www\x1Fhttp\x1F-1\x1F/cobourg-and-the-web\x1Fnull,1379103844627.db8e67acde26bf340da481d3c1b934cd.
 state=OPEN, ts=1388784100392, server=node4.domain.com,60020,1388784051197 and 
not in expected OFFLINE, PENDING_OPEN or OPENING states
{code}

But this time all the regions are assigned correctly.

I did that one more time (change balancer, stop, start, balance; change 
balancer, stop, start, balance). I turned the log level to warn.

{code}
2014-01-03 16:28:51,142 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region 17bee313797fc1ce982c0e31fdb6620c from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
page,rf.ofniecnarf.www\x1Fhttp\x1F-1\x1F/vote/comment/27996/1/vote/zero_vote/c99b0992e5a9cd6bf3a4cfc91769ceeb\x1Fnull,1379104524006.17bee313797fc1ce982c0e31fdb6620c.
 state=OPEN, ts=1388784531048, server=node8.domain.com,60020,1388784498327 and 
not in expected OFFLINE, PENDING_OPEN or OPENING states
2014-01-03 16:28:52,135 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region 6dc6290df1855b319f60bf89faa3da41 from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
page_crc,\x00\x00\x00\x00\xD7\xD9\x97\x8Bvideo.k-wreview.ca,1378042601904.6dc6290df1855b319f60bf89faa3da41.
 state=OPEN, ts=1388784531793, server=node8.domain.com,60020,1388784498327 and 
not in expected OFFLINE, PENDING_OPEN or OPENING states
2014-01-03 16:28:52,712 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region ec4f96b6cedd935aeba279b15d5337af from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
work_proposed,\x98\xBF\xAF\x90\x00\x00\x00\x00http://feedproxy.google.com/~r/WheatWeeds/~3/Of24fZKcpco/the-eighth-day-of-christmas.html,1378975430143.ec4f96b6cedd935aeba279b15d5337af.
 state=OPEN, ts=1388784532540, server=node8.domain.com,60020,1388784498327 and 
not in expected OFFLINE, PENDING_OPEN or OPENING states
2014-01-03 16:28:52,747 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region 4f823b5de664556a89cbd86aa41cd0b0 from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
work_proposed,\x8D4K\xEA\x00\x00\x00\x00http://twitter.com/home?status=CartoonStock%3A++http%3A%2F%2Fwww%2Ecartoonstock%2Ecom%2Fdirectory%2Fc%2Fcream%5Ftea%5Fgifts%2Easp,1378681682935.4f823b5de664556a89cbd86aa41cd0b0.
 state=OPEN, ts=1388784532552, server=node8.domain.com,60020,1388784498327 and 
not in expected OFFLINE, PENDING_OPEN or OPENING states
2014-01-03 16:28:53,244 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region da0bd0a6b7187f731fb34d4ac14ca279 from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
work_proposed,\xB2\xE6\xB6\xBB\x00\x00\x00\x00http://www.canpages.ca/page/QC/notre-dame-des-prairies/concept-beton-design/4550984.html,1378737981443.da0bd0a6b7187f731fb34d4ac14ca279.
 state=OPEN, ts=1388784533203, server=node8.domain.com,60020,1388784498327 and 
not in expected OFFLINE, PENDING_OPEN or OPENING states
{code}

But everything finally got assigned without any restart required, and pretty 
quickly.

Logs from the last run:
{code}
2014-01-03 16:32:20,252 WARN org.apache.hadoop.ipc.HBaseServer: 
(responseTooSlow): {"processingtimems":10969,"call":"balance(), rpc version=1, 
client version=29, 
methodsFingerPrint=1

[jira] [Created] (HBASE-10276) Hook hbase-native-client up to Native profile.

2014-01-03 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-10276:
-

 Summary: Hook hbase-native-client up to Native profile.
 Key: HBASE-10276
 URL: https://issues.apache.org/jira/browse/HBASE-10276
 Project: HBase
  Issue Type: Sub-task
  Components: build, Client
Reporter: Elliott Clark
Assignee: Elliott Clark


mvn clean package -Pnative
 should build the new hbase-native-client module.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8912) [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE

2014-01-03 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861982#comment-13861982
 ] 

Jean-Marc Spaggiari commented on HBASE-8912:


Yep, all 36 regions belonged to a server with Snappy issues. It would have been 
nice to see them re-assigned successfully somewhere else. I can reproduce that 
easily; I just need to remove the snappy .so file...

Also, regarding the warning, if I strip out the noise it boils down to this:
2014-01-03 16:32:21,278 WARN AssignmentManager: Received OPENED for region 
0...a from server node1  but region was in  the state  state=OPEN,  and not in 
expected OFFLINE, PENDING_OPEN or OPENING states

If we get OPENED and the region is OPEN, can we not simply discard the warning? 
That means the region is fine: we got a request to open it and it's already done. 
So why should we worry? ;)
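
To make the suggestion concrete, here is a self-contained toy sketch (stand-in types, not the real AssignmentManager or RegionState classes) of the proposed guard: skip the WARN when the OPENED event comes from the very server that the in-memory state already records as hosting the open region.
{code}
/** Self-contained illustration with stand-in types; not the real AssignmentManager. */
public class OpenedEventGuard {

  enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN }

  static class RegionStateStub {
    final String encodedName;
    final State state;
    final String server;
    RegionStateStub(String encodedName, State state, String server) {
      this.encodedName = encodedName;
      this.state = state;
      this.server = server;
    }
  }

  /** True when the OPENED event is redundant and the WARN can be skipped. */
  static boolean isRedundantOpened(RegionStateStub region, String reportingServer) {
    return region != null
        && region.state == State.OPEN
        && reportingServer.equals(region.server);
  }

  public static void main(String[] args) {
    RegionStateStub r = new RegionStateStub(
        "db8e67acde26bf340da481d3c1b934cd", State.OPEN, "node4,60020,1388784051197");
    // Same server already hosts the open region: nothing to warn about.
    System.out.println(isRedundantOpened(r, "node4,60020,1388784051197")); // true
    // A different server claims it opened the region: still worth a WARN.
    System.out.println(isRedundantOpened(r, "node8,60020,1388784498327")); // false
  }
}
{code}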

> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to 
> OFFLINE
> --
>
> Key: HBASE-8912
> URL: https://issues.apache.org/jira/browse/HBASE-8912
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.16
>
> Attachments: 8912-0.94-alt2.txt, 8912-0.94.txt, 8912-fix-race.txt, 
> HBASE-8912.patch, HBase-0.94 #1036 test - testRetrying [Jenkins].html, 
> log.txt, org.apache.hadoop.hbase.catalog.TestMetaReaderEditor-output.txt
>
>
> AM throws this exception which subsequently causes the master to abort: 
> {code}
> java.lang.IllegalStateException: Unexpected state : 
> testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. 
> state=PENDING_OPEN, ts=1372891751912, 
> server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>   at 
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>   at java.lang.Thread.run(Thread.java:662)
> {code}
> This exception trace is from the failing test TestMetaReaderEditor which is 
> failing pretty frequently, but looking at the test code, I think this is not 
> a test-only issue, but affects the main code path. 
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HBASE-9977) Define C interface of HBase Client Asynchronous APIs

2014-01-03 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark resolved HBASE-9977.
--

   Resolution: Fixed
Fix Version/s: 0.99.0
 Hadoop Flags: Reviewed

Thanks for all of the reviews.  Now comes the interesting part.

> Define C interface of HBase Client Asynchronous APIs
> 
>
> Key: HBASE-9977
> URL: https://issues.apache.org/jira/browse/HBASE-9977
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 0.99.0
>
> Attachments: HBASE-9977-0.patch, HBASE-9977-1.patch, 
> HBASE-9977-2.patch, HBASE-9977-3.patch, HBASE-9977-4.patch, HBASE-9977-5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10249) Intermittent TestReplicationSyncUpTool failure

2014-01-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861980#comment-13861980
 ] 

Ted Yu commented on HBASE-10249:


Oops, it failed again:
https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/50/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationSyncUpTool/testSyncUpTool/

> Intermittent TestReplicationSyncUpTool failure
> --
>
> Key: HBASE-10249
> URL: https://issues.apache.org/jira/browse/HBASE-10249
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Demai Ni
> Fix For: 0.98.0, 0.94.16, 0.96.2, 0.99.0
>
> Attachments: HBASE-10249-trunk-v0.patch
>
>
> New issue to keep track of this.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8741) Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs)

2014-01-03 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861978#comment-13861978
 ] 

Himanshu Vashishtha commented on HBASE-8741:


Yes, reopening a region is safe.

Re-opening a region involves closing and opening it again. On closing, the 
region is flushed. On flushing, we update oldestFlushingSeqNums and 
oldestUnflushedSeqNums (basically, we remove the region's entry from these maps). 
Say latestSequenceNums still has two entries for that region. There is no 
corresponding element in the oldestUnflushedSeqNums or oldestFlushingSeqNums maps 
for the older entry, so it will be ignored when considering that WAL file for 
archiving. 
https://github.com/apache/hbase/blob/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java#L676
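
A standalone toy model of that archiving decision (simplified map shapes, not the real FSHLog code): a WAL can be archived only when no region listed in the flushing/unflushed maps still needs edits at or below the WAL's highest sequence id, so a stale region entry that exists only in latestSequenceNums never blocks archiving.
{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WalArchiveCheck {

  /**
   * A WAL whose highest sequence id is walMaxSeqId can be archived only if no
   * region still has unflushed (or currently flushing) edits at or below that id.
   */
  static boolean canArchive(long walMaxSeqId,
                            Map<String, Long> oldestUnflushedSeqNums,
                            Map<String, Long> oldestFlushingSeqNums) {
    List<Map<String, Long>> maps = new ArrayList<Map<String, Long>>();
    maps.add(oldestUnflushedSeqNums);
    maps.add(oldestFlushingSeqNums);
    for (Map<String, Long> m : maps) {
      for (Long oldest : m.values()) {
        if (oldest <= walMaxSeqId) {
          return false; // some region still needs edits from this WAL
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, Long> unflushed = new HashMap<String, Long>();
    Map<String, Long> flushing = new HashMap<String, Long>();
    // Region "r1" was flushed on close/re-open, so it no longer appears in either
    // map, even if a stale entry for it lingers in latestSequenceNums.
    unflushed.put("r2", 250L);
    System.out.println(canArchive(200L, unflushed, flushing)); // true: r2's oldest edit is newer
    System.out.println(canArchive(300L, unflushed, flushing)); // false: r2 still has edits <= 300
  }
}
{code}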


> Scope sequenceid to the region rather than regionserver (WAS: Mutations on 
> Regions in recovery mode might have same sequenceIDs)
> 
>
> Key: HBASE-8741
> URL: https://issues.apache.org/jira/browse/HBASE-8741
> Project: HBase
>  Issue Type: Bug
>  Components: MTTR
>Affects Versions: 0.95.1
>Reporter: Himanshu Vashishtha
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0
>
> Attachments: HBASE-8741-trunk-v6.1-rebased.patch, 
> HBASE-8741-trunk-v6.2.1.patch, HBASE-8741-trunk-v6.2.2.patch, 
> HBASE-8741-trunk-v6.2.2.patch, HBASE-8741-trunk-v6.3.patch, 
> HBASE-8741-trunk-v6.4.patch, HBASE-8741-trunk-v6.patch, HBASE-8741-v0.patch, 
> HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch, 
> HBASE-8741-v4-again.patch, HBASE-8741-v4.patch, HBASE-8741-v5-again.patch, 
> HBASE-8741-v5.patch
>
>
> Currently, when opening a region, we find the maximum sequence ID from all 
> its HFiles and then set the LogSequenceId of the log (in case the latter is at 
> a smaller value). This works well in the recovered.edits case, as we are not writing 
> to the region until we have replayed all of its previous edits. 
> With distributed log replay, if we want to enable writes while a region is 
> under recovery, we need to make sure that the logSequenceId > maximum 
> logSequenceId of the old regionserver. Otherwise, we might have a situation 
> where new edits have the same (or smaller) sequenceIds. 
> If we store region-level information in the WALTrailer, then this scenario 
> could be avoided by:
> a) reading the trailer of the "last completed" file, i.e., the last wal file 
> which has a trailer, and
> b) completely reading the last wal file (this file would not have the 
> trailer, so it needs to be read completely).
> In the future, if we switch to multiple wal files, we could read the trailers of 
> all completed WAL files, and read the remaining incomplete files completely.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10264) [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861976#comment-13861976
 ] 

Hudson commented on HBASE-10264:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #50 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/50/])
HBASE-10264 CompactionTool in mapred mode is missing classes in its classpath 
(Himanshu Vashishtha) (ndimiduk: rev 1555182)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java


> [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath
> --
>
> Key: HBASE-10264
> URL: https://issues.apache.org/jira/browse/HBASE-10264
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, mapreduce
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Aleksandr Shulman
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBase-10264.patch
>
>
> Calling o.a.h.h.regionserver.CompactionTool fails due to classpath-related 
> issues in both MRv1 and MRv2.
> {code}hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
> -major hdfs://`hostname`:8020/hbase/data/default/orig_1388179858868{code}
> Results:
> {code}2013-12-27 13:31:49,478 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1388179525649_0011_m_00_2, Status : FAILED
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.TableInfoMissingException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:115)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:231)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:207)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160){code}
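
The attached patch itself is not reproduced in this digest. In general, HBase MapReduce drivers avoid this kind of ClassNotFoundException in map tasks by shipping the HBase jars with the job via TableMapReduceUtil.addDependencyJars; below is a minimal sketch of a driver doing that (class and job names are placeholders, not the CompactionTool code).
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class DependencyJarsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "example-with-hbase-deps");
    job.setJarByClass(DependencyJarsExample.class);
    // Ship the HBase jars needed by the tasks in the job's distributed cache so
    // map tasks can resolve HBase classes such as TableInfoMissingException.
    TableMapReduceUtil.addDependencyJars(job);
    // ... configure mapper, input/output, then submit with job.waitForCompletion(true)
  }
}
{code}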



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8912) [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE

2014-01-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861970#comment-13861970
 ] 

Lars Hofhansl commented on HBASE-8912:
--

Cool, thanks JM! The warnings are expected to some extent.
The 36 regions that got stuck after the first restart, were they assigned to 
the RS that had SNAPPY missing?

> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to 
> OFFLINE
> --
>
> Key: HBASE-8912
> URL: https://issues.apache.org/jira/browse/HBASE-8912
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.16
>
> Attachments: 8912-0.94-alt2.txt, 8912-0.94.txt, 8912-fix-race.txt, 
> HBASE-8912.patch, HBase-0.94 #1036 test - testRetrying [Jenkins].html, 
> log.txt, org.apache.hadoop.hbase.catalog.TestMetaReaderEditor-output.txt
>
>
> AM throws this exception which subsequently causes the master to abort: 
> {code}
> java.lang.IllegalStateException: Unexpected state : 
> testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. 
> state=PENDING_OPEN, ts=1372891751912, 
> server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
>   at 
> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>   at java.lang.Thread.run(Thread.java:662)
> {code}
> This exception trace is from the failing test TestMetaReaderEditor which is 
> failing pretty frequently, but looking at the test code, I think this is not 
> a test-only issue, but affects the main code path. 
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9977) Define C interface of HBase Client Asynchronous APIs

2014-01-03 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861963#comment-13861963
 ] 

Jean-Daniel Cryans commented on HBASE-9977:
---

+1, also left a comment on review board.

> Define C interface of HBase Client Asynchronous APIs
> 
>
> Key: HBASE-9977
> URL: https://issues.apache.org/jira/browse/HBASE-9977
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-9977-0.patch, HBASE-9977-1.patch, 
> HBASE-9977-2.patch, HBASE-9977-3.patch, HBASE-9977-4.patch, HBASE-9977-5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10275) [89-fb] Guarantee the sequenceID in each Region is strictly monotonic increasing

2014-01-03 Thread Liyin Tang (JIRA)
Liyin Tang created HBASE-10275:
--

 Summary: [89-fb] Guarantee the sequenceID in each Region is 
strictly monotonic increasing
 Key: HBASE-10275
 URL: https://issues.apache.org/jira/browse/HBASE-10275
 Project: HBase
  Issue Type: New Feature
Reporter: Liyin Tang
Assignee: Liyin Tang


[HBASE-8741] has implemented the per-region sequence ID. It would be even 
better to guarantee that the sequencing is strictly monotonically increasing so 
that HLog-Based Async Replication is able to deliver transactions in order in 
the case of region movements.  
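
A minimal sketch of one way to hand out strictly increasing per-region sequence ids (illustrative only; this is not the 89-fb implementation, and the class and method names are made up):
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

public class PerRegionSequenceIds {
  private final ConcurrentMap<String, AtomicLong> lastSeqId =
      new ConcurrentHashMap<String, AtomicLong>();

  /**
   * Returns the next sequence id for the region: strictly greater than every id
   * previously handed out for that region, and never at or below floorSeqId
   * (e.g. the highest id recovered from HFiles/WAL when the region is opened).
   */
  public long nextSeqId(String encodedRegionName, long floorSeqId) {
    AtomicLong counter = lastSeqId.get(encodedRegionName);
    if (counter == null) {
      AtomicLong fresh = new AtomicLong(floorSeqId);
      AtomicLong existing = lastSeqId.putIfAbsent(encodedRegionName, fresh);
      counter = (existing != null) ? existing : fresh;
    }
    while (true) {
      long cur = counter.get();
      long next = Math.max(cur, floorSeqId) + 1;
      if (counter.compareAndSet(cur, next)) {
        return next; // strictly increasing, even across region re-opens
      }
    }
  }
}
{code}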



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10272) Cluster becomes in-operational if the node hosting the active Master AND ROOT/META table goes offline

2014-01-03 Thread Aditya Kishore (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861956#comment-13861956
 ] 

Aditya Kishore commented on HBASE-10272:


Couldn't find a way to simulate the entire host becoming offline at once. All 
the kill() and abort() methods close the regions, which cleans up the 
information in ZK that leads to this situation.

> Cluster becomes in-operational if the node hosting the active Master AND 
> ROOT/META table goes offline
> -
>
> Key: HBASE-10272
> URL: https://issues.apache.org/jira/browse/HBASE-10272
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 0.94.15
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Attachments: HBASE-10272_0.94.patch
>
>
> Since HBASE-6364, HBase client caches a connection failure to a server and 
> any subsequent attempt to connect to the server throws a 
> {{FailedServerException}}
> Now if a node which hosted the active Master AND ROOT/META table goes 
> offline, the newly anointed Master's initial attempt to connect to the dead 
> region server will fail with {{NoRouteToHostException}}, which it handles, but 
> the second attempt crashes with {{FailedServerException}}.
> Here is the log from one such occurrence:
> {noformat}
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2013-11-20 10:58:00,161 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is 
> in the failed servers list: xxx02/192.168.1.102:60020
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:425)
> at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
> at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
> at $Proxy9.getProtocolVersion(Unknown Source)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:138)
> at 
> org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1294)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1281)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:506)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:383)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:445)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnection(CatalogTracker.java:464)
> at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:624)
> at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:684)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:560)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:376)
> at java.lang.Thread.run(Thread.java:662)
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2013-11-20 10:58:00,162 INFO org.apache.hadoop.ipc.HBaseServer: Stopping 
> server on 6
> {noformat}
> Each of the backup masters will crash with the same error and restarting them 
> will have the same effect. Once this happens, the cluster will remain 
> in-operational until the node with the region server is brought online (or the 
> Zookeeper node containing the root region server and/or META entry from the 
> ROOT table is deleted).
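
To make the failure mode easier to follow, here is a toy model of the failed-servers cache described above (this is not the actual HBaseClient code; the expiry window is an assumed parameter):
{code}
import java.util.HashMap;
import java.util.Map;

/** Toy model of a "failed servers" list with a fixed expiry window. */
public class FailedServersModel {
  private final long expiryMs;
  private final Map<String, Long> failedUntil = new HashMap<String, Long>();

  public FailedServersModel(long expiryMs) {
    this.expiryMs = expiryMs;
  }

  /** Record a connection failure (e.g. a NoRouteToHostException). */
  public synchronized void recordFailure(String hostAndPort) {
    failedUntil.put(hostAndPort, System.currentTimeMillis() + expiryMs);
  }

  /** A connect attempt inside the window fails fast instead of retrying the socket. */
  public synchronized boolean isFailed(String hostAndPort) {
    Long until = failedUntil.get(hostAndPort);
    if (until == null) return false;
    if (System.currentTimeMillis() > until) {
      failedUntil.remove(hostAndPort);
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    FailedServersModel cache = new FailedServersModel(2000); // 2s window for the demo
    cache.recordFailure("xxx02:60020"); // first attempt hit NoRouteToHostException
    // An immediate retry fails fast, which is what the new master's second
    // verifyMetaRegionLocation() attempt runs into before it aborts.
    System.out.println(cache.isFailed("xxx02:60020")); // true
  }
}
{code}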



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10264) [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861947#comment-13861947
 ] 

Hudson commented on HBASE-10264:


SUCCESS: Integrated in HBase-0.98 #54 (See 
[https://builds.apache.org/job/HBase-0.98/54/])
HBASE-10264 CompactionTool in mapred mode is missing classes in its classpath 
(Himanshu Vashishtha) (ndimiduk: rev 1555182)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java


> [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath
> --
>
> Key: HBASE-10264
> URL: https://issues.apache.org/jira/browse/HBASE-10264
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, mapreduce
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Aleksandr Shulman
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBase-10264.patch
>
>
> Calling o.a.h.h.regionserver.CompactionTool fails due to classpath-related 
> issues in both MRv1 and MRv2.
> {code}hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
> -major hdfs://`hostname`:8020/hbase/data/default/orig_1388179858868{code}
> Results:
> {code}2013-12-27 13:31:49,478 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1388179525649_0011_m_00_2, Status : FAILED
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.TableInfoMissingException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:115)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:231)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:207)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160){code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList

2014-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861946#comment-13861946
 ] 

Hadoop QA commented on HBASE-10078:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621379/hbase-10078.patch
  against trunk revision .
  ATTACHMENT ID: 12621379

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8335//console

This message is automatically generated.

> Dynamic Filter - Not using DynamicClassLoader when using FilterList
> ---
>
> Key: HBASE-10078
> URL: https://issues.apache.org/jira/browse/HBASE-10078
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 0.94.13
>Reporter: Federico Gaule
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 0.94-10078.patch, hbase-10078.patch
>
>
> I've tried to use dynamic jar loading 
> (https://issues.apache.org/jira/browse/HBASE-1936) but it seems to have an issue 
> with FilterList. 
> Here is some log from my app where I send a Get with a FilterList containing 
> an AFilter and a BFilter.
> {noformat}
> 2013-12-02 13:55:42,564 DEBUG 
> org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found 
> - using dynamical class loader
> 2013-12-02 13:55:42,564 DEBUG 
> org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter
> 2013-12-02 13:55:42,564 DEBUG 
> org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any
> 2013-12-02 13:55:42,677 DEBUG 
> org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: 
> d.p.AFilter
> 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
> Can't find class d.p.BFilter
> java.lang.ClassNotFoundException: d.p.BFilter
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>   at java.lang.Class.forName0(

[jira] [Commented] (HBASE-10264) [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861939#comment-13861939
 ] 

Hudson commented on HBASE-10264:


FAILURE: Integrated in hbase-0.96 #249 (See 
[https://builds.apache.org/job/hbase-0.96/249/])
HBASE-10264 CompactionTool in mapred mode is missing classes in its classpath 
(Himanshu Vashishtha) (ndimiduk: rev 1555183)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java


> [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath
> --
>
> Key: HBASE-10264
> URL: https://issues.apache.org/jira/browse/HBASE-10264
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, mapreduce
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Aleksandr Shulman
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBase-10264.patch
>
>
> Calling o.a.h.h.regionserver.CompactionTool fails due to classpath-related 
> issues in both MRv1 and MRv2.
> {code}hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
> -major hdfs://`hostname`:8020/hbase/data/default/orig_1388179858868{code}
> Results:
> {code}2013-12-27 13:31:49,478 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1388179525649_0011_m_00_2, Status : FAILED
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.TableInfoMissingException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:115)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:231)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:207)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160){code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-03 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861941#comment-13861941
 ] 

Ted Yu commented on HBASE-10263:


In CacheConfig.java :
{code}
+boolean inMemoryForceMode = conf.getBoolean("hbase.rs.inmemoryforcemode",
+false);
{code}
Is the above variable used?
{code}
+   * configuration, inMemoryForceMode is a cluster-wide configuration
{code}
Is there a plan to make inMemoryForceMode a column-family config?
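
For illustration, a sketch of how such knobs could be read and sanity-checked from the Configuration. The ratio property names below are placeholders invented for the example; only hbase.rs.inmemoryforcemode comes from the snippet quoted above.
{code}
import org.apache.hadoop.conf.Configuration;

public class LruRatioConfig {
  // Placeholder property names for the illustration.
  static final String SINGLE_KEY = "hbase.lru.blockcache.single.percentage";
  static final String MULTI_KEY = "hbase.lru.blockcache.multi.percentage";
  static final String MEMORY_KEY = "hbase.lru.blockcache.memory.percentage";

  /** Load the single/multi/in-memory split, defaulting to the hardcoded 1:2:1. */
  static float[] loadRatios(Configuration conf) {
    float single = conf.getFloat(SINGLE_KEY, 0.25f);
    float multi = conf.getFloat(MULTI_KEY, 0.50f);
    float memory = conf.getFloat(MEMORY_KEY, 0.25f);
    if (Math.abs(single + multi + memory - 1.0f) > 0.01f) {
      throw new IllegalArgumentException(
          "single + multi + in-memory ratios must sum to 1.0, got "
              + single + " + " + multi + " + " + memory);
    }
    return new float[] { single, multi, memory };
  }

  /** Matches the snippet quoted above: a cluster-wide boolean switch. */
  static boolean inMemoryForceMode(Configuration conf) {
    return conf.getBoolean("hbase.rs.inmemoryforcemode", false);
  }
}
{code}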

> make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
> preemptive mode for in-memory type block
> --
>
> Key: HBASE-10263
> URL: https://issues.apache.org/jira/browse/HBASE-10263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Feng Honghua
>Assignee: Feng Honghua
> Attachments: HBASE-10263-trunk_v0.patch
>
>
> Currently the single/multi/in-memory ratio in LruBlockCache is hardcoded to 
> 1:2:1, which can lead to counter-intuitive behavior in scenarios where an 
> in-memory table's read performance is much worse than an ordinary table's, even 
> though the two tables' data sizes are almost equal and larger than the 
> regionserver's cache size (we ran such an experiment and verified that the 
> in-memory table's random read performance was two times worse than the ordinary 
> table's).
> This patch fixes the above issue and provides:
> 1. make the single/multi/in-memory ratio user-configurable
> 2. provide a configurable switch which makes in-memory blocks preemptive; 
> preemptive means that when this switch is on, an in-memory block can kick out 
> any ordinary block to make room until no ordinary blocks remain; when this 
> switch is off (the default), the behavior is the same as before, using the 
> single/multi/in-memory ratio to determine eviction.
> By default, both changes are off and the behavior stays the same as before 
> applying this patch. It is the client/user's choice whether and which behavior 
> to use, by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9858) Integration test and LoadTestTool support for cell Visibility

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861937#comment-13861937
 ] 

Hudson commented on HBASE-9858:
---

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #40 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/40/])
HBASE-9858 Integration test and LoadTestTool support for cell Visibility 
(anoopsamjohn: rev 1555145)
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/test/LoadTestDataGenerator.java
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestIngest.java
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestIngestWithTags.java
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestIngestWithVisibilityLabels.java
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestLazyCfLoading.java
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/StripeCompactionsPerformanceEvaluation.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityLabelFilter.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/LoadTestDataGeneratorWithVisibilityLabels.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/LoadTestDataGeneratorWithTags.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedAction.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedReader.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedUpdater.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriter.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/MultiThreadedWriterBase.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/RestartMetaTest.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadParallel.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/test
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/test/LoadTestDataGenerator.java


> Integration test and LoadTestTool support for cell Visibility
> -
>
> Key: HBASE-9858
> URL: https://issues.apache.org/jira/browse/HBASE-9858
> Project: HBase
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-9858.patch, HBASE-9858_V2.patch, 
> HBASE-9858_V3.patch, HBASE-9858_V4.patch
>
>
> Cell level visibility should have an integration test and LoadTestTool 
> support.
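
For reference, the kind of client calls such a load test would exercise, as a minimal sketch against the 0.98-era client API. It assumes the visibility labels coprocessor is enabled and the labels already exist; the table, family, qualifier and label names are made up.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.security.visibility.Authorizations;
import org.apache.hadoop.hbase.security.visibility.CellVisibility;
import org.apache.hadoop.hbase.util.Bytes;

public class VisibilityClientSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "test_table");
    try {
      Put put = new Put(Bytes.toBytes("row1"));
      put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      // Only readers whose authorizations satisfy this expression see the cell.
      put.setCellVisibility(new CellVisibility("secret|topsecret"));
      table.put(put);

      Get get = new Get(Bytes.toBytes("row1"));
      get.setAuthorizations(new Authorizations("secret"));
      Result result = table.get(get);
      System.out.println("cell visible: " + !result.isEmpty());
    } finally {
      table.close();
    }
  }
}
{code}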



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10264) [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861936#comment-13861936
 ] 

Hudson commented on HBASE-10264:


SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #40 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/40/])
HBASE-10264 CompactionTool in mapred mode is missing classes in its classpath 
(Himanshu Vashishtha) (ndimiduk: rev 1555178)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java


> [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath
> --
>
> Key: HBASE-10264
> URL: https://issues.apache.org/jira/browse/HBASE-10264
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, mapreduce
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Aleksandr Shulman
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBase-10264.patch
>
>
> Calling o.a.h.h.regionserver.CompactionTool fails due to classpath-related 
> issues in both MRv1 and MRv2.
> {code}hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
> -major hdfs://`hostname`:8020/hbase/data/default/orig_1388179858868{code}
> Results:
> {code}2013-12-27 13:31:49,478 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1388179525649_0011_m_00_2, Status : FAILED
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.TableInfoMissingException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:115)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:231)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:207)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160){code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861927#comment-13861927
 ] 

Jimmy Xiang commented on HBASE-10210:
-

Yes, it is ok with me.

> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet, I am at "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't in 0.96.0 + some patches.
> It looks like RS information arriving from 2 sources - ZK and server itself, 
> can conflict. Master doesn't handle such cases (timestamp match), and anyway 
> technically timestamps can collide for two separate servers.
> So, master YouAreDead-s the already-recorded reporting RS, and adds it too. 
> Then it discovers that the new server has died with fatal error!
> Note the threads.
> Addition is called from master initialization and from RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-8912) [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE

2014-01-03 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861925#comment-13861925
 ] 

Jean-Marc Spaggiari commented on HBASE-8912:


After the first restart, 36 regions are stuck in transition :( But no server 
crashed.

What I did:
- Restored the default balancer to make sure as many regions as possible would move.
- Stop/start HBase
- Run the balancer from the shell.

Everything is back up after a 2nd restart.

I get many errors like this one:
{code}
2014-01-03 16:03:03,958 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received FAILED_OPEN for region b75cb9067c3c4456d6198c9237c143b3 from server 
node4.domain.com,60020,1388782921790 but region was in  the state 
page,rf.idua.www\x1Fhttp\x1F-1\x1F/fr/brand/fr/audi_fleet_solutions/contact/contact_transport_personnes.html\x1Fnull,1379103792232.b75cb9067c3c4456d6198c9237c143b3.
 state=CLOSED, ts=1388782983373, server=node4.domain.com,60020,1388782921790 
and not in OFFLINE, PENDING_OPEN or OPENING
{code}

After investigating, I figured out that snappy was missing on a server. I fixed 
that and restarted: all seems to be fine. So I restored my customized balancer, 
restarted, and balanced.

Still some warning in the logs:
{code}
2014-01-03 16:21:52,864 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region db8e67acde26bf340da481d3c1b934cd from server 
node4.domain.com,60020,1388784051197 but region was in  the state 
page,moc.tenretnigruoboc.www\x1Fhttp\x1F-1\x1F/cobourg-and-the-web\x1Fnull,1379103844627.db8e67acde26bf340da481d3c1b934cd.
 state=OPEN, ts=1388784100392, server=node4.distparser.com,60020,1388784051197 
and not in expected OFFLINE, PENDING_OPEN or OPENING states
{code}

But this time all the regions are assigned correctly.

I did that one more time (change balancer, stop, start, balance; change 
balancer, stop, start, balance). I turned the log level to WARN.

{code}
2014-01-03 16:28:51,142 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region 17bee313797fc1ce982c0e31fdb6620c from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
page,rf.ofniecnarf.www\x1Fhttp\x1F-1\x1F/vote/comment/27996/1/vote/zero_vote/c99b0992e5a9cd6bf3a4cfc91769ceeb\x1Fnull,1379104524006.17bee313797fc1ce982c0e31fdb6620c.
 state=OPEN, ts=1388784531048, server=node8.distparser.com,60020,1388784498327 
and not in expected OFFLINE, PENDING_OPEN or OPENING states
2014-01-03 16:28:52,135 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region 6dc6290df1855b319f60bf89faa3da41 from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
page_crc,\x00\x00\x00\x00\xD7\xD9\x97\x8Bvideo.k-wreview.ca,1378042601904.6dc6290df1855b319f60bf89faa3da41.
 state=OPEN, ts=1388784531793, server=node8.distparser.com,60020,1388784498327 
and not in expected OFFLINE, PENDING_OPEN or OPENING states
2014-01-03 16:28:52,712 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region ec4f96b6cedd935aeba279b15d5337af from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
work_proposed,\x98\xBF\xAF\x90\x00\x00\x00\x00http://feedproxy.google.com/~r/WheatWeeds/~3/Of24fZKcpco/the-eighth-day-of-christmas.html,1378975430143.ec4f96b6cedd935aeba279b15d5337af.
 state=OPEN, ts=1388784532540, server=node8.distparser.com,60020,1388784498327 
and not in expected OFFLINE, PENDING_OPEN or OPENING states
2014-01-03 16:28:52,747 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region 4f823b5de664556a89cbd86aa41cd0b0 from server 
node8.distparser.com,60020,1388784498327 but region was in  the state 
work_proposed,\x8D4K\xEA\x00\x00\x00\x00http://twitter.com/home?status=CartoonStock%3A++http%3A%2F%2Fwww%2Ecartoonstock%2Ecom%2Fdirectory%2Fc%2Fcream%5Ftea%5Fgifts%2Easp,1378681682935.4f823b5de664556a89cbd86aa41cd0b0.
 state=OPEN, ts=1388784532552, server=node8.distparser.com,60020,1388784498327 
and not in expected OFFLINE, PENDING_OPEN or OPENING states
2014-01-03 16:28:53,244 WARN org.apache.hadoop.hbase.master.AssignmentManager: 
Received OPENED for region da0bd0a6b7187f731fb34d4ac14ca279 from server 
node8.domain.com,60020,1388784498327 but region was in  the state 
work_proposed,\xB2\xE6\xB6\xBB\x00\x00\x00\x00http://www.canpages.ca/page/QC/notre-dame-des-prairies/concept-beton-design/4550984.html,1378737981443.da0bd0a6b7187f731fb34d4ac14ca279.
 state=OPEN, ts=1388784533203, server=node8.distparser.com,60020,1388784498327 
and not in expected OFFLINE, PENDING_OPEN or OPENING states
{code}

But everything finally got assigned without any restart required, and pretty 
quickly.

Logs from the last run:
{code}
2014-01-03 16:32:20,252 WARN org.apache.hadoop.ipc.HBaseServer: 
(responseTooSlow): {"processingtimems":10969,"call":"balance(), rpc version=1, 
client version=29, 
methodsFingerPrint=1886733559","client":"

[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861920#comment-13861920
 ] 

Enis Soztutar commented on HBASE-10210:
---

I think the last patch is good to go. Any more comments Jimmy, Liang? 

> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet, I am at "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't in 0.96.0 + some patches.
> It looks like RS information arriving from 2 sources - ZK and server itself, 
> can conflict. Master doesn't handle such cases (timestamp match), and anyway 
> technically timestamps can collide for two separate servers.
> So, master YouAreDead-s the already-recorded reporting RS, and adds it too. 
> Then it discovers that the new server has died with fatal error!
> Note the threads.
> Addition is called from master initialization and from RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861914#comment-13861914
 ] 

Sergey Shelukhin commented on HBASE-10210:
--

should this be ok to commit?

> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet, I am at "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't in 0.96.0 + some patches.
> It looks like RS information arriving from 2 sources - ZK and server itself, 
> can conflict. Master doesn't handle such cases (timestamp match), and anyway 
> technically timestamps can collide for two separate servers.
> So, master YouAreDead-s the already-recorded reporting RS, and adds it too. 
> Then it discovers that the new server has died with fatal error!
> Note the threads.
> Addition is called from master initialization and from RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b
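
To make the race concrete: the log above shows the master expiring a server whose 
ServerName (host, port, startcode) is identical to the one it just registered from 
ZK. Below is a minimal sketch of the kind of guard the report implies, using a 
hypothetical RecordedServer type and shouldExpireExisting() helper rather than 
HBase's real ServerManager API; it is an illustration, not the committed patch.

{code:java}
// Hedged illustration only: RecordedServer and shouldExpireExisting() are
// hypothetical stand-ins, not HBase's actual ServerManager types or methods.
final class RecordedServer {
  final String host;
  final int port;
  final long startcode; // RS start timestamp, part of the server's identity

  RecordedServer(String host, int port, long startcode) {
    this.host = host;
    this.port = port;
    this.startcode = startcode;
  }

  /** Same host/port/startcode means the same process, not a stale instance. */
  boolean sameInstanceAs(RecordedServer other) {
    return host.equals(other.host) && port == other.port
        && startcode == other.startcode;
  }
}

final class RegistrationSketch {
  /**
   * Only treat the recorded entry as stale when the startcode differs (the RS
   * really restarted); an identical report is just a duplicate registration
   * arriving from a second source (ZK vs. RPC) and should be a no-op.
   */
  static boolean shouldExpireExisting(RecordedServer recorded, RecordedServer reported) {
    if (recorded.sameInstanceAs(reported)) {
      return false; // duplicate report of the same live server
    }
    return recorded.host.equals(reported.host)
        && recorded.port == reported.port
        && recorded.startcode < reported.startcode; // older instance is stale
  }
}
{code}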



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10264) [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath

2014-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861910#comment-13861910
 ] 

Hudson commented on HBASE-10264:


SUCCESS: Integrated in HBase-TRUNK #4784 (See 
[https://builds.apache.org/job/HBase-TRUNK/4784/])
HBASE-10264 CompactionTool in mapred mode is missing classes in its classpath 
(Himanshu Vashishtha) (ndimiduk: rev 1555178)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java


> [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath
> --
>
> Key: HBASE-10264
> URL: https://issues.apache.org/jira/browse/HBASE-10264
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, mapreduce
>Affects Versions: 0.98.0, 0.99.0
>Reporter: Aleksandr Shulman
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: HBase-10264.patch
>
>
> Calling o.a.h.h.regionserver.CompactionTool fails due to classpath-related 
> issues in both MRv1 and MRv2.
> {code}hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
> -major hdfs://`hostname`:8020/hbase/data/default/orig_1388179858868{code}
> Results:
> {code}2013-12-27 13:31:49,478 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1388179525649_0011_m_00_2, Status : FAILED
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.TableInfoMissingException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:115)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:231)
>   at 
> org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:207)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160){code}
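
The ClassNotFoundException above is the classic symptom of an MR job that does not 
ship the HBase jars to its tasks. Whether the committed patch does exactly this is 
not visible in this thread, but the usual remedy is to attach the dependency jars 
via TableMapReduceUtil before submitting the job; a minimal sketch (the job name 
and class names below are placeholders):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class ShipHBaseJarsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "compaction-sketch");
    job.setJarByClass(ShipHBaseJarsSketch.class);

    // Adds the HBase jars (and their transitive dependencies) referenced by
    // the job's configured classes to the distributed cache, so map tasks
    // can resolve classes like TableInfoMissingException.
    TableMapReduceUtil.addDependencyJars(job);

    // Mapper/input setup for the real tool is omitted; this sketch only
    // shows where the dependency jars get attached.
  }
}
{code}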



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9941) The context ClassLoader isn't set while calling into a coprocessor

2014-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861879#comment-13861879
 ] 

Hadoop QA commented on HBASE-9941:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621363/9941.patch
  against trunk revision .
  ATTACHMENT ID: 12621363

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8334//console

This message is automatically generated.

> The context ClassLoader isn't set while calling into a coprocessor
> --
>
> Key: HBASE-9941
> URL: https://issues.apache.org/jira/browse/HBASE-9941
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Affects Versions: 0.96.0
>Reporter: Benoit Sigoure
>Assignee: Andrew Purtell
> Fix For: 0.98.0
>
> Attachments: 9941.patch, 9941.patch, 9941.patch
>
>
> Whenever one of the methods of a coprocessor is invoked, the context 
> {{ClassLoader}} isn't set to be the {{CoprocessorClassLoader}}.  It's only 
> set properly when calling the coprocessor's {{start}} method.  This means 
> that if the coprocessor code attempts to load classes using the context 
> {{ClassLoader}}, it will fail to find the classes it's looking for.
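
For reference, the standard Java pattern behind the kind of fix discussed in this 
issue is to install the coprocessor's loader as the thread's context ClassLoader 
around the call and restore the previous one afterwards. The sketch below shows 
only that pattern; where HBase's coprocessor host actually applies it is defined 
by the committed patch, and cpClassLoader/invokeObserver are placeholder names.

{code:java}
import java.util.concurrent.Callable;

public final class ContextClassLoaderSketch {
  /**
   * Runs a task with the supplied ClassLoader installed as the thread's
   * context ClassLoader, restoring the previous loader in a finally block.
   */
  public static <T> T callWithContextClassLoader(ClassLoader cpClassLoader,
      Callable<T> invokeObserver) throws Exception {
    Thread current = Thread.currentThread();
    ClassLoader previous = current.getContextClassLoader();
    current.setContextClassLoader(cpClassLoader);
    try {
      // Inside this call, Thread.currentThread().getContextClassLoader()
      // resolves classes through the coprocessor's own loader.
      return invokeObserver.call();
    } finally {
      current.setContextClassLoader(previous);
    }
  }
}
{code}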



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10210) during master startup, RS can be you-are-dead-ed by master in error

2014-01-03 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10210:
---

Fix Version/s: 0.99.0
               0.98.0

> during master startup, RS can be you-are-dead-ed by master in error
> ---
>
> Key: HBASE-10210
> URL: https://issues.apache.org/jira/browse/HBASE-10210
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.1, 0.99.0, 0.96.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-10210.01.patch, HBASE-10210.02.patch, 
> HBASE-10210.03.patch, HBASE-10210.04.patch, HBASE-10210.05.patch, 
> HBASE-10210.patch
>
>
> Not sure of the root cause yet; I am at the "how did this ever work" stage.
> We see this problem in 0.96.1, but didn't in 0.96.0 plus some patches.
> It looks like RS information arriving from two sources (ZK and the server 
> itself) can conflict. The master doesn't handle such cases (timestamp match), 
> and technically timestamps can collide for two separate servers anyway.
> So the master YouAreDead-s the already-recorded reporting RS, and then adds 
> it again. Then it discovers that the new server has died with a fatal error!
> Note the threads: the addition is called from master initialization and from 
> RPC.
> {noformat}
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Finished waiting for region servers count to settle; checked in 2, slept for 
> 18262 ms, expecting minimum of 1, maximum of 2147483647, master is running.
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.ServerManager: 
> Registering 
> server=h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,290 INFO  
> [master:h2-ubuntu12-sec-1387431063-hbase-10:6] master.HMaster: Registered 
> server found up in zk but who has not yet reported in: 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Triggering server recovery; existingServer 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> looks stale, new 
> server:h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> 2013-12-19 11:16:45,380 INFO  [RpcServer.handler=4,port=6] 
> master.ServerManager: Master doesn't enable ServerShutdownHandler during 
> initialization, delay expiring server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800
> ...
> 2013-12-19 11:16:46,925 ERROR [RpcServer.handler=7,port=6] 
> master.HMaster: Region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 
> reported a fatal error:
> ABORTING region server 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800: 
> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; 
> currently processing 
> h2-ubuntu12-sec-1387431063-hbase-8.cs1cloud.internal,60020,1387451803800 as 
> dead server
> {noformat}
> Presumably some of the recent ZK listener related changes b



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10078) Dynamic Filter - Not using DynamicClassLoader when using FilterList

2014-01-03 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861866#comment-13861866
 ] 

Jimmy Xiang commented on HBASE-10078:
-

The 0.94 patch fixed the issue in 0.94.  The trunk patch just added tests for 
using DynamicClassLoader in FilterList.
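
The report quoted below shows d.p.BFilter failing with a plain URLClassLoader while 
a FilterList is being deserialized. As an illustration of the mechanism (not the 
committed patch), the fallback the dynamic loading path needs looks roughly like 
this; dynamicLoader stands in for however HBase wires up its DynamicClassLoader:

{code:java}
public final class DynamicLoadFallbackSketch {
  /**
   * Try the default loader first; only on ClassNotFoundException fall back to
   * the dynamic-jar-aware loader. The bug report is that the FilterList
   * deserialization path never reaches the second step.
   */
  public static Class<?> loadFilterClass(String className, ClassLoader dynamicLoader)
      throws ClassNotFoundException {
    try {
      return Class.forName(className); // default application class loader
    } catch (ClassNotFoundException e) {
      // e.g. d.p.BFilter, shipped only inside a dynamically loaded jar
      return Class.forName(className, true, dynamicLoader);
    }
  }
}
{code}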

> Dynamic Filter - Not using DynamicClassLoader when using FilterList
> ---
>
> Key: HBASE-10078
> URL: https://issues.apache.org/jira/browse/HBASE-10078
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 0.94.13
>Reporter: Federico Gaule
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 0.94-10078.patch, hbase-10078.patch
>
>
> I've tried to use dynamic jar loading 
> (https://issues.apache.org/jira/browse/HBASE-1936) but it seems to have an 
> issue with FilterList. 
> Here is some log output from my app where I send a Get with a FilterList 
> containing AFilter and another with BFilter.
> {noformat}
> 2013-12-02 13:55:42,564 DEBUG 
> org.apache.hadoop.hbase.util.DynamicClassLoader: Class d.p.AFilter not found 
> - using dynamical class loader
> 2013-12-02 13:55:42,564 DEBUG 
> org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class: d.p.AFilter
> 2013-12-02 13:55:42,564 DEBUG 
> org.apache.hadoop.hbase.util.DynamicClassLoader: Loading new jar files, if any
> 2013-12-02 13:55:42,677 DEBUG 
> org.apache.hadoop.hbase.util.DynamicClassLoader: Finding class again: 
> d.p.AFilter
> 2013-12-02 13:55:43,004 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
> Can't find class d.p.BFilter
> java.lang.ClassNotFoundException: d.p.BFilter
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:247)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.getClassByName(HbaseObjectWritable.java:792)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:679)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594)
>   at 
> org.apache.hadoop.hbase.filter.FilterList.readFields(FilterList.java:324)
>   at org.apache.hadoop.hbase.client.Get.readFields(Get.java:405)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594)
>   at org.apache.hadoop.hbase.client.Action.readFields(Action.java:101)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:594)
>   at 
> org.apache.hadoop.hbase.client.MultiAction.readFields(MultiAction.java:116)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:690)
>   at 
> org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:126)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1311)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1226)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:748)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:539)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:514)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> {noformat}
> AFilter is not found by the default loader, so it is loaded via 
> DynamicClassLoader, but when it then tries to load BFilter it uses the plain 
> URLClassLoader and fails without checking the dynamic jars.
> I think the issue is related to FilterList#readFields
> {code:title=FilterList.java|borderStyle=solid} 
>  public void readFields(final DataInput in) throws IOException {
> byte opByte = in.readByte();
> operator = Operator.values()[opByte];
> int size = in.readInt();
> if (size > 0) {
>   filters = new ArrayList<Filter>(size);
>   for (int i = 0; i < size; i++) {
> Filter filter = (Filter)Hbas
