[jira] [Commented] (HBASE-5829) Inconsistency between the "regions" map and the "servers" map in AssignmentManager

2012-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261353#comment-13261353
 ] 

Hadoop QA commented on HBASE-5829:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12524120/HBASE-5829-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestRegionRebalancing

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1643//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1643//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1643//console

This message is automatically generated.

> Inconsistency between the "regions" map and the "servers" map in 
> AssignmentManager
> --
>
> Key: HBASE-5829
> URL: https://issues.apache.org/jira/browse/HBASE-5829
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1
>Reporter: Maryann Xue
> Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch
>
>
> There are places in AM where this.servers is not kept consistent with 
> this.regions. This might cause the balancer to offline a region from an RS that 
> already returned NotServingRegionException at a previous offline attempt.
> In AssignmentManager.unassign(HRegionInfo, boolean):
>     try {
>       // TODO: We should consider making this look more like it does for the
>       // region open where we catch all throwables and never abort
>       if (serverManager.sendRegionClose(server, state.getRegion(),
>           versionOfClosingNode)) {
>         LOG.debug("Sent CLOSE to " + server + " for region " +
>           region.getRegionNameAsString());
>         return;
>       }
>       // This never happens. Currently regionserver close always return true.
>       LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
>         region.getRegionNameAsString());
>     } catch (NotServingRegionException nsre) {
>       LOG.info("Server " + server + " returned " + nsre + " for " +
>         region.getRegionNameAsString());
>       // Presume that master has stale data.  Presume remote side just split.
>       // Presume that the split message when it comes in will fix up the
>       // master's in memory cluster state.
>     } catch (Throwable t) {
>       if (t instanceof RemoteException) {
>         t = ((RemoteException)t).unwrapRemoteException();
>         if (t instanceof NotServingRegionException) {
>           if (checkIfRegionBelongsToDisabling(region)) {
>             // Remove from the regionsinTransition map
>             LOG.info("While trying to recover the table "
>                 + region.getTableNameAsString()
>                 + " to DISABLED state the region " + region
>                 + " was offlined but the table was in DISABLING state");
>             synchronized (this.regionsInTransition) {
>               this.regionsInTransition.remove(region.getEncodedName());
>             }
>             // Remove from the regionsMap
>             synchronized (this.regions) {
>               this.regions.remove(region);
>             }
>             deleteClosingOrClosedNode(region);
>           }
>         }
>         // RS is already processing this region, only need to update the
>         // timestamp
>         if (t instanceof RegionAlreadyInTransitionException) {
>           LOG.debug("update " + state + " the timestamp.");
>           state.update(state.getState());
>         }
>       }
> In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean):
>     synchronized (this.regions) {
>       this.regions.put(plan.getRegionInfo(), plan.getDestination());
>     }
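The excerpts above update this.regions without a matching update to this.servers. As a rough illustration only (this is not the attached patch), the sketch below keeps a region-to-server map and its inverse in step by changing both under one lock. The class name RegionAssignments and the use of plain String keys in place of HRegionInfo and ServerName are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the two maps kept by AssignmentManager. */
final class RegionAssignments {
  // region -> server currently hosting it
  private final Map<String, String> regions = new HashMap<>();
  // server -> regions it hosts (the inverse view of "regions")
  private final Map<String, Set<String>> servers = new HashMap<>();

  /** Records region on server, first dropping it from any previous server. */
  synchronized void assign(String region, String server) {
    remove(region);
    regions.put(region, server);
    servers.computeIfAbsent(server, s -> new HashSet<>()).add(region);
  }

  /** Removes region from both maps; returns the server it was on, or null. */
  synchronized String remove(String region) {
    String server = regions.remove(region);
    if (server != null) {
      Set<String> hosted = servers.get(server);
      if (hosted != null) {
        hosted.remove(region);
      }
    }
    return server;
  }

  /** Returns the server hosting region, or null if it is unassigned. */
  synchronized String serverOf(String region) {
    return regions.get(region);
  }

  /** Returns a copy of the set of regions recorded for server. */
  synchronized Set<String> regionsOn(String server) {
    return new HashSet<>(servers.getOrDefault(server, new HashSet<>()));
  }
}
```

Because every mutation goes through assign or remove, a caller such as a balancer would never see a region listed under a server that the region map no longer points to.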

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrat

[jira] [Created] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-5873:
-

 Summary: TimeOut Monitor thread should be started after atleast 
one region server registers.
 Key: HBASE-5873
 URL: https://issues.apache.org/jira/browse/HBASE-5873
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0


Currently the timeout monitor thread is started even before any region server 
has registered with the master.
In the timeout monitor we depend on the region servers being online:
{code}
boolean allRSsOffline = this.serverManager.getOnlineServersList().
    isEmpty();
{code}

When the master starts up, it sees that there are no online servers and hence 
sets allRSsOffline to true.
{code}
setAllRegionServersOffline(allRSsOffline);
{code}
So this.allRegionServersOffline is also true.
By this time an RS has come up. When the timeout monitor runs again in the 
next cycle (after 10 secs), it sees allRSsOffline as false.
Hence
{code}
else if (this.allRegionServersOffline && !allRSsOffline) {
  // if some RSs just came back online, we can start the
  // assignment right away
  actOnTimeOut(regionState);
{code}
This condition makes it act on the timeout.
Because of this, even if an assignment of ROOT is already in progress, this 
piece of code triggers another assignment, and thus we get a 
RegionAlreadyInTransitionException. We then need to wait 30 mins for ROOT 
itself to be assigned.
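
To make the sequence above concrete, here is a small sketch of the "servers just came back" check with one extra guard: the transition is ignored until at least one server has been seen online. This is an alternative way to avoid the false trigger at startup, not the fix proposed in the issue title (starting the thread later). The class name OfflineTracker and its method are hypothetical.

```java
/** Hypothetical tracker for the "all region servers offline" flag. */
final class OfflineTracker {
  // true once any region server has ever been observed online
  private boolean everSawServer = false;
  // value of the flag recorded by the previous monitor cycle
  private boolean allOffline = false;

  /**
   * Called once per monitor cycle with the current online server count.
   * Returns true only when servers reappear after a real outage, never
   * for the initial "nobody has registered yet" state.
   */
  boolean serversJustCameBack(int onlineServers) {
    boolean nowOffline = onlineServers == 0;
    boolean cameBack = everSawServer && allOffline && !nowOffline;
    if (!nowOffline) {
      everSawServer = true;
    }
    allOffline = nowOffline;
    return cameBack;
  }
}
```

With this guard the first cycle after master startup (zero servers, then one server) does not count as a recovery, while a later outage followed by a server registering does.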

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5873:
--

Priority: Minor  (was: Major)

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0




[jira] [Updated] (HBASE-5757) TableInputFormat should handle as many errors as possible

2012-04-25 Thread Jan Lukavsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Lukavsky updated HBASE-5757:


Summary: TableInputFormat should handle as many errors as possible  (was: 
TableInputFormat should handle as much errors as possible)

> TableInputFormat should handle as many errors as possible
> -
>
> Key: HBASE-5757
> URL: https://issues.apache.org/jira/browse/HBASE-5757
> Project: HBase
>  Issue Type: Bug
>  Components: mapred, mapreduce
>Affects Versions: 0.90.6
>Reporter: Jan Lukavsky
> Attachments: HBASE-5757.patch
>
>
> Prior to HBASE-4196, IOExceptions thrown from the scanner were handled 
> differently in the mapred and mapreduce APIs. The patch for HBASE-4196 unified 
> this handling so that, if an exception is caught, a reconnect is attempted 
> (without bothering the mapred client). After that, HBASE-4269 changed this 
> behavior back, but in both the mapred and mapreduce APIs. The question is: is 
> there any reason not to handle all errors that the input format can handle? In 
> other words, why not try to reissue the request after *any* IOException? I see 
> the following disadvantages of the current approach:
>  * the client may see exceptions like LeaseException and 
> ScannerTimeoutException if it fails to process all fetched data within the 
> timeout
>  * to avoid ScannerTimeoutException the client must raise 
> hbase.regionserver.lease.period
>  * timeouts for tasks are already configured in mapred.task.timeout, so this 
> seems to me a bit redundant, because typically one needs to update both of 
> these parameters
>  * I don't see any possibility to get rid of LeaseException (this is 
> configured on the server side)
> I think all of these issues would be gone if the DoNotRetryIOException were 
> not rethrown. -On the other hand, handling errors in the InputFormat has the 
> disadvantage that it may hide some inefficiency from the user. E.g. if I have 
> a very big scanner.caching, and I manage to process only a few rows within the 
> timeout, I will end up with a single row being fetched many times (and will 
> not be explicitly notified about this). Could we solve this problem by adding 
> some counter to the InputFormat?-
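
The idea of retrying after any IOException while counting the restarts could look roughly like the sketch below. It uses only the JDK and does not touch the real TableInputFormat or scanner classes; RetryingReader, its fetch and reopen callbacks, and the restart limit are all hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical wrapper that restarts a read after any IOException. */
final class RetryingReader {
  // number of restarts performed so far; could be exposed as a job counter
  private final AtomicLong restarts = new AtomicLong();
  private final int maxRestarts;

  RetryingReader(int maxRestarts) {
    this.maxRestarts = maxRestarts;
  }

  /**
   * Runs fetch; on IOException calls reopen and tries again, up to
   * maxRestarts times. The last IOException is rethrown once the limit
   * is reached.
   */
  <T> T read(Callable<T> fetch, Runnable reopen) throws Exception {
    int attempts = 0;
    while (true) {
      try {
        return fetch.call();
      } catch (IOException e) {
        if (attempts++ >= maxRestarts) {
          throw e;
        }
        restarts.incrementAndGet();
        reopen.run();
      }
    }
  }

  long restartCount() {
    return restarts.get();
  }
}
```

Exposing restartCount addresses the concern in the struck-through text: a job that silently refetches the same rows many times would show a high restart count.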





[jira] [Created] (HBASE-5874) The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException.

2012-04-25 Thread fulin wang (JIRA)
fulin wang created HBASE-5874:
-

 Summary: The HBase do not configure the 'fs.default.name' 
attribute, the hbck tool and Merge tool throw IllegalArgumentException.
 Key: HBASE-5874
 URL: https://issues.apache.org/jira/browse/HBASE-5874
 Project: HBase
  Issue Type: Bug
  Components: hbck
Reporter: fulin wang


When HBase does not configure the 'fs.default.name' attribute, the hbck tool 
and the Merge tool throw IllegalArgumentException.
For the hbck tool and the Merge tool, we should set the 'fs.default.name' 
attribute in the code.

hbck exception:
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:128)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:301)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:489)
    at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegioninfo(HBaseFsck.java:565)
    at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegionInfos(HBaseFsck.java:596)
    at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:332)
    at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:360)
    at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2907)

Merge exception:
[2012-05-05 10:48:24,830] [ERROR] [main] [org.apache.hadoop.hbase.util.Merge 381] exiting due to error
java.lang.IllegalArgumentException: Wrong FS: hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:823)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:415)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2679)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2665)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2634)
    at org.apache.hadoop.hbase.util.MetaUtils.openMetaRegion(MetaUtils.java:276)
    at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:261)
    at org.apache.hadoop.hbase.util.Merge.run(Merge.java:115)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.hbase.util.Merge.main(Merge.java:379)
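
Both traces show an hdfs:// path being handed to a file system whose default is file:///. The sketch below imitates that kind of scheme check with java.net.URI and shows how deriving the default file system from the root directory avoids the mismatch. It is a simplified illustration, not Hadoop's implementation: FsCheck, checkPath and defaultFsFor are hypothetical names, and the example root directory is assumed from the paths in the traces.

```java
import java.net.URI;

/** Hypothetical, simplified imitation of a "Wrong FS" scheme check. */
final class FsCheck {
  /** Throws if path names a scheme different from the default file system's. */
  static void checkPath(URI defaultFs, URI path) {
    String scheme = path.getScheme();
    if (scheme != null && !scheme.equalsIgnoreCase(defaultFs.getScheme())) {
      throw new IllegalArgumentException(
          "Wrong FS: " + path + ", expected: " + defaultFs);
    }
  }

  /** Builds a default-FS URI (scheme and authority only) from a root dir. */
  static URI defaultFsFor(URI rootDir) {
    String authority = rootDir.getAuthority() == null ? "" : rootDir.getAuthority();
    return URI.create(rootDir.getScheme() + "://" + authority + "/");
  }
}
```

For a root directory of hdfs://160.176.0.101:9000/hbase, defaultFsFor yields hdfs://160.176.0.101:9000/, and the .regioninfo path from the traces then passes checkPath, whereas it is rejected against file:///.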





[jira] [Commented] (HBASE-5871) Usability regression, we don't parse compression algos anymore

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261420#comment-13261420
 ] 

Hudson commented on HBASE-5871:
---

Integrated in HBase-TRUNK #2811 (See 
[https://builds.apache.org/job/HBase-TRUNK/2811/])
HBASE-5871 Usability regression, we don't parse compression algos anymore 
(Revision 1330123)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/src/main/ruby/hbase/admin.rb


> Usability regression, we don't parse compression algos anymore
> --
>
> Key: HBASE-5871
> URL: https://issues.apache.org/jira/browse/HBASE-5871
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5871-0.92.txt, 5871-0.94.txt, 5871-trunk.txt
>
>
> It seems that starting with 0.92.0 we can't create tables in the shell by 
> specifying "lzo" anymore. I remember we used to do better parsing than that, 
> but right now if you follow the wiki doing this:
> bq. create 'mytable', {NAME=>'colfam:', COMPRESSION=>'lzo'}
> You'll get:
> bq. ERROR: java.lang.IllegalArgumentException: No enum const class 
> org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.lzo
> Bad for usability.
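
The error above is what Enum.valueOf produces for a name whose case does not match a constant. The committed change is in admin.rb (Ruby); the Java sketch below only illustrates the underlying behaviour and one way to normalise the name first. The nested Algorithm enum is a stand-in with assumed constant names, and CompressionNames.parse is a hypothetical helper.

```java
import java.util.Locale;

/** Hypothetical helper showing case-insensitive enum lookup. */
final class CompressionNames {
  /** Stand-in for Compression.Algorithm; constant names are assumed. */
  enum Algorithm { LZO, GZ, NONE }

  /** Upper-cases the user-supplied name before the exact-match lookup. */
  static Algorithm parse(String name) {
    return Algorithm.valueOf(name.trim().toUpperCase(Locale.ROOT));
  }
}
```

Here Algorithm.valueOf("lzo") throws IllegalArgumentException, while CompressionNames.parse("lzo") returns Algorithm.LZO.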





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261421#comment-13261421
 ] 

Hudson commented on HBASE-5849:
---

Integrated in HBase-TRUNK #2811 (See 
[https://builds.apache.org/job/HBase-TRUNK/2811/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not 
available; REAPPLY (Revision 1330116)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java


> On first cluster startup, RS aborts if root znode is not available
> --
>
> Key: HBASE-5849
> URL: https://issues.apache.org/jira/browse/HBASE-5849
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver, zookeeper
>Affects Versions: 0.92.2, 0.96.0, 0.94.1
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.92.2, 0.94.0
>
> Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, 
> HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, 
> HBASE-5849_v4.patch
>
>
> When launching a fresh new cluster, the master has to be started first, which 
> might create race conditions when the master and RS are started at the same 
> time. 
> Master startup code is something like this: 
>  - establish zk connection
>  - create root znodes in zk (/hbase)
>  - create ephemeral node for master /hbase/master
> Region server startup code is something like this: 
>  - establish zk connection
>  - check whether the root znode (/hbase) is there. If not, shut down. 
>  - wait for the master to create znodes /hbase/master
> So the problem is that on the very first launch of the cluster, the RS aborts 
> on startup since the /hbase znode might not have been created yet (only the 
> master creates it if needed). Since /hbase/ is not deleted on cluster 
> shutdown, on subsequent cluster starts it does not matter in which order the 
> servers are started. So this affects only first launches. 
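
One way to remove the ordering dependency described above is for the region server to poll for the root znode for a bounded time instead of shutting down on the first miss. The sketch below shows only that polling loop, using a BooleanSupplier in place of a real ZooKeeper existence check; ZnodeWait, waitFor and its parameters are hypothetical and do not reflect the committed patch.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical bounded wait for a znode to appear. */
final class ZnodeWait {
  /**
   * Calls exists up to maxChecks times, sleeping sleepMillis between
   * checks. Returns true as soon as exists reports true, false if the
   * node never appeared.
   */
  static boolean waitFor(BooleanSupplier exists, int maxChecks, long sleepMillis)
      throws InterruptedException {
    for (int i = 0; i < maxChecks; i++) {
      if (exists.getAsBoolean()) {
        return true;
      }
      if (i < maxChecks - 1) {
        Thread.sleep(sleepMillis);
      }
    }
    return false;
  }
}
```

A region server using such a loop would abort only after the wait is exhausted, so starting it shortly before the master on a fresh cluster would no longer be fatal.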





[jira] [Commented] (HBASE-5871) Usability regression, we don't parse compression algos anymore

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261422#comment-13261422
 ] 

Hudson commented on HBASE-5871:
---

Integrated in HBase-0.94 #148 (See 
[https://builds.apache.org/job/HBase-0.94/148/])
HBASE-5871 Usability regression, we don't parse compression algos anymore 
(Revision 1330124)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/main/ruby/hbase/admin.rb






[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261423#comment-13261423
 ] 

Hudson commented on HBASE-5849:
---

Integrated in HBase-0.94 #148 (See 
[https://builds.apache.org/job/HBase-0.94/148/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not 
available; REAPPLY (Revision 1330117)

 Result = SUCCESS
stack : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java






[jira] [Updated] (HBASE-5874) The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException.

2012-04-25 Thread fulin wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fulin wang updated HBASE-5874:
--

Affects Version/s: 0.90.6

> The HBase do not configure the 'fs.default.name' attribute, the hbck tool and 
> Merge tool throw IllegalArgumentException.
> 
>
> Key: HBASE-5874
> URL: https://issues.apache.org/jira/browse/HBASE-5874
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.90.6
>Reporter: fulin wang




[jira] [Assigned] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-25 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean reassigned HBASE-5611:
---

Assignee: Jieshan Bean

> Replayed edits from regions that failed to open during recovery aren't 
> removed from the global MemStore size
> 
>
> Key: HBASE-5611
> URL: https://issues.apache.org/jira/browse/HBASE-5611
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: Jean-Daniel Cryans
>Assignee: Jieshan Bean
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
>
> This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
> it's still possible to hit it if a region fails to open for more obscure 
> reasons like HDFS errors.
> Consider a region that just went through distributed splitting and that's now 
> being opened by a new RS. The first thing it does is to read the recovery 
> files and put the edits in the {{MemStores}}. If this process takes a long 
> time, the master will move that region away. At that point the edits are 
> still accounted for in the global {{MemStore}} size but they are dropped when 
> the {{HRegion}} gets cleaned up. It's completely invisible until the 
> {{MemStoreFlusher}} needs to force flush a region and none of them have 
> edits:
> {noformat}
> 2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
> 2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
> java.lang.IllegalStateException
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
>     at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The {{null}} here is a region. In my case I had so many edits in the 
> {{MemStore}} during recovery that I'm over the low barrier although in fact 
> I'm at 0. It happened yesterday and it is still printing this out.
> To fix this we need to be able to decrease the global {{MemStore}} size when 
> the region can't open.
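
The fix direction named in the last sentence, decreasing the global size when the open fails, can be sketched as follows. This is a JDK-only illustration with hypothetical names (GlobalMemStoreAccounting, openRegion, finishOpen); it is not taken from any attached patch and does not use the real HRegion or MemStoreFlusher classes.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical global MemStore size counter with rollback on failed open. */
final class GlobalMemStoreAccounting {
  private final AtomicLong globalSize = new AtomicLong();

  long get() {
    return globalSize.get();
  }

  /**
   * Adds each replayed edit to the global size, then runs finishOpen.
   * If finishOpen throws, the bytes added for this region are subtracted
   * again so the counter does not keep edits that were dropped.
   */
  boolean openRegion(long[] replayedEditSizes, Runnable finishOpen) {
    long replayed = 0;
    try {
      for (long size : replayedEditSizes) {
        globalSize.addAndGet(size);
        replayed += size;
      }
      finishOpen.run();
      return true;
    } catch (RuntimeException e) {
      globalSize.addAndGet(-replayed);
      return false;
    }
  }
}
```

Without the subtraction in the catch block, a region that is moved away mid-replay would leave the counter above zero with no region left to flush, which matches the log excerpt above.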





[jira] [Created] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-5875:
-

 Summary: Process RIT and Master restart may remove an online 
server considering it as a dead server
 Key: HBASE-5875
 URL: https://issues.apache.org/jira/browse/HBASE-5875
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.94.0


If, on restart, the master finds ROOT/META to be in RIT state, it tries to 
assign the ROOT region through ProcessRIT.

The master triggers the assignment and then tries to verify the root region 
location.
Root region location verification is done by checking whether the RS has the 
region in its online list.
If the assignment triggered by the master has not yet completed on the RS, 
the root region location verification fails.
Because it failed, in
{code}
splitLogAndExpireIfOnline(currentRootServer);
{code}
we split the log and also remove the server from the online server list. 
Ideally there is nothing to do in the log split here, as no region server was 
restarted.

So although the server is online, the master invalidates the region server.
In the special case where there is only one RS, the cluster becomes 
inoperative.
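
The decision described above can be written as a small table: a failed verification should lead to log splitting and expiry only when no master-triggered assignment is still in flight. The sketch below is one possible reading of that idea, not the eventual fix; RootVerification, Action and onVerifyResult are hypothetical names.

```java
/** Hypothetical decision helper for a failed root location verification. */
final class RootVerification {
  enum Action { NONE, WAIT_FOR_ASSIGNMENT, SPLIT_LOG_AND_EXPIRE }

  /**
   * verified: the RS reported the root region in its online list.
   * assignmentInProgress: the master itself has just triggered an
   * assignment that the RS may not have finished yet.
   */
  static Action onVerifyResult(boolean verified, boolean assignmentInProgress) {
    if (verified) {
      return Action.NONE;
    }
    if (assignmentInProgress) {
      // The server is alive; the region simply is not open yet.
      return Action.WAIT_FOR_ASSIGNMENT;
    }
    return Action.SPLIT_LOG_AND_EXPIRE;
  }
}
```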






[jira] [Updated] (HBASE-5874) The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException.

2012-04-25 Thread fulin wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fulin wang updated HBASE-5874:
--

Attachment: HBASE-5874-0.90.patch

Tests  Errors  Failures  Skipped  Success Rate  Time
16     0       0         0        100%          243.016

Running org.apache.hadoop.hbase.util.TestMergeTool
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.765 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0




> The HBase do not configure the 'fs.default.name' attribute, the hbck tool and 
> Merge tool throw IllegalArgumentException.
> 
>
> Key: HBASE-5874
> URL: https://issues.apache.org/jira/browse/HBASE-5874
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.90.6
>Reporter: fulin wang
> Attachments: HBASE-5874-0.90.patch
>
>
> When HBase does not configure the 'fs.default.name' attribute, the hbck tool 
> and Merge tool throw IllegalArgumentException.
> For the hbck tool and Merge tool, we should add the 'fs.default.name' 
> attribute in the code.
> hbck exception:
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: 
> file:///
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:128)
>   at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:301)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:489)
>   at 
> org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegioninfo(HBaseFsck.java:565)
>   at 
> org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegionInfos(HBaseFsck.java:596)
>   at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:332)
>   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:360)
>   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2907)
> 
> Merge exception:  
> [2012-05-05 10:48:24,830] [ERROR] [main] [org.apache.hadoop.hbase.util.Merge 
> 381] exiting due to error
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: 
> file:///
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:823)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:415)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2679)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2665)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2634)
>   at 
> org.apache.hadoop.hbase.util.MetaUtils.openMetaRegion(MetaUtils.java:276)
>   at 
> org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:261)
>   at org.apache.hadoop.hbase.util.Merge.run(Merge.java:115)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>   at org.apache.hadoop.hbase.util.Merge.main(Merge.java:379)
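
Both traces fail in FileSystem.checkPath with "expected: file:///", which is Hadoop's default file system when 'fs.default.name' is not set. The sketch below is a simplified model of that kind of scheme check, written only to show why an hdfs:// path is rejected by a local file system. The class WrongFsSketch and its checkPath method are hypothetical and are not Hadoop's actual implementation.

```java
import java.net.URI;

/** Simplified model of a file-system scheme check; NOT Hadoop's actual code. */
final class WrongFsSketch {

    /** Throws if path names a different scheme than the file system at fsUri. */
    static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return; // scheme-less paths are resolved against the file system itself
        }
        if (!scheme.equalsIgnoreCase(fsUri.getScheme())) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }
}
```

Calling checkPath with a file system URI of hdfs://160.176.0.101:9000 accepts the .regioninfo path from the report, while a file system URI of file:/// rejects it, which matches the situation where the tools never set 'fs.default.name' before obtaining the file system.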





[jira] [Updated] (HBASE-5874) The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException.

2012-04-25 Thread fulin wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fulin wang updated HBASE-5874:
--

Attachment: (was: HBASE-5874-0.90.patch)






[jira] [Updated] (HBASE-5874) The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException.

2012-04-25 Thread fulin wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fulin wang updated HBASE-5874:
--

Attachment: HBASE-5874-0.90.patch






[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261447#comment-13261447
 ] 

Jonathan Hsieh commented on HBASE-5870:
---

In trunk, this bug was introduced in HBASE-5760.

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261469#comment-13261469
 ] 

Jonathan Hsieh commented on HBASE-5870:
---

@LarsH Hm, looks like this got committed here as HBASE-5848:

{code}

r1330105 | larsh | 2012-04-24 22:07:15 -0700 (Tue, 24 Apr 2012) | 1 line

HBASE-5848 Addendum
{code} 

Maybe we should revert that and then recommit with a more appropriate commit 
message?








[jira] [Commented] (HBASE-5871) Usability regression, we don't parse compression algos anymore

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261474#comment-13261474
 ] 

Hudson commented on HBASE-5871:
---

Integrated in HBase-0.92 #390 (See 
[https://builds.apache.org/job/HBase-0.92/390/])
HBASE-5871 Usability regression, we don't parse compression algos anymore 
(Revision 1330122)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/ruby/hbase/admin.rb


> Usability regression, we don't parse compression algos anymore
> --
>
> Key: HBASE-5871
> URL: https://issues.apache.org/jira/browse/HBASE-5871
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5871-0.92.txt, 5871-0.94.txt, 5871-trunk.txt
>
>
> It seems that starting with 0.92.0 we can't create tables in the shell by 
> specifying "lzo" anymore. I remember we used to do better parsing than that, 
> but right now if you follow the wiki doing this:
> bq. create 'mytable', {NAME=>'colfam:', COMPRESSION=>'lzo'}
> You'll get:
> bq. ERROR: java.lang.IllegalArgumentException: No enum const class 
> org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.lzo
> Bad for usability.
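
The "No enum const class" message is what Java's Enum.valueOf produces when the given name does not match a constant exactly, including case. A minimal sketch of case-insensitive parsing follows. The nested Algorithm enum is a stand-in with illustrative constants, and CompressionNames.parse is a hypothetical helper, not the change made in the attached patches.

```java
import java.util.Locale;

/** Hypothetical helper showing case-insensitive enum lookup. */
final class CompressionNames {

    /** Stand-in for the real enum; the constants here are illustrative. */
    enum Algorithm { LZO, GZ, NONE }

    /** Normalizes the user-supplied name before the exact-match lookup. */
    static Algorithm parse(String name) {
        return Algorithm.valueOf(name.trim().toUpperCase(Locale.ROOT));
    }
}
```

With this normalization, parse("lzo") resolves to Algorithm.LZO, whereas a direct Algorithm.valueOf("lzo") throws IllegalArgumentException.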





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261475#comment-13261475
 ] 

Hudson commented on HBASE-5849:
---

Integrated in HBase-0.92 #390 (See 
[https://builds.apache.org/job/HBase-0.92/390/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not 
available; REAPPLY (Revision 1330119)
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; 
REAPPLY (Revision 1330118)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> On first cluster startup, RS aborts if root znode is not available
> --
>
> Key: HBASE-5849
> URL: https://issues.apache.org/jira/browse/HBASE-5849
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver, zookeeper
>Affects Versions: 0.92.2, 0.96.0, 0.94.1
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.92.2, 0.94.0
>
> Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, 
> HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, 
> HBASE-5849_v4.patch
>
>
> When launching a fresh new cluster, the master has to be started first, which 
> might create race conditions when starting the master and RS at the same time. 
> Master startup code is something like this: 
>  - establish zk connection
>  - create root znodes in zk (/hbase)
>  - create ephemeral node for master /hbase/master
> Region server startup code is something like this: 
>  - establish zk connection
>  - check whether the root znode (/hbase) is there. If not, shut down. 
>  - wait for the master to create znodes /hbase/master
> So the problem is that on the very first launch of the cluster, the RS aborts 
> at startup since the /hbase znode might not have been created yet (only the 
> master creates it if needed). Since /hbase/ is not deleted on cluster shutdown, 
> on subsequent cluster starts it does not matter in which order the servers are 
> started. So this affects only first launches. 
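
One way to avoid the abort described above is for the region server to wait a bounded time for the root znode instead of shutting down on the first check. The sketch below models that idea only. The Set of paths is an in-memory stand-in for the ZooKeeper namespace, and RootZNodeWait.waitForZNode is a hypothetical helper; this is not necessarily what the attached patches do.

```java
import java.util.Set;

/** Hypothetical sketch: poll for a znode instead of aborting when it is absent. */
final class RootZNodeWait {

    /**
     * Returns true once path is present in znodes, or false if timeoutMs elapses
     * first. The set stands in for ZooKeeper; a real RS would query ZooKeeper.
     */
    static boolean waitForZNode(Set<String> znodes, String path,
                                long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!znodes.contains(path)) {
            if (System.nanoTime() >= deadline) {
                return false; // give up only after the timeout, not on first miss
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

In this model, a master thread that adds "/hbase" shortly after the region server starts polling no longer causes the region server to abort, so the start order on first launch stops mattering.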





[jira] [Commented] (HBASE-5861) Hadoop 23 compilation broken due to tests introduced in HBASE-5604

2012-04-25 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261477#comment-13261477
 ] 

Jonathan Hsieh commented on HBASE-5861:
---

Oops, commit message should say HBASE-5604 instead of HBASE-5064.

> Hadoop 23 compilation broken due to tests introduced in HBASE-5604
> --
>
> Key: HBASE-5861
> URL: https://issues.apache.org/jira/browse/HBASE-5861
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5861-v4.patch, 5861.txt, hbase-5861-jon.patch, 
> hbase-5861-v2.patch, hbase-5861-v3.patch
>
>
> When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of 
> compilation error messages:
> {code}
> jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests
> ...
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 18.926s
> [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012
> [INFO] Final Memory: 55M/555M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure: Compilation 
> failure:
> [ERROR] 
> /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46]
>  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
> [ERROR] 
> [ERROR] 
> /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29]
>  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
> [ERROR] 
> [ERROR] 
> /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46]
>  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
> [ERROR] 
> [ERROR] 
> /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29]
>  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
> [ERROR] 
> [ERROR] 
> /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29]
>  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
> [ERROR] 
> [ERROR] 
> /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29]
>  org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be 
> instantiated
> [ERROR] -> [Help 1]
> {code}
> Upon further investigation this issue is due to code introduced in HBASE-5604 
> and is also present in trunk.  





[jira] [Assigned] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan reassigned HBASE-5875:
-

Assignee: ramkrishna.s.vasudevan

> Process RIT and Master restart may remove an online server considering it as 
> a dead server
> --
>
> Key: HBASE-5875
> URL: https://issues.apache.org/jira/browse/HBASE-5875
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.0
>
>
> If, on restart, the master finds ROOT/META to be in RIT state, it tries 
> to assign the ROOT region through ProcessRIT.
> The master triggers the assignment and then tries to verify the root 
> region location.
> Root region location verification checks whether the RS has the region in 
> its online list.
> If the assignment triggered by the master has not yet completed on the RS, 
> the root region location verification fails.
> Because it failed, 
> {code}
> splitLogAndExpireIfOnline(currentRootServer);
> {code}
> is called: we split the log and also remove the server from the online server 
> list. Ideally there is nothing for the log split to do here, as no region 
> server was restarted.
> So although the server is online, the master simply invalidates the region 
> server.
> In the special case where there is only one RS, the cluster becomes 
> inoperative.





[jira] [Assigned] (HBASE-5806) Handle split region related failures on master restart and RS restart

2012-04-25 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam reassigned HBASE-5806:
---

Assignee: Chinna Rao Lalam

> Handle split region related failures on master restart and RS restart
> -
>
> Key: HBASE-5806
> URL: https://issues.apache.org/jira/browse/HBASE-5806
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: ramkrishna.s.vasudevan
>Assignee: Chinna Rao Lalam
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
>
> This issue is raised to solve problems that arise when a region split has 
> only partially happened and the region node in ZK, which is in the 
> RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT state, is not yet processed.
> This also tries to address HBASE-5615.





[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261648#comment-13261648
 ] 

Zhihong Yu commented on HBASE-5848:
---

Looks like the addendum wasn't applied to trunk.

> Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
> abort
> 
>
> Key: HBASE-5848
> URL: https://issues.apache.org/jira/browse/HBASE-5848
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
> 5848-addendum-v4.txt, 5848-addendum-v5.txt, 5848-addendum-v6.txt, 
> 5848-addendum-v7.txt, 5848-addendum-v7.txt, HBASE-5848.patch, 
> HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch
>
>
> A coworker of mine just had this scenario. It does not make sense to pass 
> EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
> implicit), but it should not cause the HMaster to abort.
> The abort happens because it tries to bulk assign the same region twice and 
> then runs into race conditions with ZK.
> The same would (presumably) happen when two identical split keys are passed, 
> but the client blocks that. The simplest solution here is to have the client 
> also block null or EMPTY_START_ROW passed as a split key.
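
The client-side check proposed above can be sketched as follows. SplitKeyCheck.validate is a hypothetical helper, not the code in the attached patches, and EMPTY_START_ROW is modelled here as a zero-length byte array.

```java
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical client-side validation of split keys before table creation. */
final class SplitKeyCheck {

    /** Local stand-in for the empty start row: a zero-length byte array. */
    static final byte[] EMPTY_START_ROW = new byte[0];

    /** Rejects null, empty, and duplicate split keys. */
    static void validate(byte[][] splitKeys) {
        if (splitKeys == null) {
            return; // no pre-splitting requested
        }
        Set<ByteBuffer> seen = new HashSet<>();
        for (byte[] key : splitKeys) {
            if (key == null || key.length == 0) {
                throw new IllegalArgumentException(
                    "Split key must not be null or EMPTY_START_ROW");
            }
            // ByteBuffer compares by content, so equal byte arrays collide here.
            if (!seen.add(ByteBuffer.wrap(key))) {
                throw new IllegalArgumentException("Duplicate split key");
            }
        }
    }
}
```

Rejecting these inputs on the client means the master never receives a request that would make it bulk assign the same region twice.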





[jira] [Updated] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5864:
--

Attachment: HBASE-5864_3.patch

> Error while reading from hfile in 0.94
> --
>
> Key: HBASE-5864
> URL: https://issues.apache.org/jira/browse/HBASE-5864
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
> HBASE-5864_3.patch, HBASE-5864_test.patch
>
>
> Got the following stacktrace during region split.
> {noformat}
> 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
> Failed getting store size for value
> java.io.IOException: Requested block is out of range: 2906737606134037404, 
> lastDataBlockOffset: 84764558
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
>   at 
> org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
> {noformat}





[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261654#comment-13261654
 ] 

ramkrishna.s.vasudevan commented on HBASE-5864:
---

Updated the patch.  All test cases passed.  Verified a scenario where I had 
multi-level indexes and was able to get the correct midkey.
Please review.






[jira] [Commented] (HBASE-5829) Inconsistency between the "regions" map and the "servers" map in AssignmentManager

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261655#comment-13261655
 ] 

Zhihong Yu commented on HBASE-5829:
---

Patch makes sense.
w.r.t. this.servers, I found a useless statement (at least in trunk):
{code}
  void unassignCatalogRegions() {
    this.servers.entrySet();
{code}
that should be removed.

> Inconsistency between the "regions" map and the "servers" map in 
> AssignmentManager
> --
>
> Key: HBASE-5829
> URL: https://issues.apache.org/jira/browse/HBASE-5829
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1
>Reporter: Maryann Xue
> Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch
>
>
> There are occurrences in AM where this.servers is not kept consistent with 
> this.regions. This might cause balancer to offline a region from the RS that 
> already returned NotServingRegionException at a previous offline attempt.
> In AssignmentManager.unassign(HRegionInfo, boolean)
> try {
>   // TODO: We should consider making this look more like it does for the
>   // region open where we catch all throwables and never abort
>   if (serverManager.sendRegionClose(server, state.getRegion(),
> versionOfClosingNode)) {
> LOG.debug("Sent CLOSE to " + server + " for region " +
>   region.getRegionNameAsString());
> return;
>   }
>   // This never happens. Currently regionserver close always return true.
>   LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
> region.getRegionNameAsString());
> } catch (NotServingRegionException nsre) {
>   LOG.info("Server " + server + " returned " + nsre + " for " +
> region.getRegionNameAsString());
>   // Presume that master has stale data.  Presume remote side just split.
>   // Presume that the split message when it comes in will fix up the 
> master's
>   // in memory cluster state.
> } catch (Throwable t) {
>   if (t instanceof RemoteException) {
> t = ((RemoteException)t).unwrapRemoteException();
> if (t instanceof NotServingRegionException) {
>   if (checkIfRegionBelongsToDisabling(region)) {
> // Remove from the regionsinTransition map
> LOG.info("While trying to recover the table "
> + region.getTableNameAsString()
> + " to DISABLED state the region " + region
> + " was offlined but the table was in DISABLING state");
> synchronized (this.regionsInTransition) {
>   this.regionsInTransition.remove(region.getEncodedName());
> }
> // Remove from the regionsMap
> synchronized (this.regions) {
>   this.regions.remove(region);
> }
> deleteClosingOrClosedNode(region);
>   }
> }
> // RS is already processing this region, only need to update the 
> timestamp
> if (t instanceof RegionAlreadyInTransitionException) {
>   LOG.debug("update " + state + " the timestamp.");
>   state.update(state.getState());
> }
>   }
> In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, 
> boolean)
>   synchronized (this.regions) {
> this.regions.put(plan.getRegionInfo(), plan.getDestination());
>   }
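
To make the inconsistency described above concrete, here is a minimal sketch,
assuming nothing about the attached patches. It is not the AssignmentManager
code: RegionBook, assign, offline, isConsistent and regionsOn are hypothetical
names, and plain Strings stand in for HRegionInfo and server names. The point
is only that the region-to-server map and the server-to-regions map are updated
together under one lock, so a region dropped from one is dropped from the other.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: two views of the same assignment state, always updated together. */
final class RegionBook {
  // region -> hosting server (plays the role of "regions" in the issue)
  private final Map<String, String> regions = new HashMap<String, String>();
  // server -> regions it hosts (plays the role of "servers" in the issue)
  private final Map<String, Set<String>> servers = new HashMap<String, Set<String>>();

  /** Records that region is hosted on server, moving it off any previous server. */
  synchronized void assign(String region, String server) {
    offline(region);
    regions.put(region, server);
    Set<String> hosted = servers.get(server);
    if (hosted == null) {
      hosted = new TreeSet<String>();
      servers.put(server, hosted);
    }
    hosted.add(region);
  }

  /** Removes region from both maps; a no-op if the region is unknown. */
  synchronized void offline(String region) {
    String server = regions.remove(region);
    if (server == null) {
      return;
    }
    Set<String> hosted = servers.get(server);
    if (hosted != null) {
      hosted.remove(region);
    }
  }

  /** Returns true when every entry of one map is mirrored in the other. */
  synchronized boolean isConsistent() {
    for (Map.Entry<String, String> e : regions.entrySet()) {
      Set<String> hosted = servers.get(e.getValue());
      if (hosted == null || !hosted.contains(e.getKey())) {
        return false;
      }
    }
    for (Map.Entry<String, Set<String>> e : servers.entrySet()) {
      for (String region : e.getValue()) {
        if (!e.getKey().equals(regions.get(region))) {
          return false;
        }
      }
    }
    return true;
  }

  /** Returns a copy of the regions currently recorded for server. */
  synchronized Set<String> regionsOn(String server) {
    Set<String> hosted = servers.get(server);
    return hosted == null ? new TreeSet<String>() : new TreeSet<String>(hosted);
  }
}
```

With both updates behind one method, a caller that handles a
NotServingRegionException by calling offline(region) cannot leave the region
listed under the old server, which is the stale state the description warns
about.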





[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261686#comment-13261686
 ] 

Hadoop QA commented on HBASE-5864:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524272/HBASE-5864_3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestRegionRebalancing

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1644//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1644//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1644//console

This message is automatically generated.

> Error while reading from hfile in 0.94
> --
>
> Key: HBASE-5864
> URL: https://issues.apache.org/jira/browse/HBASE-5864
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
> HBASE-5864_3.patch, HBASE-5864_test.patch
>
>
> Got the following stacktrace during region split.
> {noformat}
> 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
> Failed getting store size for value
> java.io.IOException: Requested block is out of range: 2906737606134037404, 
> lastDataBlockOffset: 84764558
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
>   at 
> org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
> {noformat}





[jira] [Updated] (HBASE-5840) Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing the old status

2012-04-25 Thread rajeshbabu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu updated HBASE-5840:
--

Attachment: HBASE-5840_v2.patch

> Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing 
> the old status
> --
>
> Key: HBASE-5840
> URL: https://issues.apache.org/jira/browse/HBASE-5840
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-5840.patch, HBASE-5840_v2.patch
>
>
> TaskMonitor status is not cleared when a region ends up in FAILED_OPEN, so the 
> old status keeps showing.
> This misleads the user.
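
One common way to avoid a stale status like the one described above is to
clear the entry in a finally block, so the failure path cleans up as well. The
sketch below only illustrates that pattern; StatusBoard and its methods are
hypothetical names, not the TaskMonitor API, and it is not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical status board whose entries must be cleared on failure too. */
final class StatusBoard {
  private final List<String> active = new ArrayList<String>();

  synchronized void begin(String task) {
    active.add(task);
  }

  synchronized void clear(String task) {
    active.remove(task);
  }

  /** Returns a copy of the statuses that are still being displayed. */
  synchronized List<String> snapshot() {
    return new ArrayList<String>(active);
  }

  /** Runs an open attempt; the finally block clears the entry on success and on failure. */
  boolean openRegion(String region, Runnable openAttempt) {
    String task = "Opening region " + region;
    begin(task);
    try {
      openAttempt.run();
      return true;
    } catch (RuntimeException e) {
      // Stand-in for the FAILED_OPEN path.
      return false;
    } finally {
      clear(task);
    }
  }
}
```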





[jira] [Updated] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread rajeshbabu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu updated HBASE-5873:
--

Attachment: HBASE-5873.patch

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5873.patch
>
>
> Currently the timeout monitor thread is started even before any region server 
> has registered with the master.
> In the timeout monitor we depend on the region servers being online:
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().isEmpty();
> {code}
> When the master starts up, it sees that there are no online servers and hence 
> sets allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time an RS has come up.
> When the timeout monitor runs again in the next cycle (after 10 secs), it sees 
> allRSsOffline as false.
> Hence 
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
>   // if some RSs just came back online, we can start
>   // the assignment right away
>   actOnTimeOut(regionState);
> {code}
> This condition makes it take action based on timeout.
> Because of this, even if an assignment of ROOT is already in progress, this 
> piece of code triggers another assignment and we get a 
> RegionAlreadyInTransitionException. We then need to wait 30 mins for ROOT 
> itself to be assigned.
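
The sequence above can be modelled in a few lines. This is a sketch under
stated assumptions, not the TimeoutMonitor code or the attached patch:
TimeoutMonitorModel, cycle and waitForFirstServer are hypothetical names, and
the model assumes the flag is updated at the end of each cycle. It shows the
"servers just came back" branch firing on the first cycle that sees a server,
and how holding the monitor back until one server has registered avoids that.

```java
/** Hypothetical model of the timeout monitor's per-cycle decision. */
final class TimeoutMonitorModel {
  private boolean started;
  private boolean allRegionServersOffline;

  /** If waitForFirstServer is true, cycles are skipped until a server is seen. */
  TimeoutMonitorModel(boolean waitForFirstServer) {
    this.started = !waitForFirstServer;
  }

  /** One cycle; returns true if the "servers just came back online" branch would fire. */
  boolean cycle(int onlineServers) {
    if (!started) {
      if (onlineServers == 0) {
        return false; // monitor not started yet, nothing recorded
      }
      started = true;
    }
    boolean allRSsOffline = onlineServers == 0;
    boolean act = this.allRegionServersOffline && !allRSsOffline;
    // Remember this cycle's view for the next cycle.
    this.allRegionServersOffline = allRSsOffline;
    return act;
  }
}
```

In the guarded variant the branch still fires after a genuine outage (a cycle
with zero servers once the monitor is running, followed by a cycle with at
least one), which is the case the original condition was written for.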





[jira] [Updated] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread rajeshbabu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu updated HBASE-5873:
--

Status: Patch Available  (was: Open)

Attached patch. Please review and provide your comments/suggestions.







[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261713#comment-13261713
 ] 

Hadoop QA commented on HBASE-5873:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524287/HBASE-5873.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1645//console

This message is automatically generated.






[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261717#comment-13261717
 ] 

Lars Hofhansl commented on HBASE-5870:
--

Lemme clean up this mess.

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261719#comment-13261719
 ] 

Lars Hofhansl commented on HBASE-5870:
--

Reverted.






[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261721#comment-13261721
 ] 

Lars Hofhansl commented on HBASE-5848:
--

Yep, I committed the wrong patch.
Reverted over in HBASE-5870, and committed the right patch. Sorry about that.

> Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
> abort
> 
>
> Key: HBASE-5848
> URL: https://issues.apache.org/jira/browse/HBASE-5848
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
> 5848-addendum-v4.txt, 5848-addendum-v5.txt, 5848-addendum-v6.txt, 
> 5848-addendum-v7.txt, 5848-addendum-v7.txt, HBASE-5848.patch, 
> HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch
>
>
> A coworker of mine just had this scenario. It does not make sense to pass 
> EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
> implicit), but it should not cause the HMaster to abort.
> The abort happens because it tries to bulk assign the same region twice and 
> then runs into race conditions with ZK.
> The same would (presumably) happen when two identical split keys are passed, 
> but the client blocks that. The simplest solution here is for the client to 
> also reject null or EMPTY_START_ROW passed as a split key.
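
A client-side check along the lines proposed above could look like the
following. This is a minimal sketch, not the HBaseAdmin code or any of the
attached patches; SplitKeyCheck, BYTES and validate are hypothetical names, and
the unsigned lexicographic comparison is an assumption about how row keys are
ordered.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical client-side validation of split keys before creating a table. */
final class SplitKeyCheck {
  /** Unsigned lexicographic order over byte arrays. */
  static final Comparator<byte[]> BYTES = new Comparator<byte[]>() {
    public int compare(byte[] a, byte[] b) {
      int n = Math.min(a.length, b.length);
      for (int i = 0; i < n; i++) {
        int d = (a[i] & 0xff) - (b[i] & 0xff);
        if (d != 0) {
          return d;
        }
      }
      return a.length - b.length;
    }
  };

  /** Throws IllegalArgumentException for null, empty, or duplicate split keys. */
  static void validate(byte[][] splitKeys) {
    if (splitKeys == null) {
      return; // no pre-splitting requested
    }
    for (byte[] key : splitKeys) {
      if (key == null || key.length == 0) {
        throw new IllegalArgumentException("Split key must not be null or empty");
      }
    }
    byte[][] sorted = splitKeys.clone(); // leave the caller's array order alone
    Arrays.sort(sorted, BYTES);
    for (int i = 1; i < sorted.length; i++) {
      if (BYTES.compare(sorted[i - 1], sorted[i]) == 0) {
        throw new IllegalArgumentException("Duplicate split key at sorted index " + i);
      }
    }
  }
}
```

Rejecting the request on the client keeps the bad input from ever reaching the
master, so the double bulk assignment described above cannot be triggered this
way.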





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261722#comment-13261722
 ] 

Lars Hofhansl commented on HBASE-5870:
--

Test run looked good, though. So I'm +1 on commit.







[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261724#comment-13261724
 ] 

Lars Hofhansl commented on HBASE-5848:
--

Also double checked 0.94. All's good now.






[jira] [Updated] (HBASE-5806) Handle split region related failures on master restart and RS restart

2012-04-25 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HBASE-5806:


Attachment: HBASE-5806.patch

> Handle split region related failures on master restart and RS restart
> -
>
> Key: HBASE-5806
> URL: https://issues.apache.org/jira/browse/HBASE-5806
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: ramkrishna.s.vasudevan
>Assignee: Chinna Rao Lalam
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
> Attachments: HBASE-5806.patch
>
>
> This issue is raised to handle problems that arise when a region split has 
> only partially happened and the region node in ZK, in RS_ZK_REGION_SPLITTING 
> or RS_ZK_REGION_SPLIT state, has not yet been processed.
> This also tries to address HBASE-5615.





[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

2012-04-25 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261730#comment-13261730
 ] 

Chinna Rao Lalam commented on HBASE-5806:
-

The patch considers the following scenarios:

1. The regionserver is restarted after removing the parent region from the 
online regions.
2. The Master is restarted while the RS is doing the region split and is in the 
flow of tickling from SPLIT-SPLIT or SPLITTING-SPLIT.
3. The Master is restarted after the split is completely done and before the 
CatalogJanitor deletes the region from META.

In the first scenario, the problem is that ServerShutdownHandler, while 
constructing hris, checks whether each region is in RIT and is !isClosing and 
!isPendingClose, and if so removes it from hris. It then tries to assign only 
the remaining hris, so no one attempts to assign that region.

{code}
  // Skip regions that were in transition unless CLOSING or PENDING_CLOSE
  for (RegionState rit : regionsInTransition) {
    if (!rit.isClosing() && !rit.isPendingClose() && !rit.isSplitting()) {
      LOG.debug("Removed " + rit.getRegion().getRegionNameAsString() +
          " from list of regions to assign because in RIT; region state: " +
          rit.getState());
      if (hris != null) hris.remove(rit.getRegion());
    }
  }
{code}


In the second scenario, the problem is that rebuildUserRegions() in 
AssignmentManager should not consider a region whose znode is in Split or 
Splitting state, because the region split might be completed or partially done.

In the third scenario, the region split is completely done but the region has 
not yet been deleted from META by the CatalogJanitor, so rebuildUserRegions() 
in AssignmentManager should not consider this region.
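
The filtering in the first scenario can be sketched in isolation. The code
below is an illustration only: RitFilter, State, Rit and regionsToAssign are
hypothetical names, Strings stand in for HRegionInfo, and it assumes (as the
snippet above appears to do with !rit.isSplitting()) that a region caught in
SPLITTING should stay in the list of regions to assign, like CLOSING and
PENDING_CLOSE.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of choosing which regions of a dead server to reassign. */
final class RitFilter {
  enum State { OPENING, PENDING_OPEN, CLOSING, PENDING_CLOSE, SPLITTING }

  /** A region in transition together with its current state. */
  static final class Rit {
    final String region;
    final State state;

    Rit(String region, State state) {
      this.region = region;
      this.state = state;
    }
  }

  /** Drops regions in transition unless they are CLOSING, PENDING_CLOSE or SPLITTING. */
  static Set<String> regionsToAssign(List<String> carried, List<Rit> inTransition) {
    Set<String> hris = new LinkedHashSet<String>(carried);
    for (Rit rit : inTransition) {
      boolean keep = rit.state == State.CLOSING
          || rit.state == State.PENDING_CLOSE
          || rit.state == State.SPLITTING;
      if (!keep) {
        hris.remove(rit.region);
      }
    }
    return hris;
  }
}
```

Without the SPLITTING case, a parent region whose split was interrupted by the
restart would be removed from the set and never handed to anyone for
assignment.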






[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

2012-04-25 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261732#comment-13261732
 ] 

Chinna Rao Lalam commented on HBASE-5806:
-

I uploaded an initial patch; it is not yet tested. I am still analyzing this.






[jira] [Commented] (HBASE-5732) Remove the SecureRPCEngine and merge the security-related logic in the core engine

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261738#comment-13261738
 ] 

Zhihong Yu commented on HBASE-5732:
---

Since the AccessController and TokenProvider coprocessors remain after this 
merge, my point was that we need to keep the security profile for running the 
unit tests related to these coprocessors.

> Remove the SecureRPCEngine and merge the security-related logic in the core 
> engine
> --
>
> Key: HBASE-5732
> URL: https://issues.apache.org/jira/browse/HBASE-5732
> Project: HBase
>  Issue Type: Improvement
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: rpcengine-merge.3.patch, rpcengine-merge.patch
>
>
> Remove the SecureRPCEngine and merge the security-related logic in the core 
> engine. Follow up to HBASE-5727.





[jira] [Assigned] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu reassigned HBASE-5870:
-

Assignee: Zhihong Yu

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261742#comment-13261742
 ] 

rajeshbabu commented on HBASE-5873:
---

This patch was uploaded for 0.94.

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5873.patch
>
>
> Currently the timeout monitor thread is started even before any region server 
> has registered with the master.
> The timeout monitor depends on region servers being online:
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().
> isEmpty();
> {code}
> When the master starts up it sees that there are no online servers and hence 
> sets allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By the next cycle of the timeout monitor (after 10 secs) an RS has come up, 
> so the monitor now sees allRSsOffline as false.
> Hence 
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
> // if some RSs just came back online, we can start the
> // the assignment right away
> actOnTimeOut(regionState);
> {code}
> this condition makes the monitor act as if the region had timed out.
> Because of this, even if an assignment of ROOT is already in progress, this 
> piece of code triggers another assignment and we get a 
> RegionAlreadyInTransitionException. After that we need to wait 30 mins for 
> ROOT itself to be assigned.
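
For illustration only, the sequence described above can be modelled with a 
small stand-alone class. This is a sketch under assumptions, not the 
AssignmentManager code: TimeoutMonitorSketch, registerServer, chore and the 
counter are hypothetical names, and a single region in transition that has 
not yet timed out is assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the timeout monitor from the report.
// Only allRegionServersOffline / allRSsOffline / actOnTimeOut come from the
// quoted snippets; everything else is illustrative.
class TimeoutMonitorSketch {
    private final List<String> onlineServers = new ArrayList<>();
    private boolean allRegionServersOffline = false;
    int forcedTimeoutActions = 0;

    void registerServer(String name) {
        onlineServers.add(name);
    }

    // One monitor cycle for a single region in transition that has NOT
    // reached its real timeout yet.
    void chore() {
        boolean allRSsOffline = onlineServers.isEmpty();
        if (this.allRegionServersOffline && !allRSsOffline) {
            // Branch from the report: servers appear to have "come back",
            // so the monitor acts as if the region had timed out.
            actOnTimeOut();
        }
        // Remember what this cycle saw, for the next cycle.
        this.allRegionServersOffline = allRSsOffline;
    }

    private void actOnTimeOut() {
        forcedTimeoutActions++;
    }
}
```

If the first cycle runs before any server has registered, the second cycle 
takes the forced branch. If the monitor only starts after a server has 
registered, the flag is never set and the branch is not taken.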





[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-25 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5611:


Attachment: HBASE-5611-trunk.patch

Patch for review. Thanks.

> Replayed edits from regions that failed to open during recovery aren't 
> removed from the global MemStore size
> 
>
> Key: HBASE-5611
> URL: https://issues.apache.org/jira/browse/HBASE-5611
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: Jean-Daniel Cryans
>Assignee: Jieshan Bean
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
> Attachments: HBASE-5611-trunk.patch
>
>
> This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
> it's still possible to hit it if a region fails to open for more obscure 
> reasons like HDFS errors.
> Consider a region that just went through distributed splitting and that's now 
> being opened by a new RS. The first thing it does is to read the recovery 
> files and put the edits in the {{MemStores}}. If this process takes a long 
> time, the master will move that region away. At that point the edits are 
> still accounted for in the global {{MemStore}} size but they are dropped when 
> the {{HRegion}} gets cleaned up. It's completely invisible until the 
> {{MemStoreFlusher}} needs to force flush a region and none of them has 
> edits:
> {noformat}
> 2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
> 2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
> java.lang.IllegalStateException
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The {{null}} here is a region. In my case I had so many edits in the 
> {{MemStore}} during recovery that I'm over the low barrier although in fact 
> I'm at 0. It happened yesterday and it is still printing this out.
> To fix this we need to be able to decrease the global {{MemStore}} size when 
> the region can't open.
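
A minimal sketch of the accounting fix proposed above, assuming a single 
global counter. GlobalMemStoreAccountingSketch and its methods are 
hypothetical names chosen for illustration; they are not the region server's 
actual accounting API and not necessarily what the attached patch does.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the global MemStore size counter.
class GlobalMemStoreAccountingSketch {
    private final AtomicLong globalMemStoreSize = new AtomicLong();

    long getGlobalMemStoreSize() {
        return globalMemStoreSize.get();
    }

    // Replays recovered edits into a region that is being opened and
    // returns how many bytes were added to the global counter for it.
    long replayRecoveredEdits(long[] editSizes) {
        long replayed = 0;
        for (long size : editSizes) {
            globalMemStoreSize.addAndGet(size);
            replayed += size;
        }
        return replayed;
    }

    // Called when the open is abandoned: give back what the replay added,
    // because the edits themselves are dropped together with the region.
    void rollBackFailedOpen(long replayedBytes) {
        globalMemStoreSize.addAndGet(-replayedBytes);
    }
}
```

Without the rollback step the counter stays at the replayed size even though 
no region holds those edits any more, which matches the symptom in the log 
above.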





[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-25 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261749#comment-13261749
 ] 

jirapos...@reviews.apache.org commented on HBASE-5625:
--



bq.  On 2012-04-02 17:34:38, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 531
bq.  > 
bq.  >
bq.  > How do we know this buffer is big enough?  Maybe should add an 
bq.  > override that takes an offset into the buffer?
bq.  
bq.  Tudor Scurtu wrote:
bq.  Added overload. Added exception comment for when there is insufficient 
bq.  space remaining in the buffer.
bq.  
bq.  Michael Stack wrote:
bq.  So, the way this works, we just allocate N and hope that stuff fits 
bq.  inside N?  If it doesn't we throw an exception?  There is no correlation 
bq.  between data that comes across and the N allocation?

Please see below.


bq.  On 2012-04-02 17:34:38, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/KeyValue.java, line 616
bq.  > 
bq.  >
bq.  > How do I know the buffer is big enough?
bq.  
bq.  Tudor Scurtu wrote:
bq.  Added exception comment for when there is insufficient space remaining 
bq.  in the buffer. Is that what you meant?
bq.  
bq.  Michael Stack wrote:
bq.  I am not understanding how the allocation works.   It seems arbitrary 
bq.  unrelated to the actual result size that comes over from the server.  Is that 
bq.  so?  If so, it seems unfriendly throwing an exception when allocated size and 
bq.  what is returned from the server do not match.

Added check with reallocation in 'Result.binarySearch()'. For this I had to add 
two methods in 'KeyValue' that calculate the number of bytes that are taken up 
in a 'KeyValue' object's underlying buffer ('getKeyValueDataStructureSize()' 
and 'getKeyDataStructureSize()'). Is this ok, and if so, how about replacing 
all manual calculations of these values in the project with calls to the new 
methods?
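
To make the "check with reallocation" idea concrete, here is a rough sketch. 
It is an assumption about the general technique only: ValueBufferSketch and 
copyValue are hypothetical names and do not come from the patch, which may 
size and grow the buffer differently.

```java
import java.nio.ByteBuffer;

// Illustrative only; not code from the patch under review.
class ValueBufferSketch {
    // Copies value[offset, offset + length) into dst. If dst does not have
    // enough space remaining, a larger buffer is allocated, the bytes
    // already written to dst are carried over, and the new buffer is
    // returned instead of throwing an exception.
    static ByteBuffer copyValue(byte[] value, int offset, int length, ByteBuffer dst) {
        if (dst.remaining() < length) {
            ByteBuffer bigger = ByteBuffer.allocate(dst.position() + length);
            ByteBuffer written = dst.duplicate(); // leave the caller's buffer state alone
            written.flip();                       // view of the bytes written so far
            bigger.put(written);
            dst = bigger;
        }
        dst.put(value, offset, length);
        return dst;
    }
}
```

The caller has to use the returned buffer, since it may differ from the one 
passed in.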


bq.  On 2012-04-02 17:34:38, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 297
bq.  > 
bq.  >
bq.  > This comment should be on the @return javadoc rather than here.
bq.  
bq.  Tudor Scurtu wrote:
bq.  This was copied from original. Should I remove both occurrences?
bq.  
bq.  Michael Stack wrote:
bq.  Sorry.  I did not notice it was problem on original.   If you can fix 
bq.  it, that'd be sweet.

Fixed.


bq.  On 2012-04-02 17:34:38, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/Result.java, line 431
bq.  > 
bq.  >
bq.  > Why we have both isNonEmptyColumn and isEmptyColumn?  Why not just 
bq.  > one and then check return with a !?
bq.  
bq.  Tudor Scurtu wrote:
bq.  They are not complementary.
bq.    containsColumn = value exists
bq.    containsEmptyColumn = value exists & is empty byte array
bq.    containsNonEmptyColumn = value exists & is not empty byte array
bq.  The value could be missing, in which case all methods would return 
bq.  false.
bq.  
bq.  Michael Stack wrote:
bq.  OK.  If not clear from comments, please add your notes above.  Will 
bq.  help those that come after.

Added.
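
The three-way distinction discussed above can be sketched as follows. This is 
a map-backed toy model with simplified signatures; ColumnPredicatesSketch and 
the String column key are assumptions for illustration and do not reflect the 
real Result method signatures.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: one value per column name. Not the Result class.
class ColumnPredicatesSketch {
    private final Map<String, byte[]> values = new HashMap<>();

    void put(String column, byte[] value) {
        values.put(column, value);
    }

    // value exists
    boolean containsColumn(String column) {
        return values.get(column) != null;
    }

    // value exists and is an empty byte array
    boolean containsEmptyColumn(String column) {
        byte[] v = values.get(column);
        return v != null && v.length == 0;
    }

    // value exists and is not an empty byte array
    boolean containsNonEmptyColumn(String column) {
        byte[] v = values.get(column);
        return v != null && v.length > 0;
    }
}
```

For a missing column all three return false, so containsEmptyColumn is not 
simply the negation of containsNonEmptyColumn.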


- Tudor


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4607/#review6623
---


On 2012-04-04 17:08:03, Tudor Scurtu wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4607/
bq.  ---
bq.  
bq.  (Updated 2012-04-04 17:08:03)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  When calling Result.getValue(), an extra dummy KeyValue and its associated 
bq.  underlying byte array are allocated, as well as a persistent buffer that will 
bq.  contain the returned value.
bq.  
bq.  These can be avoided by reusing a static array for the dummy object and by 
bq.  passing a ByteBuffer object as a value destination buffer to the read method.
bq.  
bq.  
bq.  This addresses bug HBASE-5625.
bq.  https://issues.apache.org/jira/browse/HBASE-5625
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/KeyValue.java 243d76f 
bq.    src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef 
bq.    src/test/java/org/apache/hadoop/hbase/TestKeyValue.java fae6902 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 
bq.  
bq.  Diff: https://reviews.apache.org/r/4607/diff
bq.  
bq.  
bq.

[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-25 Thread Tudor Scurtu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tudor Scurtu updated HBASE-5625:


Attachment: 5625v7.txt

Added check with reallocation in 'Result.binarySearch()'.

> Avoid byte buffer allocations when reading a value from a Result object
> ---
>
> Key: HBASE-5625
> URL: https://issues.apache.org/jira/browse/HBASE-5625
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.92.1
>Reporter: Tudor Scurtu
>Assignee: Tudor Scurtu
>  Labels: patch
> Fix For: 0.96.0
>
> Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 
> 5625v5.txt, 5625v6.txt, 5625v7.txt
>
>
> When calling Result.getValue(), an extra dummy KeyValue and its associated 
> underlying byte array are allocated, as well as a persistent buffer that will 
> contain the returned value.
> These can be avoided by reusing a static array for the dummy object and by 
> passing a ByteBuffer object as a value destination buffer to the read method.
> The current functionality is maintained, and we have added a separate method 
> call stack that employs the described changes. I will provide more details 
> with the patch.
> Running tests with a profiler, read time appears to be reduced by up to 
> 40%.





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261747#comment-13261747
 ] 

Zhihong Yu commented on HBASE-5870:
---

I ran the patch against 0.23 profile.
I got one test failure:
{code}
testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport)  Time elapsed: 2.583 sec  <<< ERROR!
java.io.FileNotFoundException: File does not exist: /Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar
{code}
But the jar was there:
{code}
-rw-r--r--  1 zhihyu  110088321  1768854 Apr 24 11:23 /Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar
{code}

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870.txt
>
>





[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-25 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261750#comment-13261750
 ] 

jirapos...@reviews.apache.org commented on HBASE-5625:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4607/
---

(Updated 2012-04-25 16:01:29.035293)


Review request for hbase.


Changes
---

Added check with reallocation in 'Result.binarySearch()'.


Summary
---

When calling Result.getValue(), an extra dummy KeyValue and its associated 
underlying byte array are allocated, as well as a persistent buffer that will 
contain the returned value.

These can be avoided by reusing a static array for the dummy object and by 
passing a ByteBuffer object as a value destination buffer to the read method.


This addresses bug HBASE-5625.
https://issues.apache.org/jira/browse/HBASE-5625


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/KeyValue.java 9ae9e02 
  src/main/java/org/apache/hadoop/hbase/client/Result.java df0b3ef 
  src/test/java/org/apache/hadoop/hbase/TestKeyValue.java 786d2df 
  src/test/java/org/apache/hadoop/hbase/client/TestResult.java f9e29c2 

Diff: https://reviews.apache.org/r/4607/diff


Testing
---

Added value check to TestResult#testBasic and TestResult#testMultiVersion.


Thanks,

Tudor



> Avoid byte buffer allocations when reading a value from a Result object
> ---
>
> Key: HBASE-5625
> URL: https://issues.apache.org/jira/browse/HBASE-5625
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.92.1
>Reporter: Tudor Scurtu
>Assignee: Tudor Scurtu
>  Labels: patch
> Fix For: 0.96.0
>
> Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 
> 5625v5.txt, 5625v6.txt, 5625v7.txt
>
>





[jira] [Updated] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5873:
-

Status: Open  (was: Patch Available)

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5873.patch
>
>





[jira] [Updated] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5873:
-

Attachment: 5873-trunk.txt

Trunk patch for HadoopQA

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>





[jira] [Updated] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5873:
-

Status: Patch Available  (was: Open)

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>





[jira] [Commented] (HBASE-5840) Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing the old status

2012-04-25 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261751#comment-13261751
 ] 

rajeshbabu commented on HBASE-5840:
---

@Stack 
Thanks for your review. 
bq. Do you have to convert the Exception to an IOE? Why is that? What does this 
bq. method let out? IOEs only? If so, why we catch Exception? In case it's a 
bq. non-checked exception?
I refactored the code so that the status is set to aborted without handling any 
exceptions from initialize.
bq. it looks good too but in the finally you might want to use the new 
bq. HRegion.closeHRegion(region) to clean up the wal log

{code}
region = HRegion.newHRegion(path, null, fs, conf, info, htd, null);
{code}
Since I am passing null for the WAL, closing the region does not attempt any 
WAL-related cleanup. But I have added it as per your suggestion, as it does 
no harm.
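
One common way to mark a status as aborted without catching anything is a 
success flag checked in a finally block. The sketch below shows that pattern 
only; it is an assumption about the shape of the refactoring, not the patch 
itself, and FailedOpenStatusSketch, Status and RegionInit are hypothetical 
stand-ins for the real task-status and region classes.

```java
import java.io.IOException;

// Hypothetical stand-ins; not the HBase classes.
class FailedOpenStatusSketch {
    static class Status {
        String state = "RUNNING";

        void abort(String msg) {
            state = "ABORTED: " + msg;
        }

        void markComplete(String msg) {
            state = "COMPLETE: " + msg;
        }
    }

    interface RegionInit {
        void run() throws IOException;
    }

    // Runs the initialization. Any exception, checked or unchecked,
    // propagates unchanged; the finally block only makes sure the status
    // never keeps showing "RUNNING" after a failed open.
    static void initialize(RegionInit init, Status status) throws IOException {
        boolean success = false;
        try {
            init.run();
            status.markComplete("Region opened");
            success = true;
        } finally {
            if (!success) {
                status.abort("Exception during region initialization");
            }
        }
    }
}
```

Because nothing is caught, there is no need to wrap the original exception in 
an IOException, which addresses the first review question.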

> Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing 
> the old status
> --
>
> Key: HBASE-5840
> URL: https://issues.apache.org/jira/browse/HBASE-5840
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-5840.patch, HBASE-5840_v2.patch
>
>
> The TaskMonitor status is not cleared when a region goes to FAILED_OPEN, so 
> it keeps showing the old status.
> This misleads the user.





[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261755#comment-13261755
 ] 

Lars Hofhansl commented on HBASE-5873:
--

Oops, rajesh, we crossed comments.

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>





[jira] [Commented] (HBASE-5853) java.lang.RuntimeException: readObject can't find class org.apache.hadoop.hdfs.protocol.HdfsFileStatus

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261758#comment-13261758
 ] 

Lars Hofhansl commented on HBASE-5853:
--

@jiafeng: Can you tell us what exactly you did when this happened?

> java.lang.RuntimeException: readObject can't find class 
> org.apache.hadoop.hdfs.protocol.HdfsFileStatus
> --
>
> Key: HBASE-5853
> URL: https://issues.apache.org/jira/browse/HBASE-5853
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.1
> Environment: hadoop-0.23.1 hbase-0.92.1 
>Reporter: jiafeng.zhang
> Fix For: 0.92.1, 0.94.0
>
>
> 2012-04-23 12:51:07,474 WARN org.apache.hadoop.ipc.Client: Unexpected error reading responses on connection Thread[IPC Client (1260987126) connection to server121/172.16.40.121:9000 from smp,5,main]
> java.lang.RuntimeException: readObject can't find class org.apache.hadoop.hdfs.protocol.HdfsFileStatus
>   at org.apache.hadoop.io.ObjectWritable.loadClass(ObjectWritable.java:372)
>   at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:223)
>   at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
>   at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:832)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:756)
> Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.protocol.HdfsFileStatus not found
>   at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1151)
>   at org.apache.hadoop.io.ObjectWritable.loadClass(ObjectWritable.java:368)
>   ... 4 more
> 2012-04-23 12:51:07,797 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server server124,60020,1335152900476: Replay of HLog required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: hbase_cdr,e0072b2b-5e19-431f-bb69-a6427765eac4,1334902272934.8365a7cbf90dd558f297d70224113c8a.
>   at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1278)
>   at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1162)
>   at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1104)
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:400)
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:202)
>   at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Error reading responses; Host Details : local host is: "server124/172.16.40.124"; destination host is: ""server121":9000; 
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:724)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1094)
>   at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:193)
>   at $Proxy10.getFileInfo(Unknown Source)
>   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:100)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:65)
>   at $Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1172)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:725)
>   at org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:449)
>   at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:473)
>   at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
>   at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:595)
>   at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:506)
>   at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:89)
>   at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1905)
>   at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1254)
>   ... 6 more
> Caused by: java.io.IOException: Error reading responses
>   at org.apache.hadoop.ipc.Client$Connection.run(

[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261759#comment-13261759
 ] 

Lars Hofhansl commented on HBASE-5875:
--

Can we move this to 0.94.1?

> Process RIT and Master restart may remove an online server considering it as 
> a dead server
> --
>
> Key: HBASE-5875
> URL: https://issues.apache.org/jira/browse/HBASE-5875
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.0
>
>
> If the master finds ROOT/META in RIT state on restart, it tries to assign 
> the ROOT region through ProcessRIT.
> The master triggers the assignment and then tries to verify the root region 
> location.
> The root region location is verified by checking whether the RS has the 
> region in its online list.
> If the assignment triggered by the master has not yet completed on the RS, 
> the root region location verification fails.
> Because it failed, 
> {code}
> splitLogAndExpireIfOnline(currentRootServer);
> {code}
> splits the log and also removes the server from the online server list. 
> Ideally there is nothing to do in the log split here, as no region server 
> was restarted.
> So although the server is online, the master invalidates the region server.
> In the special case where I have only one RS, my cluster becomes 
> non-operational.
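
The failure mode described above can be reproduced in a toy model. This is a 
sketch under assumptions: RootVerificationSketch, its fields and 
processRootInTransition are hypothetical names, and only 
splitLogAndExpireIfOnline is taken from the quoted snippet (its body here is 
reduced to the removal from the online list).

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the master-side sequence; not the HMaster code.
class RootVerificationSketch {
    final Set<String> onlineServers = new HashSet<>();
    // Regions the region server currently reports as open.
    final Set<String> regionsOpenOnServer = new HashSet<>();

    // Verification looks only at the server's online-region list.
    boolean verifyRootRegionLocation() {
        return regionsOpenOnServer.contains("ROOT");
    }

    // The assignment has been triggered, but the server may not have
    // finished opening ROOT when the verification runs.
    void processRootInTransition(String currentRootServer) {
        if (!verifyRootRegionLocation()) {
            splitLogAndExpireIfOnline(currentRootServer);
        }
    }

    void splitLogAndExpireIfOnline(String server) {
        // Log splitting omitted; the server is dropped from the online list.
        onlineServers.remove(server);
    }
}
```

With a single region server and an open that is still in flight, the online 
list ends up empty even though the server never went down.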

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread rajeshbabu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu reassigned HBASE-5873:
-

Assignee: rajeshbabu

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>
> Currently timeout monitor thread is started even before the region server has 
> registered with the master.
> In timeout monitor we depend on the region server to be online 
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().
> isEmpty();
> {code}
> Now when the master starts up it sees there are no online servers and hence 
> sets 
> allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time an RS has come up.
> When the timeout monitor runs again (after 10 secs), in the next cycle it sees 
> allRSsOffline as false.
> Hence 
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
> // if some RSs just came back online, we can start the
> // the assignment right away
> actOnTimeOut(regionState);
> {code}
> This condition makes it act on the timeout.
> Because of this, even if an assignment of the ROOT region is already in 
> progress, this piece of code triggers another assignment and thus we get a 
> RegionAlreadyInTransitionException. We then need to wait 30 mins for ROOT itself to be assigned.
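
The sequence described above can be reproduced with a small model. This is only a sketch: allRSsOffline, allRegionServersOffline and actOnTimeOut are taken from the issue text, while the class name, the chore(int) method and the call counter are assumptions made for illustration and are not the AssignmentManager code.

```java
// Hypothetical, simplified model of the TimeoutMonitor flag logic quoted above.
// Only allRSsOffline, allRegionServersOffline and actOnTimeOut come from the
// issue text; everything else is an assumption made for illustration.
final class TimeoutMonitorSketch {
  private boolean allRegionServersOffline = false;
  int actOnTimeOutCalls = 0;

  /** One monitor cycle; onlineServers is the number of registered region servers. */
  void chore(int onlineServers) {
    boolean allRSsOffline = onlineServers == 0;
    if (this.allRegionServersOffline && !allRSsOffline) {
      // The monitor believes some RSs "just came back online" and acts at once.
      actOnTimeOutCalls++;
    }
    // Remember what this cycle saw, as setAllRegionServersOffline() does.
    this.allRegionServersOffline = allRSsOffline;
  }

  public static void main(String[] args) {
    TimeoutMonitorSketch early = new TimeoutMonitorSketch();
    early.chore(0); // master just started, no RS registered yet
    early.chore(1); // an RS registered before the next cycle
    System.out.println("monitor started early: " + early.actOnTimeOutCalls);

    TimeoutMonitorSketch late = new TimeoutMonitorSketch();
    late.chore(1);  // monitor started only after one RS registered
    late.chore(1);
    System.out.println("monitor started late: " + late.actOnTimeOutCalls);
  }
}
```

In this model the spurious action only occurs when the first cycle runs with zero registered servers, which is why delaying the thread start until one RS has registered avoids it.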





[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261763#comment-13261763
 ] 

Zhihong Yu commented on HBASE-5873:
---

+1 if tests pass.

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>
> Currently timeout monitor thread is started even before the region server has 
> registered with the master.
> In timeout monitor we depend on the region server to be online 
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().
> isEmpty();
> {code}
> Now when the master starts up it sees there are no online servers and hence 
> sets 
> allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time an RS has come up.
> When the timeout monitor runs again (after 10 secs), in the next cycle it sees 
> allRSsOffline as false.
> Hence 
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
> // if some RSs just came back online, we can start the
> // the assignment right away
> actOnTimeOut(regionState);
> {code}
> This condition makes it act on the timeout.
> Because of this, even if an assignment of the ROOT region is already in 
> progress, this piece of code triggers another assignment and thus we get a 
> RegionAlreadyInTransitionException. We then need to wait 30 mins for ROOT itself to be assigned.





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261769#comment-13261769
 ] 

Jonathan Hsieh commented on HBASE-5870:
---

I ran into something similar but it was complaining about a ZK jar -- when 
testing HBASE-5861.  I think the error is from trying to find the jar in hdfs 
for the MR job.

Is this failure consistent?

My guess is that something in HBASE-5760 is causing this.

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261773#comment-13261773
 ] 

Zhihong Yu commented on HBASE-5870:
---

The failure is consistent:
{code}
testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport)  Time 
elapsed: 2.552 sec  <<< ERROR!
java.io.FileNotFoundException: File does not exist: 
/Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar
  at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:729)
  at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
  at 
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
  at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
  at 
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
  at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1221)
  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
  at org.apache.hadoop.mapreduce.Job.submit(Job.java:1218)
  at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1239)
  at 
org.apache.hadoop.hbase.mapreduce.TestImportExport.testSimpleCase(TestImportExport.java:114)
{code}

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261792#comment-13261792
 ] 

Hadoop QA commented on HBASE-5625:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524295/5625v7.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.util.TestHBaseFsck

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1646//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1646//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1646//console

This message is automatically generated.

> Avoid byte buffer allocations when reading a value from a Result object
> ---
>
> Key: HBASE-5625
> URL: https://issues.apache.org/jira/browse/HBASE-5625
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.92.1
>Reporter: Tudor Scurtu
>Assignee: Tudor Scurtu
>  Labels: patch
> Fix For: 0.96.0
>
> Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 
> 5625v5.txt, 5625v6.txt, 5625v7.txt
>
>
> When calling Result.getValue(), an extra dummy KeyValue and its associated 
> underlying byte array are allocated, as well as a persistent buffer that will 
> contain the returned value.
> These can be avoided by reusing a static array for the dummy object and by 
> passing a ByteBuffer object as a value destination buffer to the read method.
> The current functionality is maintained, and we have added a separate method 
> call stack that employs the described changes. I will provide more details 
> with the patch.
> Running tests with a profiler, the reduction of read time seems to be of up 
> to 40%.





[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261796#comment-13261796
 ] 

Hudson commented on HBASE-5848:
---

Integrated in HBase-TRUNK #2812 (See 
[https://builds.apache.org/job/HBase-TRUNK/2812/])
HBASE-5848 Addendum, try 2 (Revision 1330349)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java


> Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
> abort
> 
>
> Key: HBASE-5848
> URL: https://issues.apache.org/jira/browse/HBASE-5848
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
> 5848-addendum-v4.txt, 5848-addendum-v5.txt, 5848-addendum-v6.txt, 
> 5848-addendum-v7.txt, 5848-addendum-v7.txt, HBASE-5848.patch, 
> HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch
>
>
> A coworker of mine just had this scenario. It does not make sense to pass 
> EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
> implicit), but it should not cause the HMaster to abort.
> The abort happens because it tries to bulk assign the same region twice and 
> then runs into race conditions with ZK.
> The same would (presumably) happen when two identical split keys are passed, 
> but the client blocks that. The simplest solution here is to have the client 
> also reject null or EMPTY_START_ROW as a split key.
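
A client-side check along the lines proposed above might look like the following. It is a minimal sketch under stated assumptions: SplitKeyCheck and validateSplitKeys are hypothetical names, the local EMPTY_START_ROW constant stands in for the HBase constant (an empty byte array), and this is not the code from the attached patches.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper; class and method names are assumptions for illustration.
final class SplitKeyCheck {
  // Local stand-in for the HBase constant, which is an empty byte array.
  static final byte[] EMPTY_START_ROW = new byte[0];

  // Unsigned lexicographic order, so that equal keys end up adjacent after sorting.
  private static final Comparator<byte[]> UNSIGNED_LEX = new Comparator<byte[]>() {
    @Override
    public int compare(byte[] a, byte[] b) {
      int n = Math.min(a.length, b.length);
      for (int i = 0; i < n; i++) {
        int d = (a[i] & 0xff) - (b[i] & 0xff);
        if (d != 0) {
          return d;
        }
      }
      return a.length - b.length;
    }
  };

  /** Rejects null, empty and duplicate split keys before any request is sent. */
  static void validateSplitKeys(byte[][] splitKeys) {
    if (splitKeys == null) {
      return; // no pre-splitting requested
    }
    for (byte[] key : splitKeys) {
      if (key == null || Arrays.equals(key, EMPTY_START_ROW)) {
        throw new IllegalArgumentException("Null or empty split key is not allowed");
      }
    }
    byte[][] sorted = splitKeys.clone(); // leave the caller's array order untouched
    Arrays.sort(sorted, UNSIGNED_LEX);
    for (int i = 1; i < sorted.length; i++) {
      if (Arrays.equals(sorted[i - 1], sorted[i])) {
        throw new IllegalArgumentException("Duplicate split key");
      }
    }
  }
}
```

Failing fast in the client keeps the master from ever seeing a request that would make it bulk assign the same region twice.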





[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261799#comment-13261799
 ] 

Zhihong Yu commented on HBASE-5611:
---

{code}
+  // global memstore size once a region opening failed.
{code}
'region opening failed' -> 'region failed opening'.
{code}
+  private final ConcurrentMap replayEditsPerRegion = 
{code}
Do we need HRegionInfo as the key to the Map? Can we use the region name instead?
For rollbackRegionReplayEditsSize():
{code}
+  addAndGetGlobalMemstoreSize(-replayEdistsSize.get());
+  clearRegionReplayEditsSize(hri);
{code}
I suggest storing the value of -replayEdistsSize.get() in a variable so 
that we can swap the order of the two statements above and return directly 
from the if block.
If replayEdistsSize is null, would that indicate a race condition?
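
To make the suggestion concrete, here is a rough sketch of the reordered rollback. It assumes a map keyed by region name and a method that adds to both counters; apart from addAndGetGlobalMemstoreSize, rollbackRegionReplayEditsSize and replayEditsPerRegion, the names are hypothetical and the code is not taken from the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the accounting touched by the patch.
final class ReplayEditsAccounting {
  private final AtomicLong globalMemstoreSize = new AtomicLong();
  // Keyed by region name (an assumption) rather than by HRegionInfo.
  private final ConcurrentMap<String, AtomicLong> replayEditsPerRegion =
      new ConcurrentHashMap<String, AtomicLong>();

  long addAndGetGlobalMemstoreSize(long delta) {
    return globalMemstoreSize.addAndGet(delta);
  }

  /** Records replayed edits for a region and counts them in the global size. */
  void addRegionReplayEditsSize(String regionName, long size) {
    AtomicLong fresh = new AtomicLong();
    AtomicLong existing = replayEditsPerRegion.putIfAbsent(regionName, fresh);
    (existing != null ? existing : fresh).addAndGet(size);
    addAndGetGlobalMemstoreSize(size);
  }

  /** Gives back the replayed edits of a region that failed to open. */
  long rollbackRegionReplayEditsSize(String regionName) {
    // remove() clears the entry and hands back its counter in one step.
    AtomicLong replayEditsSize = replayEditsPerRegion.remove(regionName);
    if (replayEditsSize == null) {
      return globalMemstoreSize.get(); // nothing recorded, or already rolled back
    }
    long delta = -replayEditsSize.get(); // value captured before touching the global size
    return addAndGetGlobalMemstoreSize(delta);
  }
}
```

Because the entry is removed first, a second rollback for the same region finds nothing and leaves the global size alone, which is one way to make the null case harmless instead of a sign of a race.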

> Replayed edits from regions that failed to open during recovery aren't 
> removed from the global MemStore size
> 
>
> Key: HBASE-5611
> URL: https://issues.apache.org/jira/browse/HBASE-5611
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: Jean-Daniel Cryans
>Assignee: Jieshan Bean
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
> Attachments: HBASE-5611-trunk.patch
>
>
> This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
> it's still possible to hit it if a region fails to open for more obscure 
> reasons like HDFS errors.
> Consider a region that just went through distributed splitting and that's now 
> being opened by a new RS. The first thing it does is to read the recovery 
> files and put the edits in the {{MemStores}}. If this process takes a long 
> time, the master will move that region away. At that point the edits are 
> still accounted for in the global {{MemStore}} size but they are dropped when 
> the {{HRegion}} gets cleaned up. It's completely invisible until the 
> {{MemStoreFlusher}} needs to force flush a region and finds that none of them have 
> edits:
> {noformat}
> 2012-03-21 00:33:39,303 DEBUG 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
> because memory above low water=5.9g
> 2012-03-21 00:33:39,303 ERROR 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
> for entry null
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The {{null}} here is a region. In my case I had so many edits in the 
> {{MemStore}} during recovery that I'm over the low barrier although in fact 
> I'm at 0. It happened yesterday and it is still printing this out.
> To fix this we need to be able to decrease the global {{MemStore}} size when 
> the region can't open.





[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object

2012-04-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261801#comment-13261801
 ] 

Jimmy Xiang commented on HBASE-5625:


@Ted, the 13% performance gain is from 1 iterations.

The big uncertainty is the unknown size to pre-allocate. Is it possible to 
avoid copying the value?
For example, by returning an immutable wrapper around the original value?
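
One way to read the immutable wrapper idea, using only java.nio, is a read-only ByteBuffer view over the backing array. The sketch below is an assumption about what such a wrapper could look like; ValueView and readOnlyView are hypothetical names and are not part of the Result API or of the attached patches.

```java
import java.nio.ByteBuffer;

// Hypothetical helper showing a zero-copy, read-only view of a value.
final class ValueView {
  /**
   * Returns a read-only buffer over backing[valueOffset, valueOffset + valueLength).
   * No bytes are copied; the view shares storage with the backing array.
   */
  static ByteBuffer readOnlyView(byte[] backing, int valueOffset, int valueLength) {
    return ByteBuffer.wrap(backing, valueOffset, valueLength)
        .slice()             // re-base so that index 0 is the first value byte
        .asReadOnlyBuffer(); // writes through the view throw ReadOnlyBufferException
  }
}
```

The caller cannot modify the value through the view, and no destination size has to be guessed up front. The trade-off is that the view stays valid only as long as the backing array is not reused or overwritten.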

> Avoid byte buffer allocations when reading a value from a Result object
> ---
>
> Key: HBASE-5625
> URL: https://issues.apache.org/jira/browse/HBASE-5625
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Affects Versions: 0.92.1
>Reporter: Tudor Scurtu
>Assignee: Tudor Scurtu
>  Labels: patch
> Fix For: 0.96.0
>
> Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 
> 5625v5.txt, 5625v6.txt, 5625v7.txt
>
>
> When calling Result.getValue(), an extra dummy KeyValue and its associated 
> underlying byte array are allocated, as well as a persistent buffer that will 
> contain the returned value.
> These can be avoided by reusing a static array for the dummy object and by 
> passing a ByteBuffer object as a value destination buffer to the read method.
> The current functionality is maintained, and we have added a separate method 
> call stack that employs the described changes. I will provide more details 
> with the patch.
> Running tests with a profiler, the reduction of read time seems to be of up 
> to 40%.





[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261802#comment-13261802
 ] 

Hadoop QA commented on HBASE-5873:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524297/5873-trunk.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1647//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1647//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1647//console

This message is automatically generated.

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>
> Currently timeout monitor thread is started even before the region server has 
> registered with the master.
> In timeout monitor we depend on the region server to be online 
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().
> isEmpty();
> {code}
> Now when the master starts up it sees there are no online servers and hence 
> sets 
> allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time an RS has come up.
> When the timeout monitor runs again (after 10 secs), in the next cycle it sees 
> allRSsOffline as false.
> Hence 
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
> // if some RSs just came back online, we can start the
> // the assignment right away
> actOnTimeOut(regionState);
> {code}
> This condition makes it act on the timeout.
> Because of this, even if an assignment of the ROOT region is already in 
> progress, this piece of code triggers another assignment and thus we get a 
> RegionAlreadyInTransitionException. We then need to wait 30 mins for ROOT itself to be assigned.





[jira] [Updated] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5875:
--

Fix Version/s: (was: 0.94.0)
   0.94.1

Updated to 0.94.1.  

> Process RIT and Master restart may remove an online server considering it as 
> a dead server
> --
>
> Key: HBASE-5875
> URL: https://issues.apache.org/jira/browse/HBASE-5875
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.1
>
>
> If on master restart it finds the ROOT/META to be in RIT state, master tries 
> to assign the ROOT region through ProcessRIT.
> Master will trigger the assignment and next will try to verify the Root 
> Region Location.
> Root region location verification is done seeing if the RS has the region in 
> its online list.
> If the master triggered assignment has not yet been completed in RS then the 
> verify root region location will fail.
> Because it failed 
> {code}
> splitLogAndExpireIfOnline(currentRootServer);
> {code}
> we do split log and also remove the server from online server list. Ideally 
> here there is nothing to do in splitlog as no region server was restarted.
> So even though the server is online, the master invalidates the region 
> server.
> In the special case where I have only one RS, my cluster becomes 
> inoperative.





[jira] [Issue Comment Edited] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261814#comment-13261814
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-5875 at 4/25/12 5:13 PM:


Updated to 0.94.1.  
{Edit} I will come up with a patch in another couple of days. {Edit}

  was (Author: ram_krish):
Updated to 0.94.1.  
  
> Process RIT and Master restart may remove an online server considering it as 
> a dead server
> --
>
> Key: HBASE-5875
> URL: https://issues.apache.org/jira/browse/HBASE-5875
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.1
>
>
> If on master restart it finds the ROOT/META to be in RIT state, master tries 
> to assign the ROOT region through ProcessRIT.
> Master will trigger the assignment and next will try to verify the Root 
> Region Location.
> Root region location verification is done seeing if the RS has the region in 
> its online list.
> If the master triggered assignment has not yet been completed in RS then the 
> verify root region location will fail.
> Because it failed 
> {code}
> splitLogAndExpireIfOnline(currentRootServer);
> {code}
> we do split log and also remove the server from online server list. Ideally 
> here there is nothing to do in splitlog as no region server was restarted.
> So even though the server is online, the master invalidates the region 
> server.
> In the special case where I have only one RS, my cluster becomes 
> inoperative.





[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261816#comment-13261816
 ] 

ramkrishna.s.vasudevan commented on HBASE-5611:
---

@Lars
Want this in 0.94.0? 

> Replayed edits from regions that failed to open during recovery aren't 
> removed from the global MemStore size
> 
>
> Key: HBASE-5611
> URL: https://issues.apache.org/jira/browse/HBASE-5611
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: Jean-Daniel Cryans
>Assignee: Jieshan Bean
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
> Attachments: HBASE-5611-trunk.patch
>
>
> This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
> it's still possible to hit it if a region fails to open for more obscure 
> reasons like HDFS errors.
> Consider a region that just went through distributed splitting and that's now 
> being opened by a new RS. The first thing it does is to read the recovery 
> files and put the edits in the {{MemStores}}. If this process takes a long 
> time, the master will move that region away. At that point the edits are 
> still accounted for in the global {{MemStore}} size but they are dropped when 
> the {{HRegion}} gets cleaned up. It's completely invisible until the 
> {{MemStoreFlusher}} needs to force flush a region and finds that none of them have 
> edits:
> {noformat}
> 2012-03-21 00:33:39,303 DEBUG 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
> because memory above low water=5.9g
> 2012-03-21 00:33:39,303 ERROR 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
> for entry null
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The {{null}} here is a region. In my case I had so many edits in the 
> {{MemStore}} during recovery that I'm over the low barrier although in fact 
> I'm at 0. It happened yesterday and it is still printing this out.
> To fix this we need to be able to decrease the global {{MemStore}} size when 
> the region can't open.





[jira] [Updated] (HBASE-4554) Allow set/unset coprocessor table attributes from shell.

2012-04-25 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-4554:
---

Attachment: HBase-4554-trunk-0.92-3.patch

Attaching the patch that was committed to the ticket for persistence.

(Disclaimer: I'm not the author of the patch. I merely grabbed it from 
https://reviews.apache.org/r/2350/diff/raw/ so that it exists as an object in 
the issue tracker. I'm hence not checking the grant button.)

> Allow set/unset coprocessor table attributes from shell.
> 
>
> Key: HBASE-4554
> URL: https://issues.apache.org/jira/browse/HBASE-4554
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Reporter: Mingjie Lai
>Assignee: Mingjie Lai
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBase-4554-trunk-0.92-3.patch
>
>
> A table/region-level coprocessor -- RegionObserver -- can be configured by 
> setting an HTD attribute whose name matches Coprocessor$*. 
> The current shell command -- alter -- does not support setting/unsetting an 
> arbitrary table attribute. We need this in order to configure region-level 
> coprocessors on a table. 
> Proposed new shell:
> {code}
> hbase shell > alter 't1', METHOD => 'table_att', COPROCESSOR$1 => 
> 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|'
> hbase shell > describe 't1'
>  {NAME => 't1', COPROCESSOR$1 => 
> 'hdfs://cp/foo.jar|org.apache.hadoop.hbase.sample|1|', MAX_FILESIZE => 
> '134217728', ...}
> hbase shell > alter 't1', METHOD => 'table_att_unset', COPROCESSOR$1
> hbase shell > describe 't1'
>  {NAME => 't1', MAX_FILESIZE => '134217728', ...}
> {code}





[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5870:
--

Attachment: 5870-v2.txt

Patch v2 fills in obtainJobConf() for MapreduceV2Shim.

getJobTrackerConf() creates a new JobConf, so setting a config param on the 
returned JobConf has no effect.

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870-v2.txt, 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-25 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261831#comment-13261831
 ] 

dhruba borthakur commented on HBASE-5864:
-

Thanks for finding this, ramkrishna. I can see how the bug occurs; very good 
analysis. I'm trying to digest the fix you are providing.

> Error while reading from hfile in 0.94
> --
>
> Key: HBASE-5864
> URL: https://issues.apache.org/jira/browse/HBASE-5864
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
> HBASE-5864_3.patch, HBASE-5864_test.patch
>
>
> Got the following stacktrace during region split.
> {noformat}
> 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
> Failed getting store size for value
> java.io.IOException: Requested block is out of range: 2906737606134037404, 
> lastDataBlockOffset: 84764558
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
>   at 
> org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
> {noformat}





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261845#comment-13261845
 ] 

Zhihong Yu commented on HBASE-5870:
---

Even in build #136 TestImportExport failed, due to a different exception:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23/136/testReport/org.apache.hadoop.hbase.mapreduce/TestImportExport/org_apache_hadoop_hbase_mapreduce_TestImportExport/

I suggest checking in patch v2 and investigating TestImportExport in another 
JIRA.

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870-v2.txt, 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5732) Remove the SecureRPCEngine and merge the security-related logic in the core engine

2012-04-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261853#comment-13261853
 ] 

Devaraj Das commented on HBASE-5732:


In the last patch, I missed adding the AuthenticationTokenSecretManager 
instantiation in the default RPC engine (and a security unit test failed). I've 
taken that into account now.

Andrew, how about instantiating the AuthenticationTokenSecretManager (which has 
a dependency on ZK) only if isSecurityEnabled() returns true? The problem with 
this is that the unit tests also won't instantiate the manager. For unit 
tests, maybe we can have a minimal RpcEngine implementation that returns a 
Server object that internally instantiates the AuthenticationTokenSecretManager 
unconditionally. Would that work? 

> Remove the SecureRPCEngine and merge the security-related logic in the core 
> engine
> --
>
> Key: HBASE-5732
> URL: https://issues.apache.org/jira/browse/HBASE-5732
> Project: HBase
>  Issue Type: Improvement
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Attachments: rpcengine-merge.3.patch, rpcengine-merge.patch
>
>
> Remove the SecureRPCEngine and merge the security-related logic in the core 
> engine. Follow up to HBASE-5727.





[jira] [Commented] (HBASE-4821) A fully automated comprehensive distributed integration test for HBase

2012-04-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261855#comment-13261855
 ] 

Enis Soztutar commented on HBASE-4821:
--

DD and I also want to commit some resources to developing/maintaining/running 
such tests. We are also willing to allocate some cluster resources to 
running the tests for extended periods of time. 

@Mikhail, do you have anything planned yet? To go further with this, I think a 
short test design doc would be a great start, wdyt? 

@Keith, @Stack, do you think we should port goraci inside hbase or bigtop? 

@Roman, I love the idea that bigtop provides services for deployment, and 
running e2e (end to end) tests. But in my experience, maintaining the actual 
tests (code, logic, etc) will be a lot easier if the code resides inside hbase. 
Does bigtop provide that kind of use case?

> A fully automated comprehensive distributed integration test for HBase
> --
>
> Key: HBASE-4821
> URL: https://issues.apache.org/jira/browse/HBASE-4821
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Critical
>
> To properly verify that a particular version of HBase is good for production 
> deployment we need a better way to do real cluster testing after incremental 
> changes. Running unit tests is good, but we also need to deploy HBase to a 
> cluster, run integration tests, load tests, Thrift server tests, kill some 
> region servers, kill the master, and produce a report. All of this needs to 
> happen in 20-30 minutes with minimal manual intervention. I think this way we 
> can combine agile development with high stability of the codebase. I am 
> envisioning a high-level framework written in a scripting language (e.g. 
> Python) that would abstract external operations such as "deploy to test 
> cluster", "kill a particular server", "run load test A", "run load test B" 
> (we already have a few kinds of load tests implemented in Java, and we could 
> write a Thrift load test in Python). This tool should also produce 
> intermediate output, allowing to catch problems early and restart the test.
> No implementation has yet been done. Any ideas or suggestions are welcome.





[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5611:
-

Fix Version/s: (was: 0.94.1)
   0.94.0

Yep... Looks bad.

> Replayed edits from regions that failed to open during recovery aren't 
> removed from the global MemStore size
> 
>
> Key: HBASE-5611
> URL: https://issues.apache.org/jira/browse/HBASE-5611
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: Jean-Daniel Cryans
>Assignee: Jieshan Bean
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5611-trunk.patch
>
>
> This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
> it's still possible to hit it if a region fails to open for more obscure 
> reasons like HDFS errors.
> Consider a region that just went through distributed splitting and that's now 
> being opened by a new RS. The first thing it does is to read the recovery 
> files and put the edits in the {{MemStores}}. If this process takes a long 
> time, the master will move that region away. At that point the edits are 
> still accounted for in the global {{MemStore}} size but they are dropped when 
> the {{HRegion}} gets cleaned up. It's completely invisible until the 
> {{MemStoreFlusher}} needs to force flush a region and none of them have 
> edits:
> {noformat}
> 2012-03-21 00:33:39,303 DEBUG 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
> because memory above low water=5.9g
> 2012-03-21 00:33:39,303 ERROR 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
> for entry null
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The {{null}} here is a region. In my case I had so many edits in the 
> {{MemStore}} during recovery that I'm over the low barrier although in fact 
> I'm at 0. It happened yesterday and it is still printing this out.
> To fix this we need to be able to decrease the global {{MemStore}} size when 
> the region can't open.
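
A minimal sketch of the accounting fix described above, assuming a single
global counter that every replayed edit is added to. The class and method
names are hypothetical and this is not the attached patch; it only shows the
idea of subtracting a region's replayed edits again when the open fails.

```java
import java.util.concurrent.atomic.AtomicLong;

final class GlobalMemStoreAccounting {
    // Stand-in for the region server's global MemStore size counter.
    static final AtomicLong globalMemStoreSize = new AtomicLong();

    // Replays edits into the global counter; if the open fails, the amount
    // this region added is subtracted again before the exception propagates.
    static void openRegion(long[] editSizes, boolean failOpen) {
        long added = 0;
        try {
            for (long size : editSizes) {
                globalMemStoreSize.addAndGet(size);
                added += size;
            }
            if (failOpen) {
                throw new IllegalStateException("region open failed");
            }
            added = 0; // open succeeded: the region keeps its share
        } finally {
            globalMemStoreSize.addAndGet(-added);
        }
    }

    public static void main(String[] args) {
        try {
            openRegion(new long[] {100, 200}, true);
        } catch (IllegalStateException e) {
            System.out.println("open failed, global size = " + globalMemStoreSize.get());
        }
    }
}
```

Without the rollback in the finally block, the counter would stay at the
replayed total even though no region holds those edits, which is the state
the flusher log above reflects.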





[jira] [Commented] (HBASE-4821) A fully automated comprehensive distributed integration test for HBase

2012-04-25 Thread Roman Shaposhnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261885#comment-13261885
 ] 

Roman Shaposhnik commented on HBASE-4821:
-

@Enis,

I had a really nice chat with Jon yesterday and we arrived at a common 
understanding that the tests in general fall into 3 distinct categories (please 
note that these categories classify the test *implementation* and not whether 
the tests are used as part of mvn test, mvn verify or Bigtop's test infra -- 
more on that later): 
  # pure unit tests -- they reach into the guts of the implementation and use 
non-public APIs. There's absolutely no way to run that testcode on anything but 
MiniHBase/MiniDFS/MiniMR. Bigtop has no role to play in helping HBase community 
with developing/maintaining/executing those tests.
  # HBase-specific functional tests -- these are the tests that only use public 
APIs and don't muck about with internals. They are, however, only concerned 
with HBase itself. IOW, a test that wants to verify that you can submit an 
Oozie workflow that has Hive->HBASE->Pig pipeline does not fall into this 
category
  # Integration tests -- these are the multi-component tests that exercise not 
just HBase but a number of different components. The above example of the Oozie 
workflow falls into this category.

Here's how an ideal situation looks from Bigtop's perspective: 
  * you guys totally take care of #1 and you implement it as usual unit tests. 
  * Bigtop (with your help) takes care of #3. It simply makes no sense to 
reproduce the same infra at the HBase level.
  * A proposal on #2 is this -- these tests belong to HBase. However, they have 
to be clearly marked as belonging to the functional class AND they have to 
utilize a very thin shim layer so you can use them in an mvn verify context and 
we can reuse them in Bigtop running against fully distributed beefy clusters. 
At this point I'm convinced that TestLoadAndVerify should be the first example 
of this class and it should reside in HBase codebase (yet be available to 
Bigtop).

Let me know if this makes sense.

> A fully automated comprehensive distributed integration test for HBase
> --
>
> Key: HBASE-4821
> URL: https://issues.apache.org/jira/browse/HBASE-4821
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Critical
>
> To properly verify that a particular version of HBase is good for production 
> deployment we need a better way to do real cluster testing after incremental 
> changes. Running unit tests is good, but we also need to deploy HBase to a 
> cluster, run integration tests, load tests, Thrift server tests, kill some 
> region servers, kill the master, and produce a report. All of this needs to 
> happen in 20-30 minutes with minimal manual intervention. I think this way we 
> can combine agile development with high stability of the codebase. I am 
> envisioning a high-level framework written in a scripting language (e.g. 
> Python) that would abstract external operations such as "deploy to test 
> cluster", "kill a particular server", "run load test A", "run load test B" 
> (we already have a few kinds of load tests implemented in Java, and we could 
> write a Thrift load test in Python). This tool should also produce 
> intermediate output, allowing to catch problems early and restart the test.
> No implementation has yet been done. Any ideas or suggestions are welcome.





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261887#comment-13261887
 ] 

Hadoop QA commented on HBASE-5870:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524309/5870-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestRegionRebalancing
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1648//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1648//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1648//console

This message is automatically generated.

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870-v2.txt, 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5871) Usability regression, we don't parse compression algos anymore

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261895#comment-13261895
 ] 

Hudson commented on HBASE-5871:
---

Integrated in HBase-0.94-security #21 (See 
[https://builds.apache.org/job/HBase-0.94-security/21/])
HBASE-5871 Usability regression, we don't parse compression algos anymore 
(Revision 1330124)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/ruby/hbase/admin.rb


> Usability regression, we don't parse compression algos anymore
> --
>
> Key: HBASE-5871
> URL: https://issues.apache.org/jira/browse/HBASE-5871
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5871-0.92.txt, 5871-0.94.txt, 5871-trunk.txt
>
>
> It seems that starting with 0.92.0 we can't create tables in the shell by 
> specifying "lzo" anymore. I remember we used to do better parsing than that, 
> but right now if you follow the wiki doing this:
> bq. create 'mytable', {NAME=>'colfam:', COMPRESSION=>'lzo'}
> You'll get:
> bq. ERROR: java.lang.IllegalArgumentException: No enum const class 
> org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.lzo
> Bad for usability.
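
The error above is what Java's Enum.valueOf produces for a name whose case
does not match a constant. A minimal sketch of a tolerant lookup follows; the
Algorithm enum here is a hypothetical stand-in for
org.apache.hadoop.hbase.io.hfile.Compression.Algorithm, and parse() is an
illustration rather than the committed change (which touched admin.rb).

```java
import java.util.Locale;

final class CompressionParsing {
    // Hypothetical stand-in for the real Compression.Algorithm enum.
    enum Algorithm { LZO, GZ, NONE, SNAPPY }

    // Enum.valueOf is case-sensitive, so normalize the user's input first.
    static Algorithm parse(String name) {
        return Algorithm.valueOf(name.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        try {
            Algorithm.valueOf("lzo"); // strict lookup with the lower-case name
        } catch (IllegalArgumentException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }
        System.out.println("normalized lookup: " + parse("lzo"));
    }
}
```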





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261896#comment-13261896
 ] 

Hudson commented on HBASE-5849:
---

Integrated in HBase-0.94-security #21 (See 
[https://builds.apache.org/job/HBase-0.94-security/21/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not 
available; REAPPLY (Revision 1330117)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java


> On first cluster startup, RS aborts if root znode is not available
> --
>
> Key: HBASE-5849
> URL: https://issues.apache.org/jira/browse/HBASE-5849
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver, zookeeper
>Affects Versions: 0.92.2, 0.96.0, 0.94.1
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.92.2, 0.94.0
>
> Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, 
> HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, 
> HBASE-5849_v4.patch
>
>
> When launching a brand-new cluster, the master has to be started first, which 
> can cause a race condition when the master and RS are started at the same 
> time. 
> Master startup code is something like this: 
>  - establish zk connection
>  - create root znodes in zk (/hbase)
>  - create ephemeral node for master (/hbase/master)
> Region server startup code is something like this: 
>  - establish zk connection
>  - check whether the root znode (/hbase) is there. If not, shut down. 
>  - wait for the master to create the znode /hbase/master
> So the problem is that on the very first launch of the cluster, the RS aborts 
> during startup because the /hbase znode might not have been created yet (only 
> the master creates it if needed). Since /hbase is not deleted on cluster 
> shutdown, on subsequent cluster starts it does not matter in which order the 
> servers are started. So this affects only first launches. 
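
A minimal sketch of one way around the race described above: the region
server polls for the root znode with a timeout instead of shutting down on
the first miss. The in-memory set is a stand-in for ZooKeeper, and
waitForZnode() is a hypothetical helper, not the real client API and not
necessarily what the attached patches do.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class RootZnodeWait {
    // In-memory stand-in for the znodes that exist in ZooKeeper.
    static final Set<String> znodes = ConcurrentHashMap.newKeySet();

    // Polls until the path exists or the timeout elapses.
    // Returns false on timeout or if the waiting thread is interrupted.
    static boolean waitForZnode(String path, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!znodes.contains(path)) {
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated master that creates its znodes a little after the RS starts.
        Thread master = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                return;
            }
            znodes.add("/hbase");
            znodes.add("/hbase/master");
        });
        master.start();
        boolean found = waitForZnode("/hbase", 5000, 50);
        System.out.println(found ? "root znode found, continuing startup" : "timed out");
        master.join();
    }
}
```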





[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort

2012-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261897#comment-13261897
 ] 

Hudson commented on HBASE-5848:
---

Integrated in HBase-0.94-security #21 (See 
[https://builds.apache.org/job/HBase-0.94-security/21/])
HBASE-5848 Addendum (Revision 1330106)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java


> Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
> abort
> 
>
> Key: HBASE-5848
> URL: https://issues.apache.org/jira/browse/HBASE-5848
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 
> 5848-addendum-v4.txt, 5848-addendum-v5.txt, 5848-addendum-v6.txt, 
> 5848-addendum-v7.txt, 5848-addendum-v7.txt, HBASE-5848.patch, 
> HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch
>
>
> A coworker of mine just had this scenario. It does not make sense to pass 
> EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
> implicit), but it should not cause the HMaster to abort.
> The abort happens because it tries to bulk assign the same region twice and 
> then runs into race conditions with ZK.
> The same would (presumably) happen when two identical split keys are passed, 
> but the client blocks that. The simplest solution here is for the client to 
> also block null or EMPTY_START_ROW passed as a split key.
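
A minimal sketch of the client-side check proposed above, assuming split keys
arrive as a byte[][]. SplitKeyValidation and validate() are hypothetical
names; this is not the committed patch, only an illustration of rejecting
null, empty and duplicate split keys before the request reaches the master.

```java
import java.util.Arrays;

final class SplitKeyValidation {
    // Lexicographic comparison of byte arrays, treating bytes as unsigned.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Throws IllegalArgumentException for null, empty or duplicate split keys.
    static void validate(byte[][] splitKeys) {
        for (byte[] key : splitKeys) {
            if (key == null || key.length == 0) {
                throw new IllegalArgumentException("split key must not be null or empty");
            }
        }
        byte[][] sorted = splitKeys.clone(); // sort a copy, leave the caller's order alone
        Arrays.sort(sorted, SplitKeyValidation::compare);
        for (int i = 1; i < sorted.length; i++) {
            if (Arrays.equals(sorted[i - 1], sorted[i])) {
                throw new IllegalArgumentException("duplicate split key");
            }
        }
    }

    public static void main(String[] args) {
        validate(new byte[][] {{1}, {2}});
        try {
            validate(new byte[][] {{1}, new byte[0]}); // empty key, like EMPTY_START_ROW
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```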





[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261908#comment-13261908
 ] 

Lars Hofhansl commented on HBASE-5873:
--

Looks good, tests pass. +1 from me.

> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>
> Currently timeout monitor thread is started even before the region server has 
> registered with the master.
> In timeout monitor we depend on the region server to be online 
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().
> isEmpty();
> {code}
> Now when the master starts up it sees there are no online servers and hence 
> sets 
> allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time an RS has come up.
> When the timeout monitor runs again in the next cycle (after 10 secs), it sees 
> allRSsOffline as false.
> Hence 
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
> // if some RSs just came back online, we can start the
> // the assignment right away
> actOnTimeOut(regionState);
> {code}
> This condition makes it take action as if a timeout had occurred.
> Because of this, even if a region assignment of ROOT is already in progress, 
> this piece of code triggers another assignment and thus we get a 
> RegionAlreadyInTransitionException. After that we need to wait 30 mins for 
> ROOT itself to be assigned.
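
The sequence above can be reproduced with a small simulation. This is a
simplified sketch: TimeoutMonitorSketch is a hypothetical class that keeps
only the allRegionServersOffline flag and the quoted condition, with a
counter standing in for actOnTimeOut(regionState).

```java
final class TimeoutMonitorSketch {
    private boolean allRegionServersOffline = false;
    int timeoutActions = 0;

    // One monitor cycle; the condition mirrors the snippet quoted in the issue.
    void chore(int onlineServers) {
        boolean allRSsOffline = onlineServers == 0;
        if (this.allRegionServersOffline && !allRSsOffline) {
            timeoutActions++; // stands in for actOnTimeOut(regionState)
        }
        this.allRegionServersOffline = allRSsOffline;
    }

    public static void main(String[] args) {
        // Monitor started before any RS registered: the second cycle acts spuriously.
        TimeoutMonitorSketch early = new TimeoutMonitorSketch();
        early.chore(0);
        early.chore(1);
        System.out.println("started early, actions = " + early.timeoutActions);

        // Monitor started only after one RS registered: no spurious action.
        TimeoutMonitorSketch late = new TimeoutMonitorSketch();
        late.chore(1);
        late.chore(1);
        System.out.println("started late, actions = " + late.timeoutActions);
    }
}
```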





[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5611:
-

Fix Version/s: (was: 0.94.0)
   0.94.1

Actually... It seems we had this problem since 0.90.
I'll pull it in if it gets done in time, otherwise it'll be in the next point 
release.

> Replayed edits from regions that failed to open during recovery aren't 
> removed from the global MemStore size
> 
>
> Key: HBASE-5611
> URL: https://issues.apache.org/jira/browse/HBASE-5611
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: Jean-Daniel Cryans
>Assignee: Jieshan Bean
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
> Attachments: HBASE-5611-trunk.patch
>
>
> This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
> it's still possible to hit it if a region fails to open for more obscure 
> reasons like HDFS errors.
> Consider a region that just went through distributed splitting and that's now 
> being opened by a new RS. The first thing it does is to read the recovery 
> files and put the edits in the {{MemStores}}. If this process takes a long 
> time, the master will move that region away. At that point the edits are 
> still accounted for in the global {{MemStore}} size but they are dropped when 
> the {{HRegion}} gets cleaned up. It's completely invisible until the 
> {{MemStoreFlusher}} needs to force flush a region and none of them have 
> edits:
> {noformat}
> 2012-03-21 00:33:39,303 DEBUG 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
> because memory above low water=5.9g
> 2012-03-21 00:33:39,303 ERROR 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
> for entry null
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The {{null}} here is a region. In my case I had so many edits in the 
> {{MemStore}} during recovery that I'm over the low barrier although in fact 
> I'm at 0. It happened yesterday and it is still printing this out.
> To fix this we need to be able to decrease the global {{MemStore}} size when 
> the region can't open.





[jira] [Updated] (HBASE-5853) java.lang.RuntimeException: readObject can't find class org.apache.hadoop.hdfs.protocol.HdfsFileStatus

2012-04-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5853:
-

Fix Version/s: (was: 0.94.0)
   0.94.1

Moving this out until somebody confirms that it is a serious problem.

> java.lang.RuntimeException: readObject can't find class 
> org.apache.hadoop.hdfs.protocol.HdfsFileStatus
> --
>
> Key: HBASE-5853
> URL: https://issues.apache.org/jira/browse/HBASE-5853
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.1
> Environment: hadoop-0.23.1 hbase-0.92.1 
>Reporter: jiafeng.zhang
> Fix For: 0.92.1, 0.94.1
>
>
> 2012-04-23 12:51:07,474 WARN org.apache.hadoop.ipc.Client: Unexpected error 
> reading responses on connection Thread[IPC Client (1260987126) connection to 
> server121/172.16.40.121:9000 from smp,5,main]
> java.lang.RuntimeException: readObject can't find class 
> org.apache.hadoop.hdfs.protocol.HdfsFileStatus
>   at 
> org.apache.hadoop.io.ObjectWritable.loadClass(ObjectWritable.java:372)
>   at 
> org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:223)
>   at 
> org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:832)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:756)
> Caused by: java.lang.ClassNotFoundException: Class 
> org.apache.hadoop.hdfs.protocol.HdfsFileStatus not found
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1151)
>   at 
> org.apache.hadoop.io.ObjectWritable.loadClass(ObjectWritable.java:368)
>   ... 4 more
> 2012-04-23 12:51:07,797 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> server124,60020,1335152900476: Replay of HLog required. Forcing server 
> shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: 
> hbase_cdr,e0072b2b-5e19-431f-bb69-a6427765eac4,1334902272934.8365a7cbf90dd558f297d70224113c8a.
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1278)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1162)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1104)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:400)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:202)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Failed on local exception: 
> java.io.IOException: Error reading responses; Host Details : local host is: 
> "server124/172.16.40.124"; destination host is: ""server121":9000; 
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:724)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1094)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:193)
>   at $Proxy10.getFileInfo(Unknown Source)
>   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:100)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:65)
>   at $Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1172)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:725)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:449)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:473)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548)
>   at 
> org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:595)
>   at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:506)
>   at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:89)
>   at 
> org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1905)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1254)
>   ... 6 more
> Caused by: java.io.IOException: Error reading responses
>   at org.apache.hadoop.ipc.Client$Connect

[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261917#comment-13261917
 ] 

Lars Hofhansl commented on HBASE-5864:
--

TestRegionRebalancing is unrelated (HBASE-5848)

> Error while reading from hfile in 0.94
> --
>
> Key: HBASE-5864
> URL: https://issues.apache.org/jira/browse/HBASE-5864
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
> HBASE-5864_3.patch, HBASE-5864_test.patch
>
>
> Got the following stacktrace during region split.
> {noformat}
> 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
> Failed getting store size for value
> java.io.IOException: Requested block is out of range: 2906737606134037404, 
> lastDataBlockOffset: 84764558
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
>   at 
> org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
>   at 
> org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
> {noformat}





[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-25 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261928#comment-13261928
 ] 

Elliott Clark commented on HBASE-5862:
--

Testing on both 1.0 and 0.23 works (after applying the patch from HBASE-5870).  
MetricsRecord has always been an interface as far as I can tell.  What did you 
think needed a shim?

> After Region Close remove the Operation Metrics.
> 
>
> Key: HBASE-5862
> URL: https://issues.apache.org/jira/browse/HBASE-5862
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
> HBASE-5862-2.patch, HBASE-5862-3.patch
>
>
> If a region is closed then Hadoop metrics shouldn't still be reporting about 
> that region.





[jira] [Updated] (HBASE-5867) Improve Compaction Throttle Default

2012-04-25 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5867:
---

Attachment: D2943.1.patch

nspiegelberg requested code review of "[jira] [HBASE-5867] [89-fb] Improve 
Compaction Throttle Default".
Reviewers: JIRA, Kannan, Liyin

  We recently had a production issue where our compactions fell
  behind because our compaction throttle was improperly tuned and
  accidentally upgraded all compactions to the large pool. The default
  from HBASE-3877 makes 1 bad assumption: the default number of flushed
  files in a compaction. MinFilesToCompact should be taken into
  consideration. As a default, it is less damaging for the large thread to
  be slightly higher than it needs to be and only get timed-majors versus
  having everything accidentally promoted.

TEST PLAN
   - mvn test

REVISION DETAIL
  https://reviews.facebook.net/D2943

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java



> Improve Compaction Throttle Default
> ---
>
> Key: HBASE-5867
> URL: https://issues.apache.org/jira/browse/HBASE-5867
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>Priority: Minor
> Attachments: D2943.1.patch
>
>
> We recently had a production issue where our compactions fell behind because 
> our compaction throttle was improperly tuned and accidentally upgraded all 
> compactions to the large pool.  The default from HBASE-3877 makes 1 bad 
> assumption: the default number of flushed files in a compaction.  Currently 
> the algorithm is:
> throttleSize ~= flushSize * 2
> This assumes that the basic compaction utilizes 3 files and that all 3 files 
> are compressed.  In this case, "hbase.hstore.compaction.min" == 6 && the 
> values were not very compressible.  Both conditions should be taken into 
> consideration.  As a default, it is less damaging for the large thread to be 
> slightly higher than it needs to be versus having everything accidentally 
> promoted.
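
To make the arithmetic above concrete, here is a small sketch. The class and method names are hypothetical, and `scaledDefault` is only one possible way to take the minimum file count into account; the description does not state the exact formula of the patch. It is also an assumption that a compaction is promoted when its total size is strictly greater than the throttle.

```java
class CompactionThrottleSketch {
    /** Default described in the issue: throttleSize ~= flushSize * 2. */
    static long currentDefault(long flushSizeBytes) {
        return 2L * flushSizeBytes;
    }

    /** One possible alternative (an assumption): scale by the minimum number of files to compact. */
    static long scaledDefault(long flushSizeBytes, int minFilesToCompact) {
        return 2L * minFilesToCompact * flushSizeBytes;
    }

    /** Assumed rule: a compaction larger than the throttle goes to the large pool. */
    static boolean goesToLargePool(long compactionSizeBytes, long throttleBytes) {
        return compactionSizeBytes > throttleBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long flushSize = 128 * mb;           // illustrative flush size
        long minimalCompaction = 6 * flushSize; // 6 barely compressible flushed files
        System.out.println(goesToLargePool(minimalCompaction, currentDefault(flushSize)));
        System.out.println(goesToLargePool(minimalCompaction, scaledDefault(flushSize, 6)));
    }
}
```

With "hbase.hstore.compaction.min" == 6 and poorly compressible values, even the smallest possible compaction exceeds `flushSize * 2`, which matches the report that every compaction was promoted.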





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261934#comment-13261934
 ] 

Zhihong Yu commented on HBASE-5870:
---

The two failed tests passed locally.

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870-v2.txt, 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5870:
--

Attachment: 5870-v2.txt

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870-v2.txt, 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5870:
--

Attachment: (was: 5870-v2.txt)

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870-v2.txt, 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261935#comment-13261935
 ] 

Zhihong Yu commented on HBASE-5862:
---

I didn't check MetricsRecord in hadoop 1.0
Thanks for the confirmation.

> After Region Close remove the Operation Metrics.
> 
>
> Key: HBASE-5862
> URL: https://issues.apache.org/jira/browse/HBASE-5862
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
> HBASE-5862-2.patch, HBASE-5862-3.patch
>
>
> If a region is closed then Hadoop metrics shouldn't still be reporting about 
> that region.





[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found

2012-04-25 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261937#comment-13261937
 ] 

Jonathan Hsieh commented on HBASE-5870:
---

+1.  (either as is or after addressing the following nit).

nit: Would throwing IllegalXxxException make sense instead of returning null?  
Since this is testing only I don't feel strongly about this, but if it does 
end up going into the production code it would matter more.
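
The trade-off in the nit can be illustrated with a generic sketch. The patch itself is not shown in this message, so the reflective lookup below is an assumption made purely for illustration, and `ReflectiveLookupSketch` is a hypothetical name.

```java
import java.lang.reflect.Method;

class ReflectiveLookupSketch {
    /** Variant that returns null when the method is missing; callers must remember to check. */
    static Method findOrNull(Class<?> type, String methodName) {
        try {
            return type.getMethod(methodName);
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    /** Variant that fails loudly at the point where the problem is detected. */
    static Method findOrThrow(Class<?> type, String methodName) {
        try {
            return type.getMethod(methodName);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(
                type.getName() + " has no public no-arg method " + methodName, e);
        }
    }
}
```

Returning null defers the failure to whichever caller dereferences the result first, while the throwing variant reports the missing method by name. In test-only code either may be acceptable, as noted above.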

Filing follow-on issue sounds good to me.

> Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method 
> is not found
> -
>
> Key: HBASE-5870
> URL: https://issues.apache.org/jira/browse/HBASE-5870
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jonathan Hsieh
>Assignee: Zhihong Yu
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5870-v2.txt, 5870.txt
>
>
> After HBASE-5861 on 0.94 we are left with this issue on trunk.
> {code}
> $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23
> ...
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
> (default-testCompile) on project hbase: Compilation failure
> [ERROR] 
> /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
>  cannot find symbol
> [ERROR] symbol  : method getJobTracker()
> [ERROR] location: class 
> org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
> [ERROR] -> [Help 1]
> {code}





[jira] [Commented] (HBASE-5104) Provide a reliable intra-row pagination mechanism

2012-04-25 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261939#comment-13261939
 ] 

Phabricator commented on HBASE-5104:


jxcn01 has commented on the revision "[jira] [HBASE-5104] Provide a reliable 
intra-row pagination mechanism".

  Looks good, just some minor things.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java:386 Can 
we set it only if scan.getMaxResultsPerColumnFamily() >= 0?
  src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java:387 Can 
we set it only if the offset is > 0?
  src/main/java/org/apache/hadoop/hbase/client/Scan.java:638 Can we check: 
this.storeOffset > 0 || this.storeLimit > -1?
  I assume the offset should be positive, and store limit is non-negative.

  The other choice is to add some checking in the corresponding set methods.
  src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java:931 ditto
  src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java:932 ditto
  src/main/protobuf/Client.proto:49 uint32 should be better, with no default. 
If it is not set, then it is -1.
  src/main/protobuf/Client.proto:50 uint32 is preferred, with no default.  If 
it is not set, then it is 0.
  src/main/protobuf/Client.proto:199 ditto
  src/main/protobuf/Client.proto:200 ditto
  src/main/java/org/apache/hadoop/hbase/client/Get.java:471 ditto
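
A rough sketch of the "checking in the corresponding set methods" option mentioned above. `PaginationParamsSketch` and the offset setter name are hypothetical; only `storeOffset`, `storeLimit` and `getMaxResultsPerColumnFamily()` come from the review comments, and the defaults (-1 and 0 meaning "not set") follow the comments on Client.proto.

```java
class PaginationParamsSketch {
    private int storeLimit = -1; // -1 means "not set"
    private int storeOffset = 0; // 0 means "not set"

    /** Rejects negative limits so that -1 can only ever mean "not set". */
    void setMaxResultsPerColumnFamily(int limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be non-negative: " + limit);
        }
        this.storeLimit = limit;
    }

    /** Hypothetical setter name; rejects negative offsets. */
    void setRowOffsetPerColumnFamily(int offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be non-negative: " + offset);
        }
        this.storeOffset = offset;
    }

    int getMaxResultsPerColumnFamily() {
        return storeLimit;
    }

    /** The check suggested in the review: serialize the fields only when one of them is set. */
    boolean hasPagination() {
        return this.storeOffset > 0 || this.storeLimit > -1;
    }
}
```

With validation in the setters, the converter code only needs the single `hasPagination()` style check before writing the optional fields.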

REVISION DETAIL
  https://reviews.facebook.net/D2799


> Provide a reliable intra-row pagination mechanism
> -
>
> Key: HBASE-5104
> URL: https://issues.apache.org/jira/browse/HBASE-5104
> Project: HBase
>  Issue Type: Bug
>Reporter: Kannan Muthukkaruppan
>Assignee: Madhuwanti Vaidya
> Attachments: D2799.1.patch, D2799.2.patch, D2799.3.patch, 
> D2799.4.patch, 
> jira-HBASE-5104-Provide-a-reliable-intra-row-paginat-2012-04-16_12_39_42.patch,
>  testFilterList.rb
>
>
> Addendum:
> Doing pagination (retrieving at most "limit" number of KVs at a particular 
> "offset") is currently supported via the ColumnPaginationFilter. However, it 
> is not a very clean way of supporting pagination.  Some of the problems with 
> it are:
> * Normally, one would expect a query with (Filter(A) AND Filter(B)) to have 
> same results as (query with Filter(A)) INTERSECT (query with Filter(B)). This 
> is not the case for ColumnPaginationFilter as its internal state gets updated 
> depending on whether or not Filter(A) returns TRUE/FALSE for a particular 
> cell.
> * When this Filter is used in combination with other filters (e.g., doing AND 
> with another filter using FilterList), the behavior of the query depends on 
> the order of filters in the FilterList. This is not ideal.
> * ColumnPaginationFilter is a stateful filter which ends up counting multiple 
> versions of the cell as separate values even if another filter upstream or 
> the ScanQueryMatcher is going to reject the value for other reasons.
> Seems like we need a reliable way to do pagination. The particular use case 
> that prompted this JIRA is pagination within the same rowKey. For example, 
> for a given row key R, get columns with prefix P, starting at offset X (among 
> columns which have prefix P) and limit Y. Some possible fixes might be:
> 1) enhance ColumnPrefixFilter to support another constructor which supports 
> limit/offset.
> 2) Support pagination (limit/offset) at the Scan/Get API level (rather than 
> as a filter) [Like SQL].
> Original Post:
> Thanks Jiakai Liu for reporting this issue and doing the initial 
> investigation. Email from Jiakai below:
> Assuming that we have an index column family with the following entries:
> "tag0:001:thread1"
> ...
> "tag1:001:thread1"
> "tag1:002:thread2"
> ...
> "tag1:010:thread10"
> ...
> "tag2:001:thread1"
> "tag2:005:thread5"
> ...
> To get threads with "tag1" in range [5, 10), I tried the following code:
> ColumnPrefixFilter filter1 = new 
> ColumnPrefixFilter(Bytes.toBytes("tag1"));
> ColumnPaginationFilter filter2 = new ColumnPaginationFilter(5 /* limit 
> */, 5 /* offset */);
> FilterList filters = new FilterList(Operator.MUST_PASS_ALL);
> filters.addFilter(filter1);
> filters.addFilter(filter2);
> Get get = new Get(USER);
> get.addFamily(COLUMN_FAMILY);
> get.setMaxVersions(1);
> get.setFilter(filters);
> Somehow it didn't work as expected. It returned the entries as if the filter1 
> were not set.
> Turns out the ColumnPrefixFilter returns SEEK_NEXT_USING_HINT in some cases. 
> The FilterList filter does not handle this return code properly (it treats it as 
> INCLUDE).
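
The order dependence described in the addendum can be shown with a plain-Java simulation. This is a sketch only: it does not use or model the HBase filter API, and `IntraRowPaginationSketch` and its methods are hypothetical. It compares counting the offset over columns that match the prefix (the intended semantics) with counting it over all columns before the prefix is applied.

```java
import java.util.ArrayList;
import java.util.List;

class IntraRowPaginationSketch {
    /** Prefix first, then offset/limit counted only over the matching columns. */
    static List<String> prefixThenPage(List<String> columns, String prefix, int limit, int offset) {
        List<String> out = new ArrayList<>();
        int seen = 0;
        for (String c : columns) {
            if (!c.startsWith(prefix)) {
                continue;
            }
            if (seen++ < offset) {
                continue; // still skipping the first `offset` matching columns
            }
            if (out.size() == limit) {
                break;
            }
            out.add(c);
        }
        return out;
    }

    /** Offset/limit counted over all columns, prefix applied afterwards. */
    static List<String> pageThenPrefix(List<String> columns, String prefix, int limit, int offset) {
        List<String> out = new ArrayList<>();
        int end = Math.min(columns.size(), offset + limit);
        for (int i = offset; i < end; i++) {
            if (columns.get(i).startsWith(prefix)) {
                out.add(columns.get(i));
            }
        }
        return out;
    }
}
```

For a row holding one "tag0" column followed by "tag1:001" through "tag1:010", the two methods return different pages for limit 5 and offset 5, which is why supporting limit/offset at the Scan/Get level, with a single well-defined counting point, is attractive.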


[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.

2012-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261940#comment-13261940
 ] 

Lars Hofhansl commented on HBASE-5873:
--

This change does violate encapsulation a bit.
I double checked where in the code we create instances of AssignmentManager. 
Besides the HMaster it is only from tests (and they all pass, so it's good).


> TimeOut Monitor thread should be started after atleast one region server 
> registers.
> ---
>
> Key: HBASE-5873
> URL: https://issues.apache.org/jira/browse/HBASE-5873
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
>Priority: Minor
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5873-trunk.txt, HBASE-5873.patch
>
>
> Currently timeout monitor thread is started even before the region server has 
> registered with the master.
> In timeout monitor we depend on the region server to be online 
> {code}
> boolean allRSsOffline = this.serverManager.getOnlineServersList().
> isEmpty();
> {code}
> Now when the master starts up it sees there are no online servers and hence 
> sets 
> allRSsOffline to true.
> {code}
> setAllRegionServersOffline(allRSsOffline);
> {code}
> So this.allRegionServersOffline is also true.
> By this time an RS has come up.
> When the timeout monitor runs again (after 10 secs), in the next cycle it sees 
> allRSsOffline as false.
> Hence 
> {code}
> else if (this.allRegionServersOffline && !allRSsOffline) {
> // if some RSs just came back online, we can start the
> // the assignment right away
> actOnTimeOut(regionState);
> {code}
> This condition makes it take action based on the timeout.
> Because of this, even if a Region assignment of ROOT is in progress, this piece 
> of code triggers another assignment and thus we get a RegionAlreadyInTransitionException. 
> Later we need to wait for 30 mins for assigning ROOT itself.
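
The sequence described above can be reproduced with a simplified simulation. `TimeoutMonitorSketch` is a hypothetical class that collapses the monitor's per-region loop into a single call; it is not the AssignmentManager code, only a model of the flag transition quoted in the description.

```java
class TimeoutMonitorSketch {
    // Mirrors this.allRegionServersOffline from the description; starts as false.
    private boolean allRegionServersOffline = false;

    /** One monitor cycle; returns true if it would act on a region in transition. */
    boolean cycle(int onlineServers, boolean regionTimedOut) {
        boolean allRSsOffline = onlineServers == 0;
        // Acts on a real timeout, or when servers "just came back online".
        boolean act = regionTimedOut || (this.allRegionServersOffline && !allRSsOffline);
        this.allRegionServersOffline = allRSsOffline;
        return act;
    }

    public static void main(String[] args) {
        // Monitor started before any region server registered.
        TimeoutMonitorSketch early = new TimeoutMonitorSketch();
        System.out.println(early.cycle(0, false)); // no servers yet, flag becomes true
        System.out.println(early.cycle(1, false)); // acts although nothing has timed out

        // Monitor started after the first region server registered.
        TimeoutMonitorSketch late = new TimeoutMonitorSketch();
        System.out.println(late.cycle(1, false));
    }
}
```

In this model, starting the monitor only after at least one region server has registered means the flag is never set to true during startup, so the in-progress ROOT assignment is not retriggered.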





[jira] [Commented] (HBASE-5104) Provide a reliable intra-row pagination mechanism

2012-04-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261941#comment-13261941
 ] 

Jimmy Xiang commented on HBASE-5104:


I commented on phabricator.  Looks good to me, just some minor things.

> Provide a reliable intra-row pagination mechanism
> -
>
> Key: HBASE-5104
> URL: https://issues.apache.org/jira/browse/HBASE-5104
> Project: HBase
>  Issue Type: Bug
>Reporter: Kannan Muthukkaruppan
>Assignee: Madhuwanti Vaidya
> Attachments: D2799.1.patch, D2799.2.patch, D2799.3.patch, 
> D2799.4.patch, 
> jira-HBASE-5104-Provide-a-reliable-intra-row-paginat-2012-04-16_12_39_42.patch,
>  testFilterList.rb
>
>
> Addendum:
> Doing pagination (retrieving at most "limit" number of KVs at a particular 
> "offset") is currently supported via the ColumnPaginationFilter. However, it 
> is not a very clean way of supporting pagination.  Some of the problems with 
> it are:
> * Normally, one would expect a query with (Filter(A) AND Filter(B)) to have 
> same results as (query with Filter(A)) INTERSECT (query with Filter(B)). This 
> is not the case for ColumnPaginationFilter as its internal state gets updated 
> depending on whether or not Filter(A) returns TRUE/FALSE for a particular 
> cell.
> * When this Filter is used in combination with other filters (e.g., doing AND 
> with another filter using FilterList), the behavior of the query depends on 
> the order of filters in the FilterList. This is not ideal.
> * ColumnPaginationFilter is a stateful filter which ends up counting multiple 
> versions of the cell as separate values even if another filter upstream or 
> the ScanQueryMatcher is going to reject the value for other reasons.
> Seems like we need a reliable way to do pagination. The particular use case 
> that prompted this JIRA is pagination within the same rowKey. For example, 
> for a given row key R, get columns with prefix P, starting at offset X (among 
> columns which have prefix P) and limit Y. Some possible fixes might be:
> 1) enhance ColumnPrefixFilter to support another constructor which supports 
> limit/offset.
> 2) Support pagination (limit/offset) at the Scan/Get API level (rather than 
> as a filter) [Like SQL].
> Original Post:
> Thanks Jiakai Liu for reporting this issue and doing the initial 
> investigation. Email from Jiakai below:
> Assuming that we have an index column family with the following entries:
> "tag0:001:thread1"
> ...
> "tag1:001:thread1"
> "tag1:002:thread2"
> ...
> "tag1:010:thread10"
> ...
> "tag2:001:thread1"
> "tag2:005:thread5"
> ...
> To get threads with "tag1" in range [5, 10), I tried the following code:
> ColumnPrefixFilter filter1 = new 
> ColumnPrefixFilter(Bytes.toBytes("tag1"));
> ColumnPaginationFilter filter2 = new ColumnPaginationFilter(5 /* limit 
> */, 5 /* offset */);
> FilterList filters = new FilterList(Operator.MUST_PASS_ALL);
> filters.addFilter(filter1);
> filters.addFilter(filter2);
> Get get = new Get(USER);
> get.addFamily(COLUMN_FAMILY);
> get.setMaxVersions(1);
> get.setFilter(filters);
> Somehow it didn't work as expected. It returned the entries as if the filter1 
> were not set.
> Turns out the ColumnPrefixFilter returns SEEK_NEXT_USING_HINT in some cases. 
> The FilterList filter does not handle this return code properly (it treats it as 
> INCLUDE).





[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-25 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261945#comment-13261945
 ] 

Elliott Clark commented on HBASE-5862:
--

Thanks for the check.

> After Region Close remove the Operation Metrics.
> 
>
> Key: HBASE-5862
> URL: https://issues.apache.org/jira/browse/HBASE-5862
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
> HBASE-5862-2.patch, HBASE-5862-3.patch
>
>
> If a region is closed then Hadoop metrics shouldn't still be reporting about 
> that region.





[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-25 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261948#comment-13261948
 ] 

Zhihong Yu commented on HBASE-5862:
---

+1 on patch v3.

> After Region Close remove the Operation Metrics.
> 
>
> Key: HBASE-5862
> URL: https://issues.apache.org/jira/browse/HBASE-5862
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
> HBASE-5862-2.patch, HBASE-5862-3.patch
>
>
> If a region is closed then Hadoop metrics shouldn't still be reporting about 
> that region.





[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-25 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5862:
--

Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

> After Region Close remove the Operation Metrics.
> 
>
> Key: HBASE-5862
> URL: https://issues.apache.org/jira/browse/HBASE-5862
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
> HBASE-5862-2.patch, HBASE-5862-3.patch
>
>
> If a region is closed then Hadoop metrics shouldn't still be reporting about 
> that region.





[jira] [Created] (HBASE-5876) TestImportExport has been failing against hadoop 0.23 profile

2012-04-25 Thread Zhihong Yu (JIRA)
Zhihong Yu created HBASE-5876:
-

 Summary: TestImportExport has been failing against hadoop 0.23 
profile
 Key: HBASE-5876
 URL: https://issues.apache.org/jira/browse/HBASE-5876
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Yu


TestImportExport has been failing against hadoop 0.23 profile





[jira] [Commented] (HBASE-4821) A fully automated comprehensive distributed integration test for HBase

2012-04-25 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261960#comment-13261960
 ] 

Jonathan Hsieh commented on HBASE-4821:
---

TestLoadAndVerify is a Bigtop test currently, but others that might fit into 
Roman's category #2 include any of the HBase MR tests or tool-sy tests like 
TestImportTsv, TestImportExport, (possibly the thrift/rest/avro servers) and 
some of the other long running external-api only tests like TestAcidGuarantee. 

Also another purpose of the shim layer is to provide an abstraction layer so 
the same code is used against a minicluster when run in a HBase context or 
against a real cluster in the Bigtop context.  It would be a thinner interface 
than Mini*Cluster that does not expose internals.  I haven't thought this out 
completely yet but it could potentially be useful for dealing with Hadoop1 vs 
Hadoop2 issues as well.
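
A minimal sketch of what such a shim might look like, assuming a deliberately 
tiny put/get surface. ClusterShim, InProcessShim and loadAndVerify are 
hypothetical names made up for illustration; they are not HBase or Bigtop 
APIs, and the in-process implementation is only a stand-in for a 
minicluster-backed one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names exist in HBase or Bigtop.
final class ClusterShimSketch {

    /** Thin interface: only external-API style operations, no cluster internals. */
    interface ClusterShim extends AutoCloseable {
        void put(String table, String row, String value);

        /** Returns the stored value, or null if the table or row is unknown. */
        String get(String table, String row);

        @Override
        void close();
    }

    /**
     * Stand-in for a minicluster-backed shim. It keeps rows in memory so the
     * sketch runs without a cluster; a real implementation would delegate to
     * a mini cluster or to a client connected to a deployed cluster.
     */
    static final class InProcessShim implements ClusterShim {
        private final Map<String, Map<String, String>> tables = new HashMap<>();

        @Override
        public void put(String table, String row, String value) {
            tables.computeIfAbsent(table, t -> new HashMap<>()).put(row, value);
        }

        @Override
        public String get(String table, String row) {
            Map<String, String> rows = tables.get(table);
            return rows == null ? null : rows.get(row);
        }

        @Override
        public void close() {
            tables.clear();
        }
    }

    /** A test body written once against the shim, whatever backs it. */
    static boolean loadAndVerify(ClusterShim cluster, int rows) {
        for (int i = 0; i < rows; i++) {
            cluster.put("t", "row" + i, "v" + i);
        }
        for (int i = 0; i < rows; i++) {
            if (!("v" + i).equals(cluster.get("t", "row" + i))) {
                return false;
            }
        }
        return true;
    }
}
```

The point of the design is that loadAndVerify only sees ClusterShim, so the 
same test body could be handed a different implementation in each context.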

> A fully automated comprehensive distributed integration test for HBase
> --
>
> Key: HBASE-4821
> URL: https://issues.apache.org/jira/browse/HBASE-4821
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Critical
>
> To properly verify that a particular version of HBase is good for production 
> deployment we need a better way to do real cluster testing after incremental 
> changes. Running unit tests is good, but we also need to deploy HBase to a 
> cluster, run integration tests, load tests, Thrift server tests, kill some 
> region servers, kill the master, and produce a report. All of this needs to 
> happen in 20-30 minutes with minimal manual intervention. I think this way we 
> can combine agile development with high stability of the codebase. I am 
> envisioning a high-level framework written in a scripting language (e.g. 
> Python) that would abstract external operations such as "deploy to test 
> cluster", "kill a particular server", "run load test A", "run load test B" 
> (we already have a few kinds of load tests implemented in Java, and we could 
> write a Thrift load test in Python). This tool should also produce 
> intermediate output, allowing us to catch problems early and restart the test.
> No implementation has yet been done. Any ideas or suggestions are welcome.





[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-25 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-5862:
-

Status: Patch Available  (was: Open)

> After Region Close remove the Operation Metrics.
> 
>
> Key: HBASE-5862
> URL: https://issues.apache.org/jira/browse/HBASE-5862
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
> HBASE-5862-2.patch, HBASE-5862-3.patch
>
>
> If a region is closed then Hadoop metrics shouldn't still be reporting about 
> that region.





[jira] [Created] (HBASE-5877) When a query fails because the region has moved, let the regionserver returns the new address to the client

2012-04-25 Thread nkeywal (JIRA)
nkeywal created HBASE-5877:
--

 Summary: When a query fails because the region has moved, let the 
regionserver returns the new address to the client
 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


This is mainly useful when we do a rolling restart. This will decrease the load 
on the master and the network load.

Note that a region is not immediately opened after a close. So:
- it seems preferable to wait before retrying on the other server. An 
optimisation would be to have a heuristic depending on when the region was 
closed.
- during a rolling restart, the server moves the regions then stops. So we may 
have failures when the server is stopped, and this patch won't help.


The implementation in the first patch does:
- on the region move, there is an added parameter on the regionserver#close to 
say where we are sending the region
- the regionserver keeps a list of what was moved. Each entry is kept 100 
seconds.
- the regionserver sends a specific exception when it receives a query on a 
moved region. This exception contains the new address.
- the client analyses the exceptions and updates its cache accordingly.
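
The steps above can be sketched roughly as follows. This is an illustration 
under stated assumptions, not the code in 5877.v1.patch: 
MovedRegionTracker, RegionMovedException and ClientLocationCache are 
hypothetical names, addresses are plain strings, and the caller passes the 
clock value in so that the 100 second expiry is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: these names are not taken from HBase or from the patch.
final class MovedRegionsSketch {

    /** Entry lifetime from the description: each entry is kept 100 seconds. */
    static final long TTL_MILLIS = 100_000L;

    /** Exception sent to the client; it carries the new address of the region. */
    static final class RegionMovedException extends Exception {
        final String newAddress;

        RegionMovedException(String regionName, String newAddress) {
            super("Region " + regionName + " moved to " + newAddress);
            this.newAddress = newAddress;
        }
    }

    /** Region server side: remembers where recently closed regions were sent. */
    static final class MovedRegionTracker {
        private static final class Entry {
            final String destination;
            final long movedAtMillis;

            Entry(String destination, long movedAtMillis) {
                this.destination = destination;
                this.movedAtMillis = movedAtMillis;
            }
        }

        private final Map<String, Entry> moved = new ConcurrentHashMap<>();

        /** Called on close, with the destination given by the added parameter. */
        void recordMove(String regionName, String destination, long nowMillis) {
            moved.put(regionName, new Entry(destination, nowMillis));
        }

        /** Throws if the region moved less than TTL_MILLIS ago; drops stale entries. */
        void checkNotMoved(String regionName, long nowMillis) throws RegionMovedException {
            Entry e = moved.get(regionName);
            if (e == null) {
                return;
            }
            if (nowMillis - e.movedAtMillis >= TTL_MILLIS) {
                moved.remove(regionName, e);
                return;
            }
            throw new RegionMovedException(regionName, e.destination);
        }
    }

    /** Client side: region location cache updated from the exception. */
    static final class ClientLocationCache {
        private final Map<String, String> locations = new ConcurrentHashMap<>();

        void put(String regionName, String address) {
            locations.put(regionName, address);
        }

        /** Returns the cached address, or null if the region must be looked up again. */
        String locate(String regionName) {
            return locations.get(regionName);
        }

        /** A moved-region failure rewrites the entry; any other failure evicts it. */
        void onFailure(String regionName, Exception failure) {
            if (failure instanceof RegionMovedException) {
                locations.put(regionName, ((RegionMovedException) failure).newAddress);
            } else {
                locations.remove(regionName);
            }
        }
    }
}
```

In this sketch a client that hits a moved region learns the new address from 
the exception itself, without a fresh lookup. Waiting before retrying on the 
new server, as suggested above, is not modelled.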










[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver returns the new address to the client

2012-04-25 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5877:
---

Attachment: 5877.v1.patch

> When a query fails because the region has moved, let the regionserver returns 
> the new address to the client
> ---
>
> Key: HBASE-5877
> URL: https://issues.apache.org/jira/browse/HBASE-5877
> Project: HBase
>  Issue Type: Improvement
>  Components: client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Attachments: 5877.v1.patch
>
>
> This is mainly useful when we do a rolling restart. This will decrease the 
> load on the master and the network load.
> Note that a region is not immediately opened after a close. So:
> - it seems preferable to wait before retrying on the other server. An 
> optimisation would be to have a heuristic depending on when the region was 
> closed.
> - during a rolling restart, the server moves the regions then stops. So we 
> may have failures when the server is stopped, and this patch won't help.
> The implementation in the first patch does:
> - on the region move, there is an added parameter on the regionserver#close 
> to say where we are sending the region
> - the regionserver keeps a list of what was moved. Each entry is kept 100 
> seconds.
> - the regionserver sends a specific exception when it receives a query on a 
> moved region. This exception contains the new address.
> - the client analyses the exceptions and updates its cache accordingly.




